January 7, 2025

Chinese Gold Farmers: Work or Fun?

Julian Dibbell had an interesting article in yesterday’s NYT, profiling several Chinese gold farmers, who make their living playing the massively multiplayer game World of Warcraft (WoW) and accumulating virtual loot that is ultimately sold for real money. If you’re not familiar with gold farming, or virtual-world economies in general, it’s a nice introduction.

Even if you’ve heard it all before, the article is still worthwhile as a meditation on the porous boundary between work and fun online. These guys make their living playing a game, in seven twelve-hour shifts a week. It’s highly repetitive work – they follow the loot-maximizing strategy, which involves hanging around the same little area and whacking the same monsters over and over. WoW players even call this kind of play “the grind”.

Yet somehow the guys enjoy it, not all the time but often enough to find the work rewarding in an odd way. One guy, Wang Huachen, has a law degree but chooses to play/work WoW instead, at least for a while.

“I will miss this job,” [Wang] said. “It can be boring, but I still have sometimes a playful attitude. So I think I will miss this feeling.”

I turned to Wang Huachen, who remained intent on manipulating an arsenal of combat spells, and asked again how it was possible that in these circumstances anybody could, as he put it, “have sometimes a playful attitude”?

He didn’t even look up from his screen. “I cannot explain,” he said. “It just feels that way.”

Amazingly, after finishing a twelve-hour shift, some of these guys spend their long-awaited free time … playing WoW.

But all that changed when the boss of one gold farm got a new business idea: rather than grinding out more loot, his employees would instead build up a 40-man team of uber-characters who would serve as mercenaries, for hire by players who wanted reliable, non-greedy companions in attacking the toughest areas of WoW. Suddenly these gold farmers could really use their skills, and have more fun – for a while.

The end arrived without warning. One day word came down from the bosses that the 40-man raids were suspended indefinitely for lack of customers. In the meantime, team members would go back to gold farming, gathering loot in five-man dungeons that once might have thrilled Min but now presented no challenge whatsoever. “We no longer went to fight the big boss monsters,” Min said. “We were ordered to stay in one place doing the same thing again and again. Everyday I was looking at the same thing. I could not stand it.”

What’s most interesting about this, to me at least, is the relationship between the gold farmers and the players they serve. It’s not a personal relationship, only an economic one, in which the gold farmers play the boring part of the game in exchange for a cash payment from a richer customer.

This relationship is an amazing tangle of play and work. The gold farmer works playing a game, so he can earn money which he spends playing the same game. The customer finds part of the game too much like work, so he works at another job to earn money to pay a gold farmer to play for him, so the customer can have more fun when he plays. Got it?

Dutch E-Voting System Has Problems Similar to Diebold's

A team of Dutch researchers, led by Rop Gonggrijp and Willem-Jan Hengeveld, managed to acquire and analyze a Nedap/Groenendaal e-voting machine used widely in the Netherlands and Germany. They report problems strikingly similar to the ones Ari Feldman, Alex Halderman and I found in the Diebold AccuVote-TS.

The N/G machines all seem to be opened by the same key, which is easily bought on the Internet – just like the Diebold machines.

The N/G machines can be put in maintenance mode by entering the secret password “GEHEIM” (which means “SECRET” in Dutch) – just as the Diebold machines used the secret PIN “1111” to enter supervisor mode.

The N/G machines are subject to software tampering, allowing a criminal to inject code that transfers votes undetectably from one candidate to another – just like the Diebold machines.

There are some differences. The N/G machines appear to be better designed (from a security standpoint), requiring an attacker to work harder to tamper with vote records. As an example, the electrical connection between the N/G machine and its external memory card is cleverly constructed to prevent the voting machine from undetectably modifying data on the card. This means that the strategy used by our Diebold virus, which allows votes to be recorded correctly but modifies them a few seconds later, would not work. Instead, a vote-stealing program has to block votes from being recorded, and then fill in the missing votes later, with “improvements”. This doesn’t raise the barrier to vote-stealing much, but at least the machine’s designers considered the problem and did something constructive.

The Dutch paper has an interesting section on electromagnetic emanations from voting machines. Like other computers, voting machines emit radio waves that can be picked up at a distance by somebody with the right equipment. It is well known that eavesdroppers can often exploit such emanations to “see” the display screen of a computer at a distance. Applied to voting machines, such an attack might allow a criminal to learn what the voter is seeing on the screen – and hence how he is voting – from across the room, or possibly from outside the polling place. The Dutch researchers’ work on this topic is preliminary, suggesting a likely security problem but not yet establishing it for certain. Other e-voting machines are likely to have similar problems.

What is most striking here is that different e-voting machines from different companies seem to have such similar problems. Some of the technical challenges in designing an e-voting machine are very difficult or even infeasible to address; and it’s not surprising to see those problems unsolved in every machine. But to see the same simple, easily avoided weaknesses, such as the use of identical widely-available keys and weak passwords/PINs, popping up again, has to teach a deeper lesson.

The New Yorker Covers Wikipedia

Writing in this week’s New Yorker, Stacy Schiff takes a look at the Wikipedia phenomenon. One sign that she did well: The inevitable response page at Wikipedia is almost entirely positive. Schiff’s writing is typical of what makes the New Yorker great. It has rich historical context, apt portrayals of the key characters involved in the story, and a liberal sprinkling of humor, bons mots and surprising factual nuggets. It is also, as all New Yorker pieces are, rigorously fact-checked and ably edited.

Normally, I wouldn’t use FTT as a forum to “talk shop” about a piece of journalism. But in this case, the medium really is the message – the New Yorker’s coverage of Wikipedia is itself a showcase for some of the things old-line publications still do best. As soon as I saw Schiff’s article in my New Yorker table of contents (yes, I still read it in hard copy, and yes, I splurge on getting it mailed abroad to Oxford) I knew it would be a great test case. On the one hand, Wikipedia is the preeminent example of community-driven, user-generated content. Any coverage of Wikipedia, particularly any critical coverage, is guaranteed to be the target of harsh, well-informed scrutiny by the proud community of Wikipedians. On the other, The New Yorker’s writing is, indisputably, among the best out there, and its fact checking department is widely thought to be the strongest in the business.

When reading Wikipedia, one has to react to surprising claims by entertaining the possibility that they might not be true. The less plausible a claim sounds, the more skepticism one must have when considering it. In some cases, a glance at the relevant Talk page helps, since this can at least indicate whether or not the claim has been vetted by other Wikipedians. But not every surprising claim has backstory available on the relevant talk page, and not every reader has the time or inclination to go to that level of trouble for every dubious claim she encounters in Wikipedia. The upshot is that implausible or surprising claims in Wikipedia often get taken with a grain or more of salt, and not believed – and on the other hand, plausible-sounding falsehoods are, as a result of their seeming plausibility, less likely to be detected.

By contrast, rigorous fact-checking (at least in the magazine context where I have done it and seen it) does not simply mean that someone is trying hard to get things right: It means that someone’s job depends on their being right, and it means that the particularly surprising claims in the fact-checked content can be counted on to be well documented by the intense, aspiring, nervous young person at the fact checker’s desk. At TIME, for example, every single word that goes into the magazine physically gets a check mark, on the fact-checkers’ copy, once its factual content has been verified, with the documentation of the fact’s truth filed away in an appropriate folder (the folders, in a holdover from an earlier era, are still called “carbons”). It is every bit as grueling as it sounds, and entirely worthwhile. The same system is in use across most of the Time, Inc. magazine publishing empire, which includes People, Fortune, and Sports Illustrated and represents a quarter of the U.S. consumer magazine market. It’s not perfect, of course – reports of what someone said in a one-on-one interview, for example, can only ever be as good as the reporter’s notes or tape recording. But it is very, very good. In my own case, knowing what goes into the fact-checking process at places like TIME and The New Yorker gives me a much higher level of confidence in their accuracy than I have when, as I often do, I learn something new from Wikipedia.

The guarantee of truth that backs up New Yorker copy gives its content a much deeper impact. Consider these four paragraphs from Schiff’s story:

The encyclopedic impulse dates back more than two thousand years and has rarely balked at national borders. Among the first general reference works was Emperor’s Mirror, commissioned in 220 A.D. by a Chinese emperor, for use by civil servants. The quest to catalogue all human knowledge accelerated in the eighteenth century. In the seventeen-seventies, the Germans, champions of thoroughness, began assembling a two-hundred-and-forty-two-volume masterwork. A few decades earlier, Johann Heinrich Zedler, a Leipzig bookseller, had alarmed local competitors when he solicited articles for his Universal-Lexicon. His rivals, fearing that the work would put them out of business by rendering all other books obsolete, tried unsuccessfully to sabotage the project.

It took a devious Frenchman, Pierre Bayle, to conceive of an encyclopedia composed solely of errors. After the idea failed to generate much enthusiasm among potential readers, he instead compiled a “Dictionnaire Historique et Critique,” which consisted almost entirely of footnotes, many highlighting flaws of earlier scholarship. Bayle taught readers to doubt, a lesson in subversion that Diderot and d’Alembert, the authors of the Encyclopédie (1751-80), learned well. Their thirty-five-volume work preached rationalism at the expense of church and state. The more stolid Britannica was born of cross-channel rivalry and an Anglo-Saxon passion for utility.

Wales’s first encyclopedia was the World Book, which his parents acquired after dinner one evening in 1969, from a door-to-door salesman. Wales—who resembles a young Billy Crystal with the neuroses neatly tucked in—recalls the enchantment of pasting in update stickers that cross-referenced older entries to the annual supplements. Wales’s mother and grandmother ran a private school in Huntsville, Alabama, which he attended from the age of three. He graduated from Auburn University with a degree in finance and began a Ph.D. in the subject, enrolling first at the University of Alabama and later at Indiana University. In 1994, he decided to take a job trading options in Chicago rather than write his dissertation. Four years later, he moved to San Diego, where he used his savings to found an Internet portal. Its audience was mostly men; pornography—videos and blogs—accounted for about a tenth of its revenues. Meanwhile, Wales was cogitating. In his view, misinformation, propaganda, and ignorance are responsible for many of the world’s ills. “I’m very much an Enlightenment kind of guy,” Wales told me. The promise of the Internet is free knowledge for everyone, he recalls thinking. How do we make that happen?

As an undergraduate, he had read Friedrich Hayek’s 1945 free-market manifesto, “The Use of Knowledge in Society,” which argues that a person’s knowledge is by definition partial, and that truth is established only when people pool their wisdom. Wales thought of the essay again in the nineteen-nineties, when he began reading about the open-source movement, a group of programmers who believed that software should be free and distributed in such a way that anyone could modify the code. He was particularly impressed by “The Cathedral and the Bazaar,” an essay, later expanded into a book, by Eric Raymond, one of the movement’s founders. “It opened my eyes to the possibility of mass collaboration,” Wales said.

After reading this copy, and knowing how The New Yorker works, one can be confident that a devious Frenchman named Pierre Bayle once really did propose an encyclopedia composed entirely of errors. The narrative is put together well. It will keep people reading and will not cause confusion. Interested readers can follow up on a nugget like Wales’ exposure to the Hayek essay by reading it themselves (it’s online here).

I am not a Wikipedia denialist. It is, and will continue to be, an important and valuable resource. But the expensive, arguably old-fashioned approach of The New Yorker and other magazines still delivers a level of quality I haven’t found, and do not expect to find, in the world of community-created content.

Acoustic Snooping on Typed Information

Li Zhuang, Feng Zhou, and Doug Tygar have an interesting new paper showing that if you have an audio recording of somebody typing on an ordinary computer keyboard for fifteen minutes or so, you can figure out everything they typed. The idea is that different keys tend to make slightly different sounds, and although you don’t know in advance which keys make which sounds, you can use machine learning to figure that out, assuming that the person is mostly typing English text. (Presumably it would work for other languages too.)

Asonov and Agrawal had a similar result previously, but they had to assume (unrealistically) that you started out with a recording of the person typing a known training text on the target keyboard. The new method eliminates that requirement, and so appears to be viable in practice.

The algorithm works in three basic stages. First, it isolates the sound of each individual keystroke. Second, it takes all of the recorded keystrokes and puts them into about fifty categories, where the keystrokes within each category sound very similar. Third, it uses fancy machine learning methods to recover the sequence of characters typed, under the assumption that the sequence has the statistical characteristics of English text.

The third stage is the hardest one. You start out with the keystrokes put into categories, so that the sequence of keystrokes has been reduced to a sequence of category-identifiers – something like this:

35, 12, 8, 14, 17, 35, 6, 44, …

(This means that the first keystroke is in category 35, the second is in category 12, and so on. Remember that keystrokes in the same category sound alike.) At this point you assume that each key on the keyboard usually (but not always) generates a particular category, but you don’t know which key generates which category. Sometimes two keys will tend to generate the same category, so that you can’t tell them apart except by context. And some keystrokes generate a category that doesn’t seem to match the character in the original text, because the key happened to sound different that time, or because the categorization algorithm isn’t perfect, or because the typist made a mistake and typed a garbbge charaacter.

The only advantage you have is that English text has persistent regularities. For example, the two-letter sequence “th” is much more common than “rq”, and the word “the” is much more common than “xprld”. This turns out to be enough for modern machine learning methods to do the job, despite the difficulties I described in the previous paragraph. The recovered text gets about 95% of the characters right, and about 90% of the words. It’s quite readable.
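To make the "persistent regularities" point concrete, here is a minimal sketch of a letter-pair (bigram) scorer. It is not the paper's method, just an illustration of the kind of statistics such methods can lean on. The tiny SAMPLE corpus, the add-one smoothing, and the names bigram_counts and log_score are all my own assumptions; a real attack would train on far more English text.

```python
import math
from collections import Counter

# Tiny stand-in corpus (an assumption for illustration only).
SAMPLE = (
    "the quick brown fox jumps over the lazy dog and then the other "
    "thing that they thought was there is with them in the north"
)

def bigram_counts(text):
    """Count adjacent letter pairs inside each word of the text."""
    counts = Counter()
    for word in text.split():
        for a, b in zip(word, word[1:]):
            counts[a + b] += 1
    return counts

COUNTS = bigram_counts(SAMPLE)
TOTAL = sum(COUNTS.values())
POSSIBLE_PAIRS = 26 * 26  # number of distinct lowercase letter pairs

def log_score(candidate):
    """Sum of add-one smoothed log-probabilities of the candidate's letter pairs."""
    score = 0.0
    for a, b in zip(candidate, candidate[1:]):
        score += math.log((COUNTS[a + b] + 1) / (TOTAL + POSSIBLE_PAIRS))
    return score

# Compare candidates of equal length, since longer strings always score lower.
print(log_score("the") > log_score("xpr"))
```

A decoder can use a score like this to prefer one guess at the key-to-category mapping over another: the guess that turns the category sequence into more English-looking text wins.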

[Exercise for geeky readers: Assume that there is a one-to-one mapping between characters and categories, and that each character in the (unknown) input text is translated infallibly into the corresponding category. Assume also that the input is typical English text. Given the output category-sequence, how would you recover the input text? About how long would the input have to be to make this feasible?]
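A hint rather than a solution: under the exercise's idealized assumptions, the category sequence is just a simple substitution cipher, so repeated-letter patterns survive the translation. The sketch below assumes (my assumption, not the paper's) that word boundaries are known, say because the space bar has its own category, and it uses a toy word list. The names pattern, candidates, and WORDS are hypothetical.

```python
def pattern(seq):
    """Relabel items by order of first appearance, e.g. 'hello' -> (0, 1, 2, 2, 3)."""
    first_seen = {}
    return tuple(first_seen.setdefault(item, len(first_seen)) for item in seq)

# Toy dictionary (an assumption; a real attempt would use a full word list).
WORDS = ["hello", "jelly", "world", "apple", "teeth", "happy"]

def candidates(categories, words=WORDS):
    """Return the dictionary words whose letter pattern matches the category pattern."""
    target = pattern(categories)
    return [w for w in words if pattern(w) == target]

# One word's worth of category identifiers; the third and fourth keystrokes match.
print(candidates([35, 12, 8, 8, 17]))  # ['hello', 'jelly', 'happy']
```

Each word narrows the possibilities only a little, but the candidate mappings for different words must agree with one another, which is why a longer input makes recovery feasible.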

If the user typed a password, that can be recovered too. Although passwords don’t have the same statistical properties as ordinary text (unless they’re chosen badly), this doesn’t pose a problem as long as the password-typing is accompanied by enough English-typing. The algorithm doesn’t always recover the exact password, but it can come up with a short list of possible passwords, and the real password is almost always on this list.

This is yet another reminder of how much computer security depends on controlling physical access to the computer. We’ve always known that anybody who can open up a computer and work on it with tools can control what it does. Results like this new one show that getting close to a machine with sensors (such as microphones, cameras, power monitors) may compromise the machine’s secrecy.

There are even some preliminary results showing that computers make slightly different noises depending on what computations they are doing, and that it might be possible to recover encryption keys if you have an audio recording of the computer doing decryption operations.

I think I’ll go shut my office door now.

Recommended Reading: The Success of Open Source

It’s easy to construct arguments that open source software can’t succeed. Why would people work for free to make something that they could get paid for? Who will do the dirty work? Who will do tech support? How can customers trust a “vendor” that is so diffuse and loosely organized?

And yet, open source has had some important successes. Apache dominates the market for web server software. Linux and its kin are serious players in the server operating system market. Linux is even a factor in the desktop OS market. How can this be reconciled with what we know about economics and sociology?

Many articles and books have been written about this puzzle. To my mind, Steven Weber’s book “The Success of Open Source” is the best. Weber explores the open source puzzle systematically, breaking it down into interesting subquestions and exploring answers. One of the book’s virtues is that it doesn’t claim to have complete answers; but it does present and dissect partial answers and hints. This is a book that could merit a full book club discussion, if people are interested.