April 16, 2014

Wikipedia vs. Britannica Smackdown

On Friday I wrote about my spot-check of the accuracy of Wikipedia, in which I checked Wikipedia’s entries for six topics I knew well. I was generally impressed, except for one entry that went badly wrong.

Adam Shostack pointed out, correctly, that I had left the job half done, and I needed to compare to the entries for the same six topics in a traditional encyclopedia. So here’s my Wikipedia vs. Britannica Online comparison, for the six topics I wrote about on Friday.

Princeton University: Both entries are accurate and reasonably well written. Wikipedia has more information. Verdict: small advantage to Wikipedia.

Princeton Township: Britannica has a single entry for Princeton Township and Princeton Boro, while Wikipedia has separate entries. Both entries are good, but Wikipedia has more information (including demographics). Also, Britannica makes an error in saying that Morven is the Governor’s Residence for the state of New Jersey; Wikipedia correctly labels Morven as the former Governor’s Residence and Drumthwacket as the current one. Verdict: advantage to Wikipedia.

Me: Wikipedia has a short but decent entry; Britannica, unsurprisingly, has nothing. Verdict: advantage Wikipedia.

Virtual memory: Wikipedia has a pretty good entry; Britannica has no entry for virtual memory, and doesn’t appear to discuss the concept elsewhere, either. Verdict: advantage Wikipedia.

Public-key cryptography: Good, accurate entries in both. Verdict: toss-up.

Microsoft antitrust case: Britannica has only two sentences, saying that Judge Jackson ruled against Microsoft and ordered a breakup, and that the Court of Appeals overturned the breakup but agreed that Microsoft had broken the law. That’s correct, but it leaves out the settlement. Wikipedia’s entry is much longer but error-prone. Verdict: big advantage to Britannica.

Overall verdict: Wikipedia’s advantage is in having more, longer, and more current entries. If it weren’t for the Microsoft-case entry, Wikipedia would have been the winner hands down. Britannica’s advantage is in having lower variance in the quality of its entries.

Comments

  1. Scott Preece says:

    While I can’t speak to the goals of the current owners/editors of the Britannica, I can say that in the past, the “little” articles like the ones you compared were not really the heart of Britannica’s content. The “crown jewels” were the big articles (Macropaedia rather than Micropaedia, in the print editions) that covered broad subject areas (like “mathematics”) rather than narrow coverage of individual facts.

    Before the major revision of the Britannica in the early 1970s, they didn’t do “dictionary”-style entries at all – there were major articles and there was an index. The revision added the small articles, in the hope of broadening the applicability of the set (World Book had grown its market by convincing parents that answering simple, narrow questions was the way to evaluate an encyclopedia).

    I haven’t spent a lot of time looking at the wikipedia, but I would expect that it would do better at the smaller, fact-centric articles than at broader, “put-it-all-together” articles.

  2. asdf says:

    If you’re going to compare any two entities, shouldn’t you also take their ages into account? If you did, I think it would be self-evident that Wikipedia is going to outstrip Britannica over the next several years. Why do I say this? Like I said, it’s self-evident: because of Wikipedia’s exponential growth rate, for one, and secondly because of the pace of that growth. Think of Britannica in its early days and you will fall severely short of imagining a rate of contribution comparable to the rate of input that Wikipedia has to live with today. Then think of Britannica today, and of its update cycle, and it’s all laid out to dry. Wikipedia rules. It’s a revolution and it’s going to outstrip all other encyclopaedic forms soon.

  3. Pete says:

    It would be nice to see an actual scientific survey of this issue. Wikipedia is obviously going to be a lot better at covering hot topics and computer-related issues, but what about other subjects? How does it stack up on detail in the humanities? A linguist friend of mine claims it is fairly poor in his area.

    asdf: Quantity does not automatically lead to quality…

  4. Ross Mayfield says:
  5. Eric Burns says:

    I think there’s a certain invalidity in the above. I’m not biased against Wikipedia, mind, but of the six entries being compared, the last four specifically play against the strengths of the Britannica and to the strengths of Wikipedia. Few individuals are going to show up in the Britannica unless they have had a major impact outside of the insular internet community, for example, and PGP Cryptography, Virtual Memory and the Microsoft Antitrust case are all of particular interest to the Internet community that compiles Wikipedia. It is not unlike comparing the Catholic Encyclopedia with Britannica, and choosing a statistical universe that is 80% theological as its basis.

    As a further test, I decided to compare the specific entries for Fort Kent, Maine, the small town I grew up in, which is of no particular interest to the Internet at large. Wikipedia’s entry is extremely dry and mostly consists of canned demographic information, noting in a cursory fashion that U.S. 1 has its northern terminus there. The Britannica entry isn’t long, but touches on Fort Kent’s settlement history, its agricultural nature and principal products, the University of Maine at Fort Kent, and U.S. 1’s southern terminus in Key West. Of the two, the significant advantage goes to Britannica.

    Wikipedia is growing and, I believe, will one day eclipse Britannica. However, in comparing the two today, one must be careful to not focus on the natural strengths of one over the other.

  6. Emergent Chaos says:

    Wikipedia vs Britannica tested

    In Wikipedia vs. Britannica Smackdown, Ed Felten takes my challenge. In the meanwhile, I’d done some hypothesizing, here. So how’d I do? Hypothesis 1 is spot on. #2 is more challenging to assess: The errors in Britannica are smaller, and…

  7. Pete says:

    Isn’t the primary use case for an encyclopedia to look up things you *don’t* know? I don’t know how my comfort level about Wikipedia should change knowing what I know now about these six entries (and whether an increase in comfort is warranted or false). Of course, it is common to use multiple sources to research topics, and given some discrepancy between these two (on a topic I know nothing about), I would have to pick Britannica every time (wouldn’t you?).

    Two other possible experiments:

    1) Compare Wikipedia with Google. Why try to come up with a community “authoritative source” without authority with Google around? (Even more enjoyable is Googlism – I took the liberty of googlism’ing (?) one of your topics: http://www.googlism.com/index.htm?ism=ed+felten&type=1 . What say you?)

    2) Insert known false entries into Wikipedia and see how long it takes for anyone to find them. (I think this was like a Caleb Carr novel of a few years ago). Hmmm, can I put in a Wiki entry on Gerald Nwafor from Ghana, “Who was the paramount ruler of our community and also the crown king of our region” according to the email I received from him requesting my assistance? ;-)

    I am curious about the parallels of this discussion – how our perception of Wikipedia’s intended/expected use vs. potential use/abuse compares with perceptions of intended/potential use in privacy issues (say, Clipper Chip issue or Carnivore), security (perception of security/risk), and e-voting, for example.

  8. James Walden says:

    Pete, experiment #2 has been attempted a couple of times recently. Alex Halavais’ blog recounts the results of his experiment.

  9. Joe says:

    I think a simpler way of saying, “having lower variance in the quality of” is “better”

    I am a Wikipedia supporter, so I’m not ragging on their quality. I’m ragging on your lengthyspeak.

  10. Ed Felten says:

    Joe,

    I thought of writing “better”, but decided that wasn’t quite the right word. I didn’t want to imply that Britannica was better on average, only that its level of quality was fairly predictable, while Wikipedia’s quality was sometimes better and sometimes worse, but varied more.

  11. Doug Renfrew says:

    You say you spotted errors in the entry for the Microsoft antitrust case and in the entry about you. You mention fixing the errors about yourself, but did you also fix the errors in the Microsoft entry? WP will not improve without the help of educated persons like yourself.

  12. Jeremy says:

    I’m unsure that as time goes on, more information will be changed and made accurate. I see the opposite, that as time goes on and more entries are made, there is less of a chance that people will edit them to correct them due to volume. Since no study has been done on the practicability of Wikipedia’s process, I’m not sure how well its editing scheme works and add that caveat to the above.

    It also creates a survival of the fittest scenario for knowledge. Only things that get read the most will have a large sample of editors, leaving less read information or usually unread information unsuitably edited.

    This all, of course, is solved by using multiple sources, but many people don’t always do that. I’m wondering if there’s a way to improve the editing process to cover those two faults.

  13. Seth Finkelstein says:

    Building on what Pete said, I suggest there is a sampling bias in your survey. Since WikiPedia is mostly (though not exclusively) a project of computer geeks, it is likely to be particularly strong in computer geek areas (OS theory, crypto). As you are a computer geek, you’ll get an impressive view.

    But, as you saw, moving away from that topic, to e.g. law, the accuracy decreases.

  14. PW Herman says:

    If the Wikipedia article was wrong, why didn’t you just correct it? Then it would’ve won!!!

  15. Adam Shostack says:

    Pete,

    I’m curious what you mean by scientific? I posted a set of hypotheses about what Ed would find, and I tried to make that as broad as I could. But I don’t think that the set could or did lead to a scientific survey.

    Seth,
    I think that’s a fascinating question, and I’d love to see an experiment designed and executed to test it. Perhaps we could get a list of all queries with exactly N answers today in Google, and see what Wikipedia and Britannica have to say?

  16. Zachary Floyd says:

    I have to agree with Doug Renfrew and PW Herman … and obliquely with Adam Shostack … although I think you did less than a third of the job. The last third involves correcting the entries. Did you?

    If you had corrected the ‘errors’ you found in the Wikipedia, rather than just write about them … then there’s no doubt the Wikipedia would have won (on your analysis, anyway).

    No-one’s going to pat you on the back for correcting the Wikipedia, but doing so is far more important than measuring its level of accuracy.

    Who do you think writes and corrects these entries? If you have better information or feel you can do a better job, help us!

    You don’t even need a login. Just do it!

    (Written by a frequent, but very careful and delicate, Wikipedia ‘corrector’)

  17. Zachary Floyd says:

    I have to agree with Doug Renfrew and PW Herman … and obliquely with Adam Shostack … although I think you did less than a third of the job. The last third involves correcting the entries. Did you?

    If you had corrected the ‘errors’ you found in the Wikipedia, rather than just write about them … then there’s no doubt the Wikipedia would have won (on your analysis, anyway).

    No-one’s going to pat you on the back for correcting the Wikipedia, but doing so is far more important than measuring its level of accuracy.

    Who do you think writes and corrects these entries? If you have better information or feel you can do a better job, help us!

    You don’t even need a login. Just do it!

    (Written by a frequent, but very careful and delicate, Wikipedia ‘corrector’)

  18. Zachary Floyd says:

    Oops … sorry about the double posting.

  19. Jake says:

    Why didn’t he correct the errors? Maybe because he did not have first-source references to base his changes on. Or maybe he didn’t have time to make proper corrections (that wouldn’t immediately be reverted by the author who previously wrote it). With a long, detailed article like the one about the MS case, it would probably take a few hours work to correct even a minor mistake in an article. Do we all have that time?

  20. pb says:

    I don’t think it’s appropriate to guilt anyone into editing Wikipedia. It’s not an obligation. Editors should want to edit.

    Everything I’ve seen written so far seems to come from the angle that Britannica must be better than Wikipedia. I’d like to see someone come from the other side: that Wikipedia is stronger due to the power of a zillion editors. Someone needs to point out that an organized free market wins out over centralization every time. That errors remain in Britannica for years. That we can analogize to Windows vs. Apache/Linux. I’m a little sick of hearing about these so-called experts and editors and copy readers at Britannica. Has anyone ever met one of these people? Does anyone really know what the process is at Britannica?

  21. Brian Kendig says:

    I’d just like to remind everyone that if you find errors or bias in a Wikipedia article, yet you don’t have the time or the facts to be able to correct them, simply click ‘Discuss this page’ and add a note saying “These parts of the page are in error, and these parts of it are biased.” That way other people with more time and facts can work on fixing the problem.

    There have been no problems reported on the “Microsoft antitrust case” article’s discussion page which haven’t been resolved. If you know of a problem but you don’t tell anyone about it, you can’t expect people to read your mind.

    You say that Britannica has a “big advantage” in the antitrust case article even though it only contains two sentences on the topic. Let me ask you this: send a student to read Britannica’s coverage of the case, and send another student to read Wikipedia’s coverage; who will come away with a better understanding of it?

    Take Wikipedia, add a large staff of paid professionals who won’t release information until it’s been approved, and you’ll have Britannica. This is what makes Britannica more expensive, less comprehensive, and less frequently updated. If you’d rather not have facts unless they have the approval of someone who’s being paid to check them, then you don’t have to use Wikipedia. If you know of a way to make Wikipedia more accurate without sacrificing its strengths, then please speak up; but condemning the entire effort because you found inaccuracies in it (and continuing to berate it for not having fixed problems you didn’t specify) isn’t helping anyone.

  22. Scott Preece says:

    pb asks about Britannica’s procedures. I can’t speak to the current staffing at Britannica, but in the past it had a large staff of professional editors and solicited its major articles from key experts in specific subject areas. Major articles in Britannica are sometimes long enough to be books.

    Articles were edited by professional editors using consistent editorial style. Professional indexers built the index (anybody who tells you that free-text keyword searching is as good as indexing by a capable indexer is wrong, though free-text searching is obviously also a critical capability).

    There are good and bad things about both models, but people have, historically, been able to assume that Britannica articles were consistently accurate.

  23. Jeremy Leader says:

    I don’t agree that “less variance in quality” is always “better”. It depends what you’re looking for. If you want a single reasonably authoritative source, then Britannica is probably “better”, because you’re unlikely to get an article that’s really egregiously wrong or biased. However, if you’re collecting information from multiple sources, then more information may be better, even if there’s a chance it’s incorrect or biased.

    One factor you didn’t take account of is Wikipedia’s “discussion” tab, which can help provide an estimate of the controversy regarding a particular article. For example, at http://en.wikipedia.org/wiki/Talk:Microsoft_antitrust_case I can see what parts of the article have provoked discussion recently, which can help indicate whether portions might be inaccurate or biased.

  24. Many-to-Many says:

    Felten on Wikipedia

    Continuing the examination of the value of the Wikipedia, Ed Felten compares Wikipedia and Britannica Overall verdict: Wikipedia’s advantage is in having more, longer, and more current entries. If it weren’t for the Microsoft-case entry, Wi…

  25. Napsterization says:

    The New Business Model Is the Old Business Model

    I’ve been thinking about why people in the top down, traditional journalism media business have so much trouble with the new distributed horizontal media ‘business’ (if you can call it that, what with all the free online media from bloggers…

  26. James Bergman says:

    Wikipedia is a false encyclopedia. Don’t use its articles as references for your posts.
    I notice that you used Wikipedia as a source; that’s like using melting ice as a paperholder.

    Another example of antidemocracy is this site (http://www.jeffdoolittle.com/archives/000178.php), where the owner (like thousands of other sites) feeds his site from Wikipedia and then blocks anybody’s criticism of Wikipedia. Behind the scheme of the Wikithing there is money and fraud. They have already begun asking for 50 thousand; soon it will be millions. If I go to an encyclopedia, it is not to edit what I am looking for but to find it, because the basic fact that makes me look for it is “that I don’t know”. This is the whole point of the existence of an encyclopedia. If not… please don’t commit intellectual abuse against all the millions of children and students who just type “encyclopedia” into their browser and find millions of sites posting the self-promoting web site Wikipedia. They will believe it is a true encyclopedia and just copy; they don’t have time to waste, as happens with the small and repetitive group of people of the Wikithing fraudulently calling itself an encyclopedia. They are harming others: my child got an “F” on his project because of this. It would be better if they called themselves just a “Project”, or Wikichat, or Wikiforumpedia. But it is fraud, big fraud, calling yourself an encyclopedia, no matter how many small-print warnings you put at the bottom. The other thousands of money-thirsty sites that post Wikipedia’s “never finished and always changeable articles” on the web do not care about who reads them. And the readers, mostly students like my child, are looking for steady and verifiable information. Rules do not change every minute in a democracy; they are there for a long time before they are changed. That is a democracy. Freely changeable things are described as variable, aperiodic, unstable, and unpredictable, and those are the characteristics of chaos, not democracy. Of course, fanatics have a way to block information against them. In the theory of systems, Wikipedia is a closed system by all means, far from being open as they claim.
    But again, it is healthy for the net not to classify them as an encyclopedia, because they are not one by any definition.
    Have a nice day.