October 20, 2017

Wikipedia Leads; Will Search Engines NoFollow?

Wikipedia has announced that all of its outgoing hyperlinks will now include the rel="nofollow" attribute, which instructs search engines to disregard the links. Search engines infer a page’s importance by seeing who links to it – pages that get many links, especially from important sites, are deemed important and are ranked highly in search results. A link is an implied endorsement: “link love”. Adding nofollow withholds Wikipedia’s link love – and Wikipedia, being a popular site, has lots of link love to give.
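To make the mechanism concrete, here is a toy sketch of link-based scoring. It is not how any real search engine ranks pages; the function name, the site names, the importance numbers, and the `honor_nofollow` switch are all assumptions made up for illustration. Each followed link simply passes its source's importance to the target, and a nofollow link passes nothing when the ranker chooses to honor it.

```python
from collections import defaultdict

def link_scores(links, importance, honor_nofollow=True):
    """Sum the importance of every page that links to a target.

    links: iterable of (source, target, nofollow) tuples.
    importance: dict mapping a source to its weight (default 1.0).
    honor_nofollow: if True, nofollow links contribute nothing.
    """
    scores = defaultdict(float)
    for source, target, nofollow in links:
        if nofollow and honor_nofollow:
            continue  # the link love is withheld
        scores[target] += importance.get(source, 1.0)
    return dict(scores)

# Hypothetical link graph: Wikipedia marks its outgoing link as nofollow.
links = [
    ("wikipedia", "small-site", True),
    ("some-blog", "small-site", False),
]
importance = {"wikipedia": 10.0, "some-blog": 1.0}

print(link_scores(links, importance))                        # {'small-site': 1.0}
print(link_scores(links, importance, honor_nofollow=False))  # {'small-site': 11.0}
```

In this toy model the small site loses most of its score once Wikipedia's link stops counting, which is the effect the post describes.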

Nofollow is intended as an anti-spam measure. Anybody can edit a Wikipedia page, so spammers can and do insert links to their unwanted sites, thereby leeching off the popularity of Wikipedia. Nofollow will reduce spammers’ incentives by depriving them of any link love. Or that’s the theory, at least. Bloggers tried using nofollow to attack comment spam, but it didn’t reduce spam: the spammers were still eager to put their spammy text in front of readers.

Is nofollow a good idea for Wikipedia? It depends on your general attitude toward Wikipedia. The effect of nofollow is to reduce Wikipedia’s influence on search engine rankings (to zero). If you think Wikipedia is mostly good, then you want it to have influence and you’ll dislike its use of nofollow. If you think Wikipedia is unreliable and random, then you’ll be happy to see its influence reduced.

As with regular love, it’s selfish to withhold link love. Sometimes Wikipedia links to a site that competes with it for attention. Without Wikipedia’s link love, the other site will rank lower, and it could lose traffic to Wikipedia. Whether intended or not, this is one effect of Wikipedia’s action.

There are things Wikipedia could do to restore some of its legitimate link love without helping spammers. It could add nofollow only to links that are suspect – links that are new, or were added by a user without a solid track record on the site, or that have not yet survived several rewrites of a page, or some combination of such factors. Even a simple policy of using nofollow for the first two weeks might work well enough. Wikipedia has the data to make these kinds of distinctions, and it’s not too much to ask for a site of its importance to do the necessary programming.
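A selective policy along these lines could be sketched roughly as follows. This is only an illustration, not Wikipedia's actual code or data model: the `LinkRecord` fields, the threshold constants (other than the two-week window mentioned above), and the function names are hypothetical.

```python
import html
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed thresholds; only the two-week window comes from the post.
PROBATION = timedelta(weeks=2)
MIN_EDITOR_EDITS = 50        # hypothetical "solid track record"
MIN_SURVIVED_REWRITES = 3    # hypothetical "several rewrites"

@dataclass
class LinkRecord:
    url: str
    added_at: datetime
    editor_edit_count: int   # prior edits by the user who added the link
    rewrites_survived: int   # page rewrites the link has outlasted

def needs_nofollow(link: LinkRecord, now: datetime) -> bool:
    """Treat a link as suspect if any one of the signals fires."""
    is_new = now - link.added_at < PROBATION
    untrusted_editor = link.editor_edit_count < MIN_EDITOR_EDITS
    untested = link.rewrites_survived < MIN_SURVIVED_REWRITES
    return is_new or untrusted_editor or untested

def render_link(link: LinkRecord, text: str, now: datetime) -> str:
    """Emit the anchor tag, adding rel="nofollow" only for suspect links."""
    rel = ' rel="nofollow"' if needs_nofollow(link, now) else ""
    href = html.escape(link.url, quote=True)
    return f'<a href="{href}"{rel}>{html.escape(text)}</a>'
```

Combining the signals with `or` is the cautious choice; a site that wanted to hand out more link love could require several signals to fire before withholding it.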

But the one element missing so far in this discussion is the autonomy of the search engines. Wikipedia is asking search engines not to assign link love, but the search engines don’t have to obey. Wikipedia is big enough, and quirky enough, that the search engines’ ranking algorithms probably have Wikipedia-specific tweaks already. The search engines have surely studied whether Wikipedia’s link love is reliable enough – and if it’s not, they are surely compensating, perhaps by ignoring (or reducing the weight of) Wikipedia links, or perhaps by a rule such as ignoring links for the first few weeks.

Whether or not Wikipedia uses nofollow, the search engines are free to do whatever they think will optimize their page ranking accuracy. Wikipedia can lead, but the search engines won’t necessarily nofollow.

Comments

  1. I have a site that someone linked to on Wikipedia a couple of years ago (the site is the #1 resource for this particular subject). My site is small and I appreciated the “link love” from a high-ranking site like Wikipedia. Therefore, I’m all for them figuring out how to properly handle the nofollow situation.

    I think that your ideas for trusted editors approving links or something like that could be implemented (and should be). Whether or not they do it, though, is another story.

  2. The search engines, of course, will never disclose how they weight links from any site, and Wikipedia is certainly big enough and popular enough that it deserves special-case treatment. An interesting challenge is whether anybody can infer the actual behavior of the search engines based on the page rank that gets assigned (or not) as a result of Wikipedia links.

    (Somewhere out there, there’s a Shakespeare sonnet about love just waiting to be morphed into a discussion of nofollow.)

  3. Well, it’s not really that the search engines have to obey – formalistically, it’s more that Wikipedia is telling the search engines that it now recommends they don’t count its external links. That has “political” weight, even if it’s not an automatic directive.

  4. Tomer Chachamu says:

    I don’t think Wikipedia has the server resources and developers to reasonably distinguish old links from new ones. They are constantly strained.

  5. This is an old argument, one which the blogosphere was actively involved in and lost. They argued on the side of unimpeded spamminess. Of all the Wikimedia Foundation projects, only the English edition failed to implement nofollow.

    And robot spamming plummeted like a rock.

    Such blatant huckstering (and being invited to attend, and speak at, a couple of blogging conferences) was among the reasons why I and many others started trimming back the time wasted surfing the blogosphere.

    You know what? I now have a subscription to the Vancouver Sun, a local dead-tree daily publication.

  6. the_zapkitty says:

    “You know what? I now have a subscription to the Vancouver Sun, a local dead-tree daily publication.”

    So, will you get the truth about issues such as, say, e-voting from the Sun?

    And how much of the truth could a dead-tree publication even hope to print and stay in business? There’s only so many column inches in a dead tree…

    It’s alright… few people can handle freedom from corporate hand-holding… at first.

    But we’ll be waiting when you’re ready to take those first steps again.

  7. What’s missing in the robots.txt file is a restriction on which robots can index the site. If site owners could restrict robots from certain IP addresses, this would reduce the spammers’ ability to stuff spam links in articles.

  8. Part of what seems wrong with nofollow, in general, is that websites are encouraged to use it to make web search engines work better (and/or reduce the misuse of the sites’ unmoderated pages), but the search engines don’t reveal how they use nofollow, and there’s no public record of how or if it works for what people think it works for.

    Nofollow is like a blackbox with indeterminate output. The results of increasing or changing the input (e.g., more links using nofollow): indeterminate. Why? It’s indeterminate.

  9. Personally I’d rather see Wikipedia put nofollow on all their internal links. THAT would stop Wikipedia from being in the top ten for just about every search term – something I’m extremely sick and tired of. Wikipedia is an interesting resource, but it’s NOT the number one most important link for anything.

  10. DensityDuck says:

    EdB: Personally, I think that the rise of Wikipedia as a resource has contributed directly to the fall in search-engine utility. After all, if you can go look it up on Wikipedia, then what’s the use of trying to look it up on Google? And so nobody cares that the first page of Google results is full of “Buy (SEARCH TERM) on iSuperDollarStore!”

  11. Random Wikidiot says:

    Even though nofollow has been implemented, there has been a big jump in spam links and very low quality links being added to articles on Wikipedia.

  12. Nofollow is not the solution. Wikipedia should do the moderation. It would encourage genuine article writers to promote their sites.