February 28, 2017

Archives for November 2008

Economic Growth, Censorship, and Search Engines

Economic growth depends on an ability to access relevant information. Although censorship prevents access to certain information, the direct consequences of censorship are well-known and somewhat predictable. For example, blocking access to Falun Gong literature is unlikely to harm a country’s consumer electronics industry. On the web, however, information of all types is interconnected. Blocking a web page might have an indirect impact reaching well beyond that page’s contents. To understand this impact, let’s consider how search results are affected by censorship.

Search engines keep track of what’s available on the web and suggest useful pages to users. No comprehensive list of web pages exists, so search providers check known pages for links to unknown neighbors. If a government blocks a page, all links from the page to its neighbors are lost. Unless detours exist to the page’s unknown neighbors, those neighbors become unreachable and remain unknown. These unknown pages can’t appear in search results — even if their contents are uncontroversial.

When presented with a query, search engines respond with relevant known pages sorted by expected usefulness. Censorship also affects this sorting process. In predicting usefulness, search engines consider both the contents of pages and the links between pages. Links here are like friendships in a stereotypical high school popularity contest: the more popular friends you have, the more popular you become. If your friend moves away, you become less popular, which makes your friends less popular by association, and so on. Even people you’ve never met might be affected.

“Popular” web pages tend to appear higher in search results. Censoring a page distorts this popularity contest and can change the order of even unrelated results. As more pages are blocked, the censored view of the web becomes increasingly distorted. As an aside, Ed notes that blocking a page removes more than just the offending material. If censors block Ed’s site due to an off-hand comment on Falun Gong, he also loses any influence he has on information security.

These effects would typically be rare and have a disproportionately small impact on popular pages. Google’s emphasis on the long tail, however, suggests that considerable value lies in providing high-quality results covering even less-popular pages. To avoid these issues, a government could allow limited individuals full web access to develop tools like search engines. This approach seems likely to stifle competition and innovation.

Countries with greater censorship might produce lower-quality search engines, but Google, Yahoo, Microsoft, and others can provide high-quality search results in those countries. These companies can access uncensored data, mitigating the indirect effects of censorship. This emphasizes the significance of measures like the Global Network Initiative, which has a participant list that includes Google, Yahoo, and Microsoft. Among other things, the initiative provides guidelines for participants regarding when and how information access may be restricted. The effectiveness of this specific initiative remains to be seen, but such measures may provide leading search engines with greater leverage to resist arbitrary censorship.

Search engines are unlikely to be the only tools adversely impacted by the indirect effects of censorship. Any tool that relies on links between information (think social networks) might be affected, and repressive states place themselves at a competitive disadvantage in developing these tools. Future developments might make these points moot: in a recent talk at the Center, Ethan Zuckerman mentioned tricks and trends that might make censorship more difficult. In the meantime, however, governments that censor information may increasingly find that they do so at their own expense.

Does Your House Need a Tail?

Thus far, the debate over broadband deployment has generally been between those who believe that private telecom incumbents should be in charge of planning, financing and building next-generation broadband infrastructure, and those who advocate a larger role for government in the deployment of broadband infrastructure. These proposals include municipal-owned networks and a variety of subsidies and mandates at the federal level for incumbents to deploy faster broadband.

Tim Wu and Derek Slater have a great new paper out that approaches the problem from a different perspective: that broadband deployments could be planned and financed not by government or private industry, but by consumers themselves. That might sound like a crazy idea at first blush, but Wu and Slater do a great job of explaining how it might work. The key idea is “condominium fiber,” an arrangement in which a number of neighboring households pool their resources to install fiber to all the homes in their neighborhoods. Once constructed, each home would own its own fiber strand, while the shared costs of maintaining the “trunk” cable from the individual homes to a central switching location would be managed in the same way that condominium and homeowners’ associations currently manage the shared areas of condos and gated communities. Indeed, in many cases the developer of a new condominium tower or planned community could lay fiber along with water and power lines, and the fiber would be just one of the shared resources that would be managed collectively by the homeowners.

If that sounds strange, it’s important to remember that there are plenty of examples where things that were formerly rented became owned. For example, fifty years ago in the United States no one owned a telephone. The phone was owned by Ma Bell and if yours broke they’d come and install a new one. But that changed, and now people own their phones and the wiring inside their homes, with your phone company owning the cable outside the home. One way to think about Slater and Wu’s “homes with tails” concept is that it’s just shifting that line of demarcation again. Under their proposal, you’d own the wiring inside your home and the line from you to your broadband provider.

Why would someone want to do such a thing? The biggest advantage, from my perspective, is that it could solve the thorny problem of limited competition in the “last mile” of broadband deployment. Right now, most customers have two options for high-speed Internet access. Getting more options using the traditional, centralized investment model is going to be extremely difficult because it costs a lot to deploy new infrastructure all the way to customers’ homes. But if customers “brought their own” fiber, then the barrier to entry would be much lower. New providers would simply need to bring a single strand of fiber to a neighborhood’s centralized point of presence in order to offer service to all customers in that neighborhood. So it would be much easier to imagine a world in which customers had numerous options to choose from.

The challenge is solving the chicken-and-egg problem: customer owned fiber won’t be attractive until there are several providers to choose from, but it doesn’t make sense for new firms to enter this market until there are a significant number of neighborhoods with customer-owned fiber. Wu and Slater suggest several ways this chicken-and-egg problem might be overcome, but I think it will remain a formidable challenge. My guess is that at least at the outset, the customer-owned model will work best in new residential construction projects, where the costs of deploying fiber will be very low (because they’ll already be digging trenches for power and water).

But the beauty of their model is that unlike a lot of other plans to encourage broadband deployment, this isn’t an all-or-nothing choice. We don’t have to convince an entire nation, state, or even city to sign onto a concept like this. All you need is a neighborhood with a few dozen early-adopting consumers and an ISP willing to serve them. Virtually every cutting-edge technology is taken up by a small number of early adopters (who pay high prices for the privilege of being the first with a new technology) before it spreads to the general public, and the same model is likely to apply to customer-owned fiber. If the concept is viable, someone will figure out how to make it work, and their example will be duplicated elsewhere. So I don’t know if customer-owned fiber is the wave of the future, but I do hope that people start experimenting with it.

You can check out their paper here. You can also check out an article I wrote for Ars Technica this summer that is based on conversations with Slater, Wu, and other pioneers in this area.

Discerning Voter Intent in the Minnesota Recount

Minnesota election officials are hand-counting millions of ballots, as they perform a full recount in the ultra-close Senate race between Norm Coleman and Al Franken. Minnesota Public Radio offers a fascinating gallery of ballots that generated disputes about voter intent.

A good example is this one:

A scanning machine would see the Coleman and Franken bubbles both filled, and call this ballot an overvote. But this might be a Franken vote, if the voter filled in both slots by mistake, then wrote “No” next to Coleman’s name.

Other cases are more difficult, like this one:

Do we call this an overvote, because two bubbles are filled? Or do we give the vote to Coleman, because his bubble was filled in more completely?

Then there’s this ballot, which is destined to be famous if the recount descends into ligitation:

[Insert your own joke here.]

This one raises yet another issue:

Here the problem is the fingerprint on the ballot. Election laws prohibit voters from putting distinguishing marks on their ballots, and marked ballots are declared invalid, for good reason: uniquely marked ballots can be identified later, allowing a criminal to pay the voter for voting “correctly” or punish him for voting “incorrectly”. Is the fingerprint here an identifying mark? And if so, how can you reject this ballot and accept the distinctive “Lizard People” ballot?

Many e-voting experts advocate optical-scan voting. The ballots above illustrate one argument against opscan: filling in the ballot is a free-form activity that can create ambiguous or identifiable ballots. This creates a problem in super-close elections, because ambiguous ballots may make it impossible to agree on who should have won the election.

Wearing my pure-scientist hat (which I still own, though it sometimes gets dusty), this is unsurprising: an election is a measurement process, and all measurement processes have built-in errors that can make the result uncertain. This is easily dealt with, by saying something like this: Candidate A won by 73 votes, plus or minus a 95% confidence interval of 281 votes. Or perhaps this: Candidate A won with 57% probability. Problem solved!

In the real world, of course, we need to declare exactly one candidate to be the winner, and a lot can be at stake in the decision. If the evidence is truly ambiguous, somebody is going to end up feeling cheated, and the most we can hope for is a sense that the rules were properly followed in determining the outcome.

Still, we need to keep this in perspective. By all reports, the number of ambiguous ballots in Minnesota is miniscule, compared to the total number cast in Minnesota. Let’s hope that, even if some individual ballots don’t speak clearly, the ballots taken collectively leave no doubt as to the winner.