
The Journal Misunderstands Content-Delivery Networks

There’s been a lot of buzz today about this Wall Street Journal article that reports on the shifting positions of some of the leading figures of the network neutrality movement. Specifically, it claims that Google, Microsoft, and Yahoo! have abandoned their prior commitment to network neutrality. It also claims that Larry Lessig has “softened” his support for network neutrality, and it implies that, because Lessig is an Obama advisor, his changing stance may portend a similar shift in the president-elect’s views, which would obviously be a big deal.

Unfortunately, the Journal seems to be confused about the contours of the network neutrality debate, and in the process it has mis-described the positions of at least two of the key players in the debate, Google and Lessig. Both were quick to clarify that their views have not changed.

At the heart of the dispute is a question I addressed in my recent Cato paper on network neutrality: do content delivery networks (CDNs) violate network neutrality? A CDN is a group of servers that improve website performance by storing content closer to the end user. The most famous is Akamai, which has servers distributed around the world and which sells its capacity to a wide variety of large website providers. When a user requests content from the website of a company that uses Akamai’s service, the user’s browser may be automatically re-directed to the nearest Akamai server. The result is faster load times for the user and reduced load on the original web server. Does this violate network neutrality? If you’ll forgive me for quoting myself, here’s how I addressed the question in my paper:

To understand how Akamai manages this feat, it’s helpful to know a bit more about what happens under the hood when a user loads a document from the Web. The Web browser must first translate the domain name (e.g., “cato.org”) into a corresponding IP address (72.32.118.3). It does this by querying a special computer called a domain name system (DNS) server. Only after the DNS server replies with the right IP address can the Web browser submit a request for the document. The process for accessing content via Akamai is the same except for one small difference: Akamai has special DNS servers that return the IP addresses of different Akamai Web servers depending on the user’s location and the load on nearby servers. The “intelligence” of Akamai’s network resides in these DNS servers.

Because this is done automatically, it may seem to users like “the network” is engaging in intelligent traffic management. But from a network router’s perspective, a DNS server is just another endpoint. No special modifications are needed to the routers at the core of the Internet to get Akamai to work, and Akamai’s design is certainly consistent with the end-to-end principle.

The success of Akamai has prompted some of the Internet’s largest firms to build CDN-style networks of their own. Google, Microsoft, and Yahoo have already started building networks of large data centers around the country (and the world) to ensure there is always a server close to each end user’s location. The next step is to sign deals to place servers within the networks of individual residential ISPs. This is a win-win-win scenario: customers get even faster response times, and both Google and the residential ISP save money on bandwidth.
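
Stepping outside the quotation for a moment: the DNS-based redirection described above is easy to observe for yourself. A CDN-hosted name simply resolves to different addresses for different users at different times, while the routers in between treat the resulting traffic like any other packets. Here’s a minimal sketch (the hostname is just an illustrative stand-in for any CDN-hosted site):

```python
# Minimal sketch: resolving a CDN-hosted hostname is an ordinary DNS lookup;
# the "intelligence" lies in which addresses the CDN's DNS servers hand back.
# (The hostname below is only an illustrative placeholder.)
import socket

hostname = "www.example.com"   # substitute any CDN-hosted site
addresses = sorted({info[4][0] for info in socket.getaddrinfo(hostname, 80)})
print(hostname, "resolves (for me, right now) to:", addresses)
# Run the same lookup from another city or network and you will often get a
# different, nearer set of servers. No router in the middle does anything special.
```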

The Journal apparently got wind of this arrangement and interpreted it as a violation of network neutrality. But this is a misunderstanding of what network neutrality is and how CDNs work. Network neutrality is a technical principle about the configuration of Internet routers. It’s not about the business decisions of network owners. So if Google signs an agreement with a major ISP to get its content to customers more quickly, that doesn’t necessarily mean that a network neutrality violation has occurred. Rather, we have to look at how the speed-up was accomplished. If, for example, it was accomplished by upgrading the network between the ISP and Google, network neutrality advocates would have no reason to object. In contrast, if the ISP accomplished it by re-configuring its routers to route Google’s packets in preference to those from other sources, that would be a violation of network neutrality.

The Journal article had relatively few details about the deal Google is supposedly negotiating with residential ISPs, so it’s hard to say for sure which category it’s in. But what little description the Journal does give us—that the agreement would “place Google servers directly within the network of the service providers”—suggests that the agreement would not violate network neutrality. And indeed, over on its public policy blog, Google denies that its “edge caching” network violates network neutrality and reiterates its support for a neutral Internet. Don’t believe everything you read in the papers.

Election Transparency Project Finds Ballot-Counting Bug

Yesterday, Kim Zetter at Wired News reported an amazing e-voting story about lost ballots and the public advocates who found them.

Here’s a summary: Humboldt County, California has an innovative program that puts scanned images of all the optical-scan ballots cast in the county on the Internet. In the online archive, citizens found 197 ballots that were not included in the official results of the November election. Investigation revealed that the ballots disappeared from the official count due to a programming error in central tabulation software supplied by Premier (formerly known as Diebold), the county’s e-voting vendor.

The details of the programming error are jaw-dropping. Here is Zetter’s deadpan description:

Premier explained that due to a programming problem, the first “deck” or batch of ballots that are counted by the GEMS software sometimes gets randomly deleted if any subsequent deck is intentionally deleted. The GEMS system names the first deck of ballots “deck 0”, with subsequent batches called “deck 1,” “deck 2,” etc. For some reason “deck 0” is sometimes erased from the system if any other deck is erased. Since it’s common for officials to intentionally erase a deck in the normal counting process if they’ve made an error and want to rescan a deck, the chance that a GEMS system containing this flaw will delete a batch of ballots is pretty high.

The system never provides any indication to election officials when it’s deleting a batch of ballots in this manner. The problem occurs with version 1.18.19 of the GEMS software, though it’s possible that other versions have the problem as well. [County election director Carolyn] Crnich said an official in the California secretary of state’s office told her the problem was still prevalent in version 1.18.22 of Premier’s software and wasn’t fixed until version 1.18.24.

Neither Premier nor the secretary of state’s office, which certifies voting systems for use in the state, has returned calls for comment about this.

After examining Humboldt’s database, Premier determined that the “deck 0” in Humboldt was deleted at some point in between processing decks 131 and 135, but so far Crnich has been unable to determine what caused the deletion. She said she did at one point abort deck 132, instead of deleting it, when she made a mistake with it, but that occurred before election day, and the “deck 0” batch of ballots was still in the system on November 23rd, after she’d aborted deck 132. She couldn’t recall deleting any other deck after election night or after the 23rd that might have caused “deck 0” to disappear in the manner Premier described.

The deletion of “deck 0” wasn’t the only problem with the GEMS system. As I mentioned previously, the audit log not only didn’t show that “deck 0” had been deleted, it never showed that the deck existed in the first place.

The system creates a “deck 0” for each ballot type that is scanned. This means, the system should have three “deck 0” entries in the log — one for vote-by-mail ballots, one for provisional ballots, and one for regular ballots cast at the precinct. Crnich found that the log did show a “deck 0” for provisional ballots and precinct-cast ballots but none for vote-by-mail ballots, even though the machine had printed a receipt at the time that an election worker had scanned the ballots into the machine. In fact, the regular audit log provides no record of any files that were deleted, including deck 132, which she intentionally deleted. She said she had to go back to a backup of the log, created before the election, to find any indication that “deck 0” had ever been created.
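
Without access to Premier’s source code, any explanation of the underlying defect is guesswork. But as a purely hypothetical illustration of how a bug with exactly this signature can creep in, consider a deletion routine that treats a deck number of zero as if it meant “no deck”:

```python
# Purely hypothetical sketch, NOT Premier's actual code. It shows how a
# deletion routine that treats deck number 0 as "missing" could silently
# drop the first deck whenever any other deck is deleted.
decks = {0: ["first batch of ballot images"], 1: ["..."], 2: ["..."]}
audit_log = []

def delete_deck(target):
    audit_log.append(f"deleted deck {target}")
    # Buggy "cleanup" pass: `if n` is False for deck 0, so deck 0 is quietly
    # discarded along with the intended target, and nothing about it is ever
    # written to the audit log.
    kept = {n: d for n, d in decks.items() if n and n != target}
    decks.clear()
    decks.update(kept)

delete_deck(2)
print(sorted(decks))   # -> [1]; deck 0 is gone, but the log only mentions deck 2
print(audit_log)
```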

I don’t know which is more alarming: that the vendor failed to treat as an emergency a programming error that silently deletes ballots, or that the tabulator’s “audit log” looks more like an after-the-fact reconstruction of what-must-have-happened rather than a log of what actually did happen.

The good news here is that Humboldt County’s opening of election records to the public paid off, when members of the public found important facts in the records that officials and the vendor had missed. If other jurisdictions opened their records, how many more errors would we find and fix?

On the future of voting technologies: simplicity vs. sophistication

Yesterday, I testified before a hearing of Colorado’s Election Reform Commission. I made a small plug, at the end of my testimony, for a future generation of electronic voting machines that would use crypto machinery for end-to-end / software-independent verification. Normally, the politicos tend to ignore this and focus on the immediately actionable stuff (e.g., current-generation DREs are unacceptably insecure; optical-scan is the best thing presently on the market). Not this time. I got a bunch of questions asking me to explain how a crypto voting system can be verifiable, how you can prove that the machine is behaving properly, and so forth. Pretty amazing. What I realized, however, is that it’s really hard to explain crypto machinery to non-CS people. I did my best, but it was clear from conversations afterward that a few minutes of Q&A did little to give them any confidence that crypto voting machinery really works.

Another of the speakers, Neil McBurnett, was talking about doing variable sampling-rate audits (as a function of how close the tally is). Afterward, he lamented to me, privately, how hard it is to explain basic concepts like what it means for something to be “statistically significant.”
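
For what it’s worth, the underlying arithmetic is not hard to demonstrate. Here’s a rough back-of-the-envelope sketch (my own illustration, not McBurnett’s actual procedure): to confirm an outcome you need a good chance of catching at least one miscounted ballot among the minimum number it would take to flip the result, and the required sample grows quickly as the margin shrinks.

```python
# Rough illustration (not Neil McBurnett's actual procedure): how many ballots
# must be audited so that, with probability 1 - risk, we'd see at least one of
# the miscounted ballots if enough of them existed to flip the outcome?
# (Simple with-replacement approximation; real audit math is more careful.)
import math

def ballots_to_audit(total_ballots, margin_fraction, risk=0.05):
    # To flip the outcome, at least this many ballots must have been miscounted.
    flippers = max(1, math.ceil(total_ballots * margin_fraction / 2))
    # Chance that a single sampled ballot misses all of them:
    miss = 1 - flippers / total_ballots
    # Smallest n such that miss**n <= risk.
    return math.ceil(math.log(risk) / math.log(miss))

for m in (0.10, 0.01, 0.001):
    print(f"margin {m:.1%}: audit about {ballots_to_audit(1_000_000, m)} ballots")
```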

There’s a clear common theme here. How do we explain to the public the basic scientific theories that underlie the problems that voting systems face? My written testimony (reused from an earlier hearing in Texas) includes links to papers, and some people will follow up. Others won’t. My big question is whether we have a research challenge to invent progressively simpler systems that still have the right security properties, or whether we have an education challenge to explain that a certain amount of complexity is worthwhile for the good properties that can be achieved. (Uglier question: is it a desirable goal to weaken the security properties in return for greater simplicity? What security properties would you sacrifice?)

Certainly, with our own VoteBox system, which uses a variation on Benaloh’s voter-initiated ballot challenge mechanism, one of the big open questions is whether real voters, who just want to cast their votes and don’t care about the security mechanisms, will be tripped up by the extra question at the end that’s fundamental to the mechanism. We’re going to need to run human subject tests against these aspects of the machine design, and if they fail in practice, it’s going to be a trip back to the drawing board.

[Sidebar: I’m co-teaching a class on elections with Bob Stein (a political scientist) and Mike Byrne (a psychologist). The students are a mix of Rice undergrads, most of whom aren’t computer scientists. I experimentally built a lecture that began by teaching just enough number theory to explain how El Gamal cryptography works and how it allows for homomorphic vote tallying. Then I described how VoteBox uses this mechanism, and wrapped up with an explanation of how to do Benaloh-style challenges. I left out a lot of details, like how you generate large prime numbers, or how you construct NIZK proofs, but I seemed to have the class along with me for the lecture. If I can sell the idea of end-to-end cryptographic mechanisms to undergraduate non-science students, then there may yet be some hope.]
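
For readers who want the flavor of that lecture without the number theory, here is a toy sketch of homomorphic tallying with exponential ElGamal. The parameters are deliberately tiny and insecure, and this is my own illustration rather than VoteBox’s actual code: each ballot is encrypted as g raised to the vote, masked by randomness; ciphertexts are multiplied together; and only the final product is decrypted, so the tally emerges without decrypting any individual ballot.

```python
# Toy exponential ElGamal with tiny parameters. Illustration only, not secure,
# and not VoteBox's actual implementation.
import random

p = 1000003          # small prime (real systems use huge primes or elliptic curves)
g = 2                # generator for this toy demo
x = random.randrange(2, p - 2)   # private key held by the election authority
h = pow(g, x, p)                 # public key

def encrypt(vote):
    """Encrypt a 0/1 vote as exponential ElGamal: (g^r, g^vote * h^r)."""
    r = random.randrange(2, p - 2)
    return (pow(g, r, p), (pow(g, vote, p) * pow(h, r, p)) % p)

def add(c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintext votes."""
    return ((c1[0] * c2[0]) % p, (c1[1] * c2[1]) % p)

def decrypt_tally(c, max_votes):
    """Decrypt, then brute-force the small discrete log to recover the total."""
    a, b = c
    gv = (b * pow(a, p - 1 - x, p)) % p   # g^total = b / a^x
    for total in range(max_votes + 1):
        if pow(g, total, p) == gv:
            return total
    raise ValueError("tally out of range")

votes = [1, 0, 1, 1, 0, 1]               # six voters
running = encrypt(0)
for v in votes:
    running = add(running, encrypt(v))
print(decrypt_tally(running, len(votes)))  # prints 4 without opening any single ballot
```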

Discerning Voter Intent in the Minnesota Recount

Minnesota election officials are hand-counting millions of ballots, as they perform a full recount in the ultra-close Senate race between Norm Coleman and Al Franken. Minnesota Public Radio offers a fascinating gallery of ballots that generated disputes about voter intent.

A good example is this one:

A scanning machine would see the Coleman and Franken bubbles both filled, and call this ballot an overvote. But this might be a Franken vote, if the voter filled in both slots by mistake, then wrote “No” next to Coleman’s name.

Other cases are more difficult, like this one:

Do we call this an overvote, because two bubbles are filled? Or do we give the vote to Coleman, because his bubble was filled in more completely?

Then there’s this ballot, which is destined to be famous if the recount descends into litigation:

[Insert your own joke here.]

This one raises yet another issue:

Here the problem is the fingerprint on the ballot. Election laws prohibit voters from putting distinguishing marks on their ballots, and marked ballots are declared invalid, for good reason: uniquely marked ballots can be identified later, allowing a criminal to pay the voter for voting “correctly” or punish him for voting “incorrectly”. Is the fingerprint here an identifying mark? And if so, how can you reject this ballot and accept the distinctive “Lizard People” ballot?

Many e-voting experts advocate optical-scan voting. The ballots above illustrate one argument against opscan: filling in the ballot is a free-form activity that can create ambiguous or identifiable ballots. This creates a problem in super-close elections, because ambiguous ballots may make it impossible to agree on who should have won the election.

Wearing my pure-scientist hat (which I still own, though it sometimes gets dusty), this is unsurprising: an election is a measurement process, and all measurement processes have built-in errors that can make the result uncertain. This is easily dealt with, by saying something like this: Candidate A won by 73 votes, plus or minus a 95% confidence interval of 281 votes. Or perhaps this: Candidate A won with 57% probability. Problem solved!
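
To make that concrete, here is a toy sketch with made-up numbers (not Minnesota’s actual figures) that treats each ambiguous ballot as a coin flip and estimates the probability that the nominal leader really won:

```python
# A minimal sketch of treating an election as a noisy measurement.
# The margin and ambiguous-ballot count below are assumed, illustrative numbers.
import random

def win_probability(margin, ambiguous, p_a=0.5, trials=10_000):
    """Estimate the chance candidate A really won, given a nominal margin among
    unambiguous ballots and a pile of ambiguous ballots, each of which would go
    to A with probability p_a if we could discern the voter's true intent."""
    wins = 0
    for _ in range(trials):
        a_extra = sum(random.random() < p_a for _ in range(ambiguous))
        if margin + a_extra - (ambiguous - a_extra) > 0:
            wins += 1
    return wins / trials

# With a 73-vote lead and only 300 coin-flip ballots, this prints essentially 1.0:
# the ambiguous ballots are too few to cast doubt on the outcome.
print(win_probability(margin=73, ambiguous=300))
```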

In the real world, of course, we need to declare exactly one candidate to be the winner, and a lot can be at stake in the decision. If the evidence is truly ambiguous, somebody is going to end up feeling cheated, and the most we can hope for is a sense that the rules were properly followed in determining the outcome.

Still, we need to keep this in perspective. By all reports, the number of ambiguous ballots in Minnesota is minuscule compared to the total number cast. Let’s hope that, even if some individual ballots don’t speak clearly, the ballots taken collectively leave no doubt as to the winner.

The future of photography

Several interesting things are happening in the wild world of digital photography as it’s colliding with digital video. Most notably, the new Canon 5D Mark II (roughly $2700) can record 1080p video and the new Nikon D90 (roughly $1000) can record 720p video. At the higher end, Red just announced cameras, due to ship next year, that will be able to record video (as fast as 120 frames per second in some cases) at resolutions far greater than HD (for $12K, you can record video at a staggering 6000×4000 pixels). You can configure a Red camera as a still camera or as a video camera.

Recently, well-known photographer Vincent Laforet (perhaps best known for his aerial photographs, such as “Me and My Human”) got his hands on a pre-production Canon 5D Mark II and filmed a “mock commercial” called “Reverie”, which shows off what the camera can do, particularly its see-in-the-dark low-light abilities. If you read Laforet’s blog, you’ll see that he’s quite excited, not just about the technical aspects of the camera, but about what this means to him as a professional photographer. Suddenly, he can leverage all of the expensive lenses that he already owns and capture professional-quality video “for free.” This has all kinds of ramifications for what it means to cover an event.

For example, at professional sporting events, video rights are entirely separate from the “normal” still photography rights given to the press. It’s now the case that every pro photographer is every bit as capable of capturing full resolution video as the TV crew covering the event. Will still photographers be contractually banned from using the video features of their cameras? Laforet investigated while he was shooting the Beijing Olympics:

Given that all of these rumours were going around quite a bit in Beijing [prior to the announcement of the Nikon D90 or Canon 5D Mark II] – I sat down with two very influential people who will each be involved at the next two Olympic Games. Given that NBC paid more than $900 million to acquire the U.S. Broadcasting rights to this past summer games, how would they feel about a still photographer showing up with a camera that can shoot HD video?

I got the following answer from the person who will be involved with Vancouver which I’ll paraphrase: Still photographers will be allowed in the venues with whatever camera they chose, and shoot whatever they want – shooting video in it of itself, is not a problem. HOWEVER – if the video is EVER published – the lawsuits will inevitably be filed, and credentials revoked etc.

This to me seems like the reasonable thing to do – and the correct approach. But the person I spoke with who will be involved in the London 2012 Olympic Games had a different view, again I paraphrase: “Those cameras will have to be banned. Period. They will never be allowed into any Olympic venue” because the broadcasters would have a COW if they did. And while I think this is not the best approach – I think it might unfortunately be the most realistic. Do you really think that the TV producers and rights-owners will “trust” photographers not to broadcast anything they’ve paid so much for. Unlikely.

Let’s do a thought experiment. Red’s forthcoming “Scarlet FF35 Mysterium Monstro” will happily capture 6000×4000 pixels at 30 frames per second. If you multiply that out, assuming 8 bits per pixel (after modest compression), you’re left with the somewhat staggering data rate of 720MB/s (i.e., 2.6TB/hour). Assuming you’re recording that to the latest 1.5TB hard drives, that means you’re swapping media every 30 minutes (or you’re tethered to a RAID box of some sort). Sure, your camera now weighs more and you’re carrying around a bunch of hard drives (still lost in the noise relative to the weight that a sports photographer hauls around in those long telephoto lenses), but you manage to completely eliminate the “oops, I missed the shot” issue that dogs any photographer. Instead, the “shoot” button evolves into more of a bookmarking function. “Yeah, I think something interesting happened around here.” It’s easy to see photo editors getting excited by this. Assuming you’ve got access to multiple photographers operating from different angles, you can now capture multiple views of the same event at the same time. With all of that data, synchronized and registered, you could even do 3D reconstructions (made famous/infamous by the “bullet time” videos used in the Matrix films or the Gap’s Khaki Swing commercial). Does the local newspaper have the rights to do that to an NFL game or not?
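
The arithmetic above is easy to check (assuming roughly 8 bits per pixel after compression, and decimal megabytes and terabytes):

```python
# Back-of-envelope check of the data-rate arithmetic above.
width, height, fps = 6000, 4000, 30
bytes_per_pixel = 1                      # ~8 bits/pixel after modest compression
rate = width * height * bytes_per_pixel * fps          # bytes per second
print(rate / 1e6, "MB/s")                               # -> 720.0 MB/s
print(rate * 3600 / 1e12, "TB/hour")                    # -> ~2.6 TB/hour
print(1.5e12 / rate / 60, "minutes per 1.5 TB drive")   # -> ~35 minutes, i.e. a
                                                        #    media swap roughly every half hour
```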

Of course, this sort of technology is going to trickle down to gear that mere mortals can afford. Rather than capturing every frame, maybe you now only keep a buffer of the last ten seconds or so, and when you press the “shoot” button, you get to capture the immediate past as well as the present. Assuming you’ve got a sensor that lets you change the exposure on the fly, you can also now imagine a camera capturing a rapid succession of images at different exposures. That means no more worries about whether you over- or under-exposed your image. In fact, the camera could just glue all the images together into a high-dynamic-range (HDR) image, which sometimes yields fantastic results.
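
As a sketch of that last idea, here is a minimal exposure-bracket merge. It assumes a linear sensor response and a simple hat-shaped weighting; real HDR pipelines calibrate the camera’s response curve and follow up with tone mapping.

```python
# A minimal sketch of merging an exposure bracket into an HDR radiance map.
# Assumes a linear sensor response; real cameras need response-curve calibration.
import numpy as np

def merge_hdr(images, exposure_times):
    """images: list of float arrays scaled to [0, 1]; exposure_times: seconds per shot."""
    num = np.zeros_like(images[0])
    den = np.zeros_like(images[0])
    for img, t in zip(images, exposure_times):
        # Weight mid-range pixels most; clipped shadows/highlights contribute little.
        w = 1.0 - np.abs(img - 0.5) * 2.0
        num += w * (img / t)          # each exposure gives an estimate of scene radiance
        den += w
    return num / np.maximum(den, 1e-6)

# Usage with fake data: three brackets of the same 2x2 scene at different shutter speeds.
scene = np.array([[0.01, 0.1], [0.5, 2.0]])          # "true" relative radiance
times = [1/1000, 1/125, 1/15]
brackets = [np.clip(scene * t * 100, 0, 1) for t in times]
print(merge_hdr(brackets, times))                    # recovers the scene up to a scale factor
```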

One would expect, in the cutthroat world of consumer electronics, that competition would bring features like this to market as fast as possible, although that’s far from a given. If you install third-party firmware on a Canon point-and-shoot, you get all kinds of functionality that the hardware can support but which Canon has chosen not to implement. Maybe Canon would rather you spend more money for more features, even if the cheaper hardware is perfectly capable. Maybe they just want to make common features easy to use and not overly clutter the UI. (Not that any camera vendors are doing particularly well on ease of use, but that’s a topic for another day.)

Freedom to Tinker readers will recognize some common themes here. Do I have the right to hack my own gear? How will new technology impact old business models? In the end, when industries collide, who wins? My fear is that the creative freelance photographer, like Laforet, is likely to get pushed out by the big corporate sponsor. Why allow individual freelancers to shoot a sports event when you can just spread professional video cameras all over the place and let newspapers buy stills from those video feeds? Laforet discussed these issues at length; his view is that “traditional” professional photography, as a career, is on its way out and the future is going to be very, very different. There will still be demand for the kind of creativity and skills that a good photographer can bring to the game, but the new rules of the game have yet to be written.