
Archives for 2008

Hot Custom Car (software?)

I’ve found Tim’s bits on life post-driving interesting. I’ve sometimes got a one-track mind, though- so what I really want to know is if I’ll be able to hack on that self-driving car. I mentioned this to Tim, and he said he wasn’t sure either- so here is my crack at it.

We’re not very good at making choices like this. Historically, liability constrained software development at large institutions (the airlines had a lot of reasons not to let people hack on their airplanes), and benign neglect was sufficient to regulate hacking of personal software- if you hacked your PC or toaster, no one cared because it had no impact (a form of Lessig’s regulation by architecture). The net result was that we didn’t need to regulate software very much, we got lots of innovation from individual developers, and we stayed bad at making choices like ‘how should we regulate people’s ability to hack?’

Individuals are now beginning to own hackable devices that can also harm the neighbors, though, so the space in between large institution and isolated hacker is filling up. For example, the FCC regulates your ability to modify your own wireless devices, so that you can’t interfere with other people’s spectrum. And some of Prof. Jonathan Zittrain’s analysis suggests that we might even want to regulate PCs, since they can now frequently be vectors for spam and viruses. Tim and I are normally fairly anti-regulation, and pro-open source, but even we are aware that cars running around all over the place driven by potentially untested code might also fit in this gap- and be more worthy of regulation.

So what should happen? Should we be able to hack our cars (more than we already do), and if so, under what conditions?

It’d help if we could better measure the risks and benefits involved. Unfortunately, probably because we regulate software so rarely, our metrics for assessing the risks and benefits of software development aren’t very good. One such metric is Prof. Zittrain’s ‘generativity’; Dan Wallach’s proposal to measure the ‘O(n)’ of potential system damage is another. Neither is a perfect fit here, but that only confirms that we need more such tools in our software policy toolkit.

This lack of tools shouldn’t stop us from some basic, common-sense analysis, though. On the pro side, the standard arguments for open source apply, though perhaps not as strongly as usual, since many casual hackers might be discouraged at the thought of hacking their own car. We probably would want car manufacturers to pool their safety expertise, which would be facilitated by openness. Finally, we might also want open code for auditing reasons- with millions of lives on the line, this seems like a textbook case for wanting ‘many eyes’ to take a look at the code.

If we accept these arguments on the ‘pro’ hacking side, what then? First, we could require that the car manufacturers use test-driven development, and share those tests with the public- perhaps even allowing the public to add new tests. This would help avoid serious safety problems in the ‘original’ code, and home hackers might be blocked from loading new code into their cars unless the code was certified to have passed the tests. Second, we could treat the consequences very seriously- ‘driving’ with bad code could be treated similarly to DUI. Third, we could make sure that the safety fallbacks (emergency brake systems, etc.) are in separate, redundant (and perhaps only mechanical?) unhackable systems. Having such systems is going to be good engineering whether the code is open or not, and making them unhackable might be a good compromise. (Serious engineers, instead of compsci BAs now in law school, should feel free to suggest other techniques in the comments.)
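To make the test-gating idea a bit more concrete, here is a minimal sketch (hypothetical from top to bottom: the shared tests directory, the names, and the decision to merely print rather than actually flash anything are all stand-ins, not anyone’s real proposal) of what ‘certified to have passed the shared tests’ might look like, using Python’s standard unittest discovery:

```python
import unittest

def passes_shared_safety_tests(test_dir="tests"):
    """Run the (hypothetical) publicly shared test suite and report whether
    the candidate car code passed every test. Assumes a tests/ directory of
    standard unittest test cases exists alongside this script."""
    suite = unittest.TestLoader().discover(test_dir)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()

if __name__ == "__main__":
    if passes_shared_safety_tests():
        print("All shared safety tests passed; loading the new code would be permitted.")
    else:
        print("Test failures; the car should refuse to load this code.")
```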

Bottom line? First, we don’t really know- we just have pretty poor analytical tools for this type of problem. But if we take a stab at it, we can see that there are some potential solutions that might be able to harness the innovation and generativity of open source in our cars without significantly compromising our safety. At least, not compromising it any more so than the already crazy core idea 🙂

[picture is ‘Car Show 2’, by starmist1, used under the CC-BY license.]

Life after Driving

I’m working on a three-part series on self-driving automobile technology for Ars Technica. In part one I covered the state of existing self-driving technology and highlighted the dramatic progress that has been made in recent years. In part two, I assume that the remaining technical hurdles can be surmounted and examine what the world might look like when self-driving cars become ubiquitous. The potential benefits are enormous: autonomous vehicles could save thousands of lives, billions of person-hours, and billions of dollars of energy costs.

The article has sparked interesting discussion around the blogosphere. Matt Yglesias has a long-standing interest in urban planning issues, so he did a post about the urban planning implications of self-driving technologies. I argue that by making taxis cheaper, self-driving cars would shift a lot of people from owning cars to renting them. And that, in turn, would dramatically reduce demand for parking lots, which would allow more pleasant, high-density cities. It’s hard to overstate the extent to which the need for parking exacerbates sprawl and congestion problems. Parking lots consume vast amounts of land in suburban areas. This, in turn, means that stuff is farther apart, which forces people to rely even more on their cars to get from place to place.

Matt’s post prompted a number of interesting responses. Ryan Avent chimed in with some thoughts about how self-driving technologies would make urban living more attractive. On the other hand, Tom Lee offers a counterpoint: making car travel cheaper and more convenient will, on the margin, cause people to drive (or “ride” anyway) more. This is a good point, and it’s not clear how these factors would balance out. But even if Tom is right, this wouldn’t be an entirely bad thing. Increased mobility is a virtue in its own right.

I think Atrios and Kevin Drum are on less firm ground when they argue that this technology is so far in the future that it’s not worth thinking about. Drum compares self-driving technologies to cold fusion and human-level AI, while Atrios compares them to flying cars and jet packs. I can only assume they didn’t read the first installment of my series, in which I discuss the state of the technology in some detail. The basic technology for self-driving is already here. There are cars in university laboratories that can navigate for hundreds of miles without human supervision, and can interact safely with other cars on urban streets. Of course, there’s still a lot of work to do to enable these vehicles to safely handle the multiplicity of obstacles they would encounter in real urban environments. And after that the technology will need to be made reliable and affordable enough for commercial use. But these problems are nowhere close to the difficulty of human-level AI. Your car doesn’t have to understand why you want to go to the store in order to find a safe path from here to there. If you’re skeptical that this technology can be made to work, I encourage you to read my first article and watch PBS’s excellent documentary on the 2005 DARPA Grand Challenge. There’s a lot of uncertainty about how long it will take for this technology to be mature enough to let loose on our streets, but I think it’s pretty clearly a matter of “when,” not “if.”

Cloud(s), Hype, and Freedom

Richard Stallman’s recent description of ‘the cloud’ as ‘hype’ and a ‘trap’ seems to have stirred up a lot of commentary, but not a lot of clear discussion of the problems Stallman raised. This isn’t surprising- the term ‘the cloud’ has always been vague. (It was hard to resist saying ‘cloudy.’ 😉) When people say ‘the cloud’ they are really lumping at least four ‘cloud types’ together.

traditional applications, hosted elsewhere

Probably the most common type of ‘cloud’ is a service that takes traditional software functionality and moves it to remotely hosted, (typically) web-delivered servers. Gmail and salesforce.com are like this- fairly traditional email and CRM applications, ‘just’ moved to the web.

If Stallman’s ‘hype’ claim is valid anywhere, it is here. Administration and maintenance costs are definitely lower when an expert like Google funds and runs the server, and reliability may improve as well. But the core functionality of these apps, and the ability to access data over a network, have been present since the dawn of networked computing. On average, this is undoubtedly a significant change in quality, but only rarely a change in type- making the buzz much harder to justify.

Stallman’s ‘trap’ charge is more complex. Computer users have long compromised on personal control by storing data remotely but accessing it via standardized protocols. This introduced risks- you had to trust the data host and couldn’t tinker with the server- but kept some controls- you could switch clients, and typically you could export the data. Some web apps still strike that balance- for example, most gmail features are accessible via good old POP and IMAP. But others don’t.
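To make that standards-based escape hatch concrete, here is a minimal Python sketch that pulls the most recent message from a gmail inbox over plain IMAP- the point being that any IMAP client could do the same, which is what lets you switch clients or walk away with your data. The address and password below are placeholders, and (a newer wrinkle) gmail these days generally expects an app-specific password or OAuth token rather than your account password:

```python
import imaplib
import email

# Talk to gmail over the standard IMAP protocol; nothing here is
# gmail-specific except the hostname.
conn = imaplib.IMAP4_SSL("imap.gmail.com")
conn.login("you@gmail.com", "app-password")   # placeholders
conn.select("INBOX", readonly=True)

# Find the most recent message and print its subject line.
status, data = conn.search(None, "ALL")
latest_id = data[0].split()[-1].decode()
status, msg_data = conn.fetch(latest_id, "(RFC822)")
message = email.message_from_bytes(msg_data[0][1])
print(message["Subject"])

conn.logout()
```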

Getting your data out of a service like salesforce can be a ‘hidden cost’ of an apparently free service, and even with a relatively standards-based service like gmail you have no freedom to make changes to the server. These risks are what Stallman means when he talks about a ‘trap’, and regardless of your conclusion about them, understanding them is important.

services involving data that can’t (yet) be managed locally

Google Maps and Google Search are the canonical examples of this type of cloud service- heaps of data so large that you would need a large data center to host your own copy and a very, very fat pipe to keep it up-to-date.

Hype-wise, these are a mixed bag. These services definitely bring radical new functionality that couldn’t traditionally exist- I can’t store all of google maps on my phone. That hype is justified. At the same time, our personal ability to store and process data is still growing quickly, so the claims that this type of cloud service will always ‘require’ remote servers may be overblown.

‘Trap’-wise? Dependence on these services reminds me of ‘dependence’ on a library before the internet- you can work to make sure your library respects your privacy, prefer public libraries to private ones, or establish a personal library if your reading interests are narrow, but in the end eschewing large libraries is likely to be a case of cutting off your nose to spite your face. We’re in the same state with this type of cloud service. You can avoid them, but those concerned with freedom might be better off understanding and fixing them than condemning them altogether.

services that make creation of new data technically or economically feasible

Facebook and wikipedia are the canonical examples here. Unlike the first two types of cloud, where data was available but inconvenient before it ended up in the cloud, this class of cloud applications creates information that wasn’t previously feasible to collect at all.

There may well not be enough hype around this type of cloud. Replicating web-scale collaborative facilities like these in a p2p fashion will be very difficult, and the impact of the creation of new information (even when it is as mundane as facebook’s data often is) is hard to overstate.

Like the previous type of cloud, it is hard to call these a trap per se- they do make it hard to leave, but they do so by providing new functionality that is very hard to get with any traditional software model.

services offering computing and storage, rather than data

The most recent type of cloud service is remotely provisioned computing and storage, like Amazon’s EC2/S3 and Google’s App Engine. This is perhaps the most purely generative type of cloud, allowing individuals to create new services and scale them out to serve millions of people without having to invest in their own physical infrastructure. It is hard to see any way in which this can reasonably be called ‘hype,’ given that it gives individuals and small or transient groups a reach that would otherwise cost them many thousands of dollars.

From a freedom perspective, these can be both the best and worst of the cloud types. On the plus side, these services can be incredibly transparent- developers who use them directly have access to their own source code, and end users may not know they are using them at all. On the down side, especially for proprietary platforms like App Engine, these can have very deep lock-in- it is complicated, expensive, and risky to switch deployment platforms after achieving success. And they replace traditional, very open platforms- a tradeoff that isn’t always appreciated.

takeaways

‘The cloud’ isn’t going away, but hopefully we can clarify our thinking about it by talking about the different types of clouds. Hopefully this post is a useful step in that direction.

[This post is an extension of some ideas I’ve been playing around with on my own blog and at the autonomo.us group blog; readers curious about these issues may want to read further in those places. I also recommend reading this piece, which set me on the (very long) road to this particular post.]

Why is printing so hard?

Recently I bought a mildly used laser printer and wanted to set it up on my home network. In a better world, this would be a trivial exercise — just connect the printer to the network and let the computers discover it. In the actual world, it was a forty-five minute project that only a reasonably handy network jockey could have hoped to complete. (If you care about what exactly I had to do, see below.)

John Hartman says, “Printing is the hardest problem in computer science.” It often seems that way. But why?

Plug-and-play printing seems pretty simple, compared to many of the things that computers do routinely without trouble. Granted, it’s not trivial to get the full variety of printers to work with the full variety of computers, but our collective failure to do so is — or should be — surprising.

There must be some lesson here about engineering, or human nature, or something. Lately I’ve gone around asking people why printing is so hard. I’ve gotten some interesting answers, but I don’t think I really understand the issue yet.

What do you think? Why is printing so hard?

[For the record, here’s what I had to do to get our newly acquired HP LaserJet 2200DN printer working on our home network: I plugged the printer into our network, but the Windows PCs couldn’t auto-discover the printer. I Googled the printer’s user manual, which said the printer had a built-in webserver. But I didn’t know the printer’s IP address, so I had to log in to our router and look at its DHCP tables. Knowing the IP address, I could connect to the printer’s webserver, which had a page telling me what URL to use for IPP printing. (I had to know what IPP was.) After that, I assigned the printer a static IP address, so the IPP URL (containing an IP address) would keep working across reboots. Now that I had a stable IPP URL, I could set up the PCs for printing. Finally, I had to guess which driver to use on Windows — two drivers were offered, with no advice about which one to use, but only one of them supports duplex printing. Total elapsed time: about 45 minutes.]
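For the terminally curious, here is a rough sketch of the discovery step I ended up doing by hand, written as a small Python script that sweeps a home subnet looking for anything listening on the standard IPP port (631). The 192.168.1.x subnet is an assumption- substitute your own- and the suggested ipp:// URL is only a likely guess, since the exact path varies from printer to printer:

```python
import socket

SUBNET = "192.168.1."   # assumed home subnet; adjust for your own network
IPP_PORT = 631          # standard Internet Printing Protocol port

def find_ipp_hosts(subnet=SUBNET, port=IPP_PORT, timeout=0.2):
    """Return the addresses on the subnet that accept TCP connections on the IPP port."""
    hosts = []
    for last_octet in range(1, 255):
        addr = subnet + str(last_octet)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((addr, port)) == 0:
                hosts.append(addr)
    return hosts

if __name__ == "__main__":
    for addr in find_ipp_hosts():
        # A plausible IPP URL; the exact path varies by printer model.
        print(f"Possible printer at {addr}: try ipp://{addr}:631/")
```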

California Issues Emergency Election Audit Regulations

The Office of the California Secretary of State has issued a set of proposed emergency regulations for post-election manual tallying of paper election records. In this post, my first at FTT, I’ll try to explain and contextualize this development.

Since her election to office, California Secretary of State (CA SoS) Debra Bowen has methodically studied the shortcomings in California’s election equipment. She first initiated a Top-To-Bottom review (TTBR) of California’s voting systems that found them to be of poor technical quality, plagued by a myriad of security vulnerabilities, accessibility flaws, reliability issues, and inadequate documentation and testing (a number of FTT regulars participated in the TTBR). For this year’s presidential primary in California, Bowen worked to mitigate these problems by decertifying this equipment and then recertifying it subject to a list of about 40 different conditions. One such condition is that the usual 1% manual tally under California law — counties must randomly choose and hand tally ballots cast in 1% of precincts — would be modified to include escalation that would mandate increased tallying for close races (where even small amounts of possible fraud and/or error could make a difference in the outcome of a contest).

Bowen issued these additional requirements (the “PEMT Requirements”) under her authority as CA SoS to regulate election technologies (here are the original PEMT Requirements). Unfortunately, the Registrar in San Diego County sued Bowen, arguing 1) that she didn’t have such broad authority and 2) that, even if she did, she could only issue the PEMT Requirements through the California regulatory procedure (specified by the CA Administrative Procedure Act). A state Superior Court found in favor of the CA SoS, but a Court of Appeal found that the PEMT Requirements did indeed betray characteristics of regulations and should therefore have gone through the regulatory procedure (for the legal eagles out there, see: County of San Diego v. Debra Bowen (2008) 166 Cal.App.4th 501).

By the time the Court of Appeal had made its decision on August 29, there was no time to follow the normal regulatory process, which takes about four months. Instead, the CA SoS had to follow the process for adopting an emergency regulation which applies when a regulation “is necessary for the immediate preservation of the public peace, health and safety, or general welfare.”

What is so special about these emergency manual tally provisions? First, they represent the increasing relevance and importance of adversarial considerations in the design of an election audit process. As we describe in the NYU Brennan Center / UC Berkeley Samuelson Clinic report on post-election audits (“Post-Election Audits: Restoring Trust In Elections”), fixed-percentage audits of election records are really only useful for detecting wide-ranging anomalies in vote counts. Methods that “tune” the number of records audited depending on the margin in contests on the ballot do a much better job of ensuring that they’ll find evidence of possible error or fraud. Per the emergency PEMT Regulations, any contest with a margin (difference between the winning and losing choice in a contest) of 0.5% or lower is subject to a 10% manual tally, an order of magnitude more scrutiny than the statutory default.
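To illustrate the escalation rule (and only to illustrate it- the actual regulations contain considerably more detail), here is a small Python sketch of the “tune the audit to the margin” idea described above: a default 1% tally that jumps to 10% when the margin is 0.5% or lower.

```python
def manual_tally_fraction(winner_votes, runner_up_votes, total_votes):
    """Illustrative escalation rule only: tally 1% of precincts by default,
    but 10% when the contest margin is 0.5% or lower."""
    margin = (winner_votes - runner_up_votes) / total_votes
    return 0.10 if margin <= 0.005 else 0.01

# Example: a contest decided by 4,000 votes out of 1,000,000 cast has a
# 0.4% margin, so it gets the 10% tally rather than the statutory 1%.
print(manual_tally_fraction(502_000, 498_000, 1_000_000))  # prints 0.1
```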

Second, the CA SoS’ emergency PEMT Regulations reflect many best practices from audit theory and research: precincts to audit must be chosen randomly; the precincts to audit are only chosen after the semi-official vote tallies are arrived at; tally activities must be announced publicly and available for public observation; tallies must be conducted under “blind count” rules where the talliers do not know the totals in the precincts they’re tallying; differences between machine and hand counts must be explained or investigated.
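And here is a similarly rough sketch of the selection requirement: precincts are drawn at random, only after the semi-official tallies are in, and in a way observers can reproduce. The publicly announced seed string is my own illustrative shortcut- in practice these selections are often done with dice or other physical randomness in full public view- not something the regulations prescribe.

```python
import random

def select_precincts_for_tally(precinct_ids, fraction, public_seed):
    """Choose which precincts to hand tally, using a seed announced in public
    so that observers can reproduce the draw. Run only after the semi-official
    vote tallies have been produced."""
    count = max(1, round(len(precinct_ids) * fraction))
    rng = random.Random(public_seed)
    return sorted(rng.sample(precinct_ids, count))

# Example: pick 1% of 5,000 precincts (roughly LA County's scale).
precincts = [f"precinct-{i:04d}" for i in range(5000)]
print(select_precincts_for_tally(precincts, 0.01, public_seed="2008-11-04-canvass"))
```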

The elephant in the room is always Los Angeles County; LA is so amazingly enormous for an election jurisdiction that some things simply aren’t possible. (For example, they frequently pick up ballot materials from precincts in helicopters; that is, traffic in LA is so bad and there are so many polling places (~5,000 or so) that the most reliable form of ballot transmission is via helicopter.) These rules are going to be exceedingly difficult for LA to comply with. I expect they will hire an army of tally managers and talliers to perform their tally, and that it will be a race against the clock, counting 24 hours a day, seven days a week, to try to get it all done in the 28-calendar-day canvass period.