April 26, 2024

Copyright, Technology, and Access to the Law

James Grimmelmann has an interesting new essay, “Copyright, Technology, and Access to the Law,” on the challenges of ensuring that the public has effective knowledge of the laws. This might sound like an easy problem, but Grimmelmann combines history and explanation to show why it can be difficult. The law – which includes both legislators’ statutes and judges’ decisions – is large, complex, and ever-changing.

Suppose I gave you a big stack of paper containing all of the laws ever passed by Congress (and signed by the President). This wouldn’t be very useful, if what you wanted was to know whether some action you were contemplating would violate the law. How would you find the laws bearing on that action? And if you did find such a law, how would you determine whether it had been repealed or amended later, or how courts had interpreted it?

Making the law accessible in practice, and not just in theory, requires a lot of work. You need reliable summaries, topic-based indices, reverse-citation indices (to help you find later documents that might affect the meaning of earlier ones), and so on. In the old days of paper media, all of this had to be printed and distributed in large books, and updated editions had to be published regularly. How to make this happen was an interesting public policy problem.

The traditional answer has been copyright. Generally, the laws themselves (statutes and court opinions) are not copyrightable, but extra-value content such as summaries and indices can be copyrighted. The usual theory of copyright applies: give the creators of extra-value content some exclusive rights, and the profit motive will ensure that good content is created.

This has some similarity to our Princeton model for government transparency, which urges government to publish information in simple open formats, and leave it to private parties to organize and present the information to the public. Here government was creating the basic information (statutes and court opinions) and private parties were adding value. It wasn’t exactly our model, as government was not taking care to publish information in the form that best facilitated private re-use, but it was at least evidence for our assertion that, given data, private parties will step in and add value.

All of this changed with the advent of computers and the Internet, which made many of the previously difficult steps cheaper and easier. For example, it’s much easier to keep a website up to date than to deliver updates to the owners of paper books. Computers can easily construct citation indices, and a search engine provides much of the value of a printed index. Access to the laws can be cheaper and easier now.

What does this mean for public policy? First, we can expect more competition to deliver legal information to the public, thanks to the reduced barriers to entry. Second, as competition drives down prices we’ll see fewer entities that are solely in the business of providing access to laws; instead we’ll see more non-profits, along with businesses providing free access. More competition and lower prices will mean better and more effective access to the law for citizens. Third, copyright will still play a role by supporting the steps that remain costly, such as the writing of summaries.

Finally, it will matter more than ever exactly how government provides access to the raw information. If, as sometimes happens now, government provides the raw information in an awkward or difficult-to-use form, private actors must invest in converting it into a more usable form. These investments might not have mattered much in the past when the rest of the process was already expensive; but in the Internet age they can make a big difference. Given access to the right information in the right format, one person can produce a useful mashup or visualization tool with a few weeks of spare-time work. Government, by getting the details of data publication right, can enable a flood of private innovation, not to mention a better public debate.

Government Data and the Invisible Hand

David Robinson, Harlan Yu, Bill Zeller, and I have a new paper about how to use infotech to make government more transparent. We make specific suggestions, some of them counter-intuitive, about how to make this happen. The final version of our paper will appear in the Fall issue of the Yale Journal of Law and Technology. The best way to summarize it is to quote the introduction:

If the next Presidential administration really wants to embrace the potential of Internet-enabled government transparency, it should follow a counter-intuitive but ultimately compelling strategy: reduce the federal role in presenting important government information to citizens. Today, government bodies consider their own websites to be a higher priority than technical infrastructures that open up their data for others to use. We argue that this understanding is a mistake. It would be preferable for government to understand providing reusable data, rather than providing websites, as the core of its online publishing responsibility.

In the current Presidential cycle, all three candidates have indicated that they think the federal government could make better use of the Internet. Barack Obama’s platform explicitly endorses “making government data available online in universally accessible formats.” Hillary Clinton, meanwhile, remarked that she wants to see much more government information online. John McCain, although expressing excitement about the Internet, has allowed that he would like to delegate the issue, possible to a vice-president.

But the situation to which these candidates are responding – the wide gap between the exciting uses of Internet technology by private parties, on the one hand, and the government’s lagging technical infrastructure on the other – is not new. The federal government has shown itself consistently unable to keep pace with the fast-evolving power of the Internet.

In order for public data to benefit from the same innovation and dynamism that characterize private parties’ use of the Internet, the federal government must reimagine its role as an information provider. Rather than struggling, as it currently does, to design sites that meet each end-user need, it should focus on creating a simple, reliable and publicly accessible infrastructure that “exposes” the underlying data. Private actors, either nonprofit or commercial, are better suited to deliver government information to citizens and can constantly create and reshape the tools individuals use to find and leverage public data. The best way to ensure that the government allows private parties to compete on equal terms in the provision of government data is to require that federal websites themselves use the same open systems for accessing the underlying data as they make available to the public at large.

Our approach follows the engineering principle of separating data from interaction, which is commonly used in constructing websites. Government must provide data, but we argue that websites that provide interactive access for the public can best be built by private parties. This approach is especially important given recent advances in interaction, which go far beyond merely offering data for viewing, to offer services such as advanced search, automated content analysis, cross-indexing with other data sources, and data visualization tools. These tools are promising but it is far from obvious how best to combine them to maximize the public value of government data. Given this uncertainty, the best policy is not to hope government will choose the one best way, but to rely on private parties with their vibrant marketplace of engineering ideas to discover what works.

To read more, see our preprint on SSRN.

Online Symposium: Voluntary Collective Licensing of Music

Today we’re kicking off an online symposium on voluntary collective licensing of music, over at the Center for InfoTech Policy site.

The symposium is motivated by recent movement in the music industry toward the possibility of licensing large music catalogs to consumers for a fixed monthly fee. For example, Warner Music, one of the major record companies, just hired Jim Griffin to explore such a system, in which Internet Service Providers would pay a per-user fee to record companies in exchange for allowing the ISPs’ customers to access music freely online. The industry had previously opposed collective licenses, making them politically non-viable, but the policy logjam may be about to break, making this a perfect time to discuss the pros and cons of various policy options.

It’s an issue that evokes strong feelings – just look at the comments on David’s recent post.

We have a strong group of panelists:

  • Matt Earp is a graduate student in the i-school at UC Berkeley, studying the design and implementation of voluntary collective licensing systems.
  • Ari Feldman is a Ph.D. candidate in computer science at Princeton, studying computer security and information policy.
  • Ed Felten is a Professor of Computer Science and Public Affairs at Princeton.
  • Jon Healey is an editorial writer at the Los Angeles Times and writes the paper’s Bit Player blog, which focuses on how technology is changing the entertainment industry’s business models.
  • Samantha Murphy is an independent singer/songwriter and Founder of SMtvMusic.com.
  • David Robinson is Associate Director of the Center for InfoTech Policy at Princeton.
  • Fred von Lohmann is a Senior Staff Attorney at the Electronic Frontier Foundation, specializing in intellectual property matters.
  • Harlan Yu is a Ph.D. candidate in computer science at Princeton, working at the intersection of computer science and public policy.

Check it out!

The Security Mindset and "Harmless Failures"

Bruce Schneier has an interesting new essay about how security people see the world. Here’s a sample:

Uncle Milton Industries has been selling ant farms to children since 1956. Some years ago, I remember opening one up with a friend. There were no actual ants included in the box. Instead, there was a card that you filled in with your address, and the company would mail you some ants. My friend expressed surprise that you could get ants sent to you in the mail.

I replied: “What’s really interesting is that these people will send a tube of live ants to anyone you tell them to.”

Security requires a particular mindset. Security professionals – at least the good ones – see the world differently. They can’t walk into a store without noticing how they might shoplift. They can’t use a computer without wondering about the security vulnerabilities. They can’t vote without trying to figure out how to vote twice. They just can’t help it.

This kind of thinking is not natural for most people. It’s not natural for engineers. Good engineering involves thinking about how things can be made to work; the security mindset involves thinking about how things can be made to fail. It involves thinking like an attacker, an adversary or a criminal. You don’t have to exploit the vulnerabilities you find, but if you don’t see the world that way, you’ll never notice most security problems.

I’ve often speculated about how much of this is innate, and how much is teachable. In general, I think it’s a particular way of looking at the world, and that it’s far easier to teach someone domain expertise – cryptography or software security or safecracking or document forgery – than it is to teach someone a security mindset.

The ant farm story illustrates another aspect of the security mindset. Your first reaction to the might have been, “So what? What’s so harmful about sending a package of ordinary ants to an unsuspecting person?” Even Bruce Schneier, who has the security mindset in spades, doesn’t point to any terrible consequence of misdirecting the tube of ants. (You might worry about the ants’ welfare, but in that case ant farms are already problematic.) If you have the security mindset, you’ll probably find the possibility of ant misdirection to be irritating; you’ll feel that something should have been done about it; and you’ll probably file it away in your mental attic, in case it becomes relevant later.

This interest in “harmless failures” – cases where an adversary can cause an anomalous but not directly harmful outcome – is another hallmark of the security mindset. Not all “harmless failures” lead to big trouble, but it’s surprising how often a clever adversary can pile up a stack of seemingly harmless failures into a dangerous tower of trouble. Harmless failures are bad hygiene. We try to stamp them out when we can.

To see why, consider the donotreply.com email story that hit the press recently. When companies send out commercial email (e.g., an airline notifying a passenger of a flight delay) and they don’t want the recipient to reply to the email, they often put in a bogus From address like . A clever guy registered the domain donotreply.com, thereby receiving all email addressed to donotreply.com. This included “bounce” replies to misaddressed emails, some of which contained copies of the original email, with information such as bank account statements, site information about military bases in Iraq, and so on. Misdirected ants might not be too dangerous, but misdirected email can cause no end of trouble.

The people who put donotreply.com email addresses into their outgoing email must have known that they didn’t control the donotreply.com domain, so they must have thought of any reply messages directed there as harmless failures. Having gotten that far, there are two ways to avoid trouble. The first way is to think carefully about the traffic that might go to donotreply.com, and realize that some of it is actually dangerous. The second way is to think, “This looks like a harmless failure, but we should avoid it anyway. No good can come of this.” The first way protects you if you’re clever; the second way always protects you.

Which illustrates yet another part of the security mindset: Don’t rely too much on your own cleverness, because somebody out there is surely more clever and more motivated than you are.

Privacy and the Commitment Problem

One of the challenges in understanding privacy is how to square what people say about privacy with what they actually do. People say they care deeply about privacy and resent unexpected commercial use of information about them; but they happily give that same information to companies likely to use and sell it. If people value their privacy so highly, why do they sell it for next to nothing?

To put it another way, people say they want more privacy than the market is producing. Why is this? One explanation is that actions speak louder than words, people don’t really want privacy very much (despite what they say), and the market is producing an efficient level of privacy. But there’s another possibility: perhaps a market failure is causing underproduction of privacy.

Why might this be? A recent Slate essay by Reihan Salam gives a clue. Salam talks about the quandry faced by companies like the financial-management site Wesabe. A new company building up its business wants to reassure customers that their information will be treated with the utmost case. But later, when the company is big, it will want to monetize the same customer information. Salam argues that these forces are in tension and few if any companies will be able to stick with their early promises to not be evil.

What customers want, of course, is not good intentions but a solid commitment from a company that it will stay privacy-friendly as it grows. The problem is that there’s no good way for a company to make such a commitment. In principle, a company could make an ironclad legal commitment, written into a contract with customers. But in practice customers will have a hard time deciphering such a contract and figuring out how much it actually protects them. Is the contract enforceable? Are there loopholes? The average customer won’t have a clue. He’ll do what he usually does with a long website contract: glance briefly at it, then shrug and click “Accept”.

An alternative to contracts is signaling. A company will say, repeatedly, that its intentions are pure. It will appoint the right people to its advisory board and send its executives to say the right things at the right conferences. It will take conspicuous, almost extravagant steps to be privacy-friendly. This is all fine as far as it goes, but these signals are a poor substitute for a real commitment. They aren’t too difficult to fake. And even if the signals are backed by the best of intentions, everything could change in an instant if the company is acquired – a new management team might not share the original team’s commitment to privacy. Indeed, if management’s passion for privacy is holding down revenue, such an acquisition will be especially likely.

There’s an obvious market failure here. If we postulate that at least some customers want to use web services that come with strong privacy commitments (and are willing to pay the appropriate premium for them), it’s hard to see how the market can provide what they want. Companies can signal a commitment to privacy, but those signals will be unreliable so customers won’t be willing to pay much for them – which will leave the companies with little incentive to actually protect privacy. The market will underproduce privacy.

How big a problem is this? It depends on how many customers would be willing to pay a premium for privacy – a premium big enough to replace the revenue from monetizing customer information. How many customers would be willing to pay this much? I don’t know. But I do know that people might care a lot about privacy, even if they’re not paying for privacy today.