October 12, 2024

Chilling and Warming Effects

For several years, the Chilling Effects Clearinghouse has been cataloging the effects of legal threats on online expression and helping people to understand their rights. Amid all the chilling we continue to see, it’s welcome to see rays of sunshine when bloggers stand up to threats, helping to stop the cycle of threat-and-takedown.

The BoingBoing team did this the other day when they got a legal threat from Ralph Lauren’s lawyers over an advertisement they mocked on the BoingBoing blog for featuring a stick-thin model. The lawyers claimed copyright infringement, saying “PRL owns all right, title, and interest in the original images that appear in the Advertisements.” Other hosts pull content “expeditiously” when they receive these notices (as Google did when notified of the post on Photoshop Disasters), and most bloggers and posters don’t counter-notify, even though Chilling Effects offers a handy counter-notification form.

Not BoingBoing. They posted the letter (and the image again) along with copious mockery, including an offer to feed the obviously starved models, and other sources picked up on the fun. The image has now been seen by many more people than would have discovered it in BoingBoing’s archives, in a pattern the press has nicknamed the “Streisand Effect.”

We use the term “chilling effects” to describe indirect legal restraints, or self-censorship, because most cease-and-desist letters don’t go through the courts. The lawyers (and non-lawyers) sending them rely on the in terrorem effects of threatened legal action, and often succeed in silencing speech for the cost of an e-postage stamp.

Actions like BoingBoing’s use the court of public opinion to counter this squelching. They fight legalese with public outrage (in support of legal analysis), and at the same time, help other readers to understand they have similar rights. Further, they increase the “cost” of sending cease-and-desists, as they make potential claimants consider the publicity risks of being made to look foolish, bullying, or worse.

For those curious about the underlying legalities here, the Copyright Act makes clear that fair use, including for the purposes of commentary, criticism, and news reporting, is not an infringement of copyright. See Chilling Effects’ fair use FAQ. Yet the DMCA notice-and-takedown procedure encourages ISPs to respond to complaints with takedown, not investigation and legal balancing. Providers like BoingBoing’s Priority Colo should also get credit for their willingness to back their users’ responses.

As a result of the attention, Ralph Lauren apologized for the image: “After further investigation, we have learned that we are responsible for the poor imaging and retouching that resulted in a very distorted image of a woman’s body. We have addressed the problem and going forward will take every precaution to ensure that the caliber of our artwork represents our brand appropriately.”

May the warming (and proper attention to the health of fashion models) continue!

[cross-posted at Chilling Effects]

Privacy as a Social Problem, Not a Technology Problem

Bob Blakley had an interesting post Monday, arguing that technologists tend to frame the privacy issue poorly. (I would add that many non-technologists use the same framing.) Here’s a sample:

That’s how privacy works; it’s not about secrecy, and it’s not about control: it’s about sociability. Privacy is a social good which we give to one another, not a social order in which we control one another.

Technologists hate this; social phenomena aren’t deterministic and programmers can’t write code to make them come out right. When technologists are faced with a social problem, they often respond by redefining the problem as a technical problem they think they can solve.

The privacy framing that’s going on in the technology industry today is this:

Social Frame: Privacy is a social problem; the solution is to ensure that people use sensitive personal information only in ways that are beneficial to the subject of the information.

BUT as technologists we can’t … control peoples’ behavior, so we can’t solve this problem. So instead let’s work on a problem that sounds similar:

Technology Frame: Privacy is a technology problem; since we can’t make people use sensitive personal information sociably, the solution is to ensure that people never see others’ sensitive personal information.

We technologists have tried to solve the privacy problem in this technology frame for about a decade now, and, not surprisingly (information wants to be free!) we have failed.

The technology frame isn’t the problem. Privacy is the problem. Society can and routinely does solve the privacy problem in the social frame, by getting the vast majority of people to behave sociably.

This is an excellent point, and one that technologists and policymakers would be wise to consider. Privacy depends, ultimately, on people and institutions showing a reasonable regard for the privacy interests of others.

Bob goes on to argue that technologies should be designed to help these social mechanisms work.

A sociable space is one in which people’s social and antisocial actions are exposed to scrutiny so that normal human social processes can work.

A space in which tagging a photograph publicizes not only the identities of the people in the photograph but also the identities of the person who took the photograph and the person who tagged the photograph is more sociable than a space in which the only identity revealed is that of the person in the photograph – because when the picture of Jimmy holding a martini washes up on the HR department’s desk, Jimmy will know that Johnny took it (at a private party) and Julie tagged him – and the conversations humans have developed over tens of thousands of years to handle these situations will take place.

Again, this is an excellent and underappreciated point. But we need to be careful how far we take it. If we go beyond Bob’s argument, and we say that good design of the kind he advocates can completely solve the online privacy problem, then we have gone too far.

Technology doesn’t just move old privacy problems online. It also creates new problems and exacerbates old ones. In the old days, Johnny and Julie might have taken a photo of Jimmy drinking at the office party, and snail-mailed the photo to HR. That would have been a pretty hostile act. Now, the same harm can arise from a small misunderstanding: Johnny and Julie might assume that HR is more tolerant, or that HR doesn’t watch Facebook; or they might not realize that a site allows HR to search for photos of Jimmy. A photo might be taken by Johnny and tagged by Julie, even though Johnny and Julie don’t know each other. All in all, the photo scenario is more likely to happen today than in the pre-Net age.

This is just one example of what James Grimmelmann calls Accidental Privacy Spills. Grimmelmann tells the story of a private email message that was forwarded and re-forwarded to thousands of people, not by malice but because many people made the seemingly harmless decision to forward it to a few friends. This would never have happened with a personal letter. (Personal letters are sometimes publicized against the wishes of the author, but that’s very rare and wouldn’t have happened in the case Grimmelmann describes.) As the cost of capturing, transmitting, storing, and searching photos and other digital information falls to near-zero, it’s only natural that more capturing, transmitting, storing, and searching of information will occur.

Good design is not the whole solution to our privacy problem. But design has the huge advantage that we can get started on it right away, without needing to reach some sweeping societal agreement about what the rules should be. If you’re designing a product, or deciding which product to use, you can support good privacy design today.

Introducing FedThread: Opening the Federal Register

Today we are rolling out FedThread, a new way of interacting with the Federal Register. It’s the latest civic technology project from our team at Princeton’s Center for Information Technology Policy.

The Federal Register is “[t]he official daily publication for rules, proposed rules, and notices of Federal agencies and organizations, as well as executive orders and other presidential documents.” It’s published by the U.S. government, five days a week. The Federal Register tells citizens what their government is doing, in a lot more detail than the news media do.

FedThread makes the Federal Register more open and accessible. FedThread gives users:

  • collaborative annotation: Users can attach a note to any paragraph of the Federal Register; a conversation thread hangs off of every paragraph.
  • advanced search: Users can search the Federal Register (going back to 2000) on full text, by date, agency, and other fields.
  • customized feeds: Any search can be turned into an RSS feed. The resulting feed will include any new items that match the search query. Feeds can be delivered by email as well.
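
As a rough illustration of the "any search can be turned into an RSS feed" idea, here is a minimal Python sketch. It is not FedThread's code: the `Entry` record, the `search` and `to_rss` functions, the agency names, and the example.org URLs are all hypothetical, and a real feed would carry more fields than this.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical record for one Federal Register item."""
    title: str
    agency: str
    published: str  # ISO date string, e.g. "2009-10-05"
    url: str


def search(entries, text="", agency=None, since=None):
    """Return entries whose title contains `text`, filtered by agency and date."""
    needle = text.lower()
    return [
        e for e in entries
        if needle in e.title.lower()
        and (agency is None or e.agency == agency)
        and (since is None or e.published >= since)  # ISO dates sort as strings
    ]


def to_rss(feed_title, feed_url, matches):
    """Serialize search results as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_url
    ET.SubElement(channel, "description").text = "Saved search results"
    for e in matches:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = e.title
        ET.SubElement(item, "link").text = e.url
        ET.SubElement(item, "description").text = f"{e.agency}, {e.published}"
    return ET.tostring(rss, encoding="unicode")


# Made-up sample data, for illustration only.
entries = [
    Entry("Proposed rule on widget safety", "Example Agency A", "2009-10-05",
          "https://example.org/doc/1"),
    Entry("Notice of public meeting", "Example Agency B", "2009-10-06",
          "https://example.org/doc/2"),
]
feed = to_rss("widget", "https://example.org/search?q=widget",
              search(entries, text="widget"))
```

Because the feed is generated from the query rather than stored, re-running the same search later picks up any new matching items, which is the behavior the feature list describes.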

I think FedThread is a nice tool, but what’s most amazing to me is that the whole project took only ten days to create. Ten days ago we had no code, no HTML, no plan, not even a block diagram on a whiteboard. Today we launched a pretty good service.

How was this possible? Three things enabled it.

First, government provided the necessary data, for bulk download, in a format (XML) that’s easy for software to handle. This let us acquire and manipulate the underlying data (Federal Register contents) quickly. Folks at the Government Printing Office, National Archives and Records Administration, and Office of Science and Technology Policy all helped to make this possible. The roll-out of the government’s XML-based Federal Register site today is a significant step forward.

Second, we had great tools, such as Linux, Apache, MySQL, Python, Django, jQuery, Datejs, and lxml. These tools are capable, flexible, and free, and they fit together in useful ways. More than once we faced a challenging engineering problem, only to find an existing tool that did almost exactly what we needed. When we needed a tool for managing inline discussion threads within a document, Adrian Holovaty, Jacob Kaplan-Moss and Jack Slocum graciously let us use their code from djangobook.com, which served as the basis for our system. Tools like these help small teams build big projects quickly.

Third, we have an amazing team. A project like this needs people who are super-smart, tireless, have great engineering judgment, and know how to work as a team. Joe Calandrino, Ari Feldman, Harlan Yu, and Bill Zeller all did fantastic work building the site. We set an insane schedule (at the start we guessed we had a 50% chance of having anything at all ready by today), and they raced ahead of the schedule, to the point that we expanded the project’s scope more than once. Great job, guys! Now please get some sleep.

We hope FedThread is a useful tool that brings more people into contact with the operations of their government — one small step in a larger trend of using technology to make government more transparent.

Antisocial networking

I just got my invitation to Google Wave. The prototype that’s now public doesn’t have all of the amazing features in the original video demos. At this point, it’s pretty much just a way of collecting IM-style conversations all in one place. But several of my friends are already there, and I’ve had a few conversations there already.

How am I supposed to know that there’s something new going on at Wave? Right now, I need to keep a tab open in my browser and check in, every once in a while, to see what’s up. My standard set of tabs already includes my Gmail, calendar, RSS reader, New York Times homepage, Facebook page, and now Google Wave. Add in the occasional Twitter tab (or dedicated Twitter client, if I feel like running it), plus the occasional IM window. All of these things are competing for my attention when I’m supposed to be getting real work done.

A common way that people try to solve this problem is by building bridges between these services. If you use Twitter and Facebook, there are several ways to arrange for your tweets to show up at Facebook (bewildering Facebook users with all the #hashtags and @references) and there are also a handful of ways for getting data out of Facebook. I’d been using FriendFeed as a central hub for all this, but it would sometimes stop working for days at a time. Now that they’ve been bought out by Facebook, maybe this will shake itself out.

The bigger problem is that these various vendors and technologies have different data models for visibility and for how metadata is represented. In Twitter, everything is default-public, follow-up comments are first-class objects in the system, and there’s effectively no metadata outside of the message, so Twitter users have adopted a variety of seemingly obscure conventions (e.g., “RT” to indicate a retweet of some other tweet). Contrast this with Facebook, where comments are a very different sort of message from the parent messages, where they have all sorts of security rules (that nobody really understands) about who can see what, and where there is actually structure to a message. If I link to a YouTube video, it gets magically embedded, versus the annoying URL shorteners that people have to use to shoehorn messages into Twitter.

Comments are a favorite area for people to complain. Twitter comments are often implicit with the @username tags. If I’m following a friend and a friend-of-my-friend comments on one of their tweets, I won’t necessarily see it. In Facebook, I have a better shot at seeing those comments. But what if I write a blog post here at Freedom to Tinker, which Facebook nicely picks up and makes look just like a note I posted on my Facebook page? Now we’ll have comments on Freedom to Tinker and more comments inside Facebook which won’t intermingle. Of course, thanks to FriendFeed, a tweet will (probably) be automatically generated when I post this, causing some small amount of Twitter commenting traffic, and there may be comments within FriendFeed itself as well as Google Reader commentary (which is also different from Google Reader’s “share with note” commentary).

Given these disparate data models, there’s no easy way to unify Twitter and Facebook, much less the commenting diaspora, even assuming you could sort out the security concerns and you could work around Facebook’s tendency to want to restrict the flow of data out of its system. This is all the more frustrating because RSS completely solved the initial problem of distributing new blog posts in the blog universe. I used to keep a bunch of tabs open to various blog-like things that I followed, but that quickly proved unwieldy, whereas an RSS aggregator (Google Reader, for me) solved the problem nicely. Could there ever be a social network/microblogging aggregator?

There is no lack of standards-in-the-wings that would like to do this. (See, for example, OpenMicroBlogging, or our own work on BirdFeeder.) Something like Google Wave could subsume every one of these platforms, although I fear that integrating so many different data models would inevitably result in a deeply clunky UI.

In the end, I think the federation ideas behind Google Wave and BirdFeeder, and good old RSS blog feeds, will ultimately win out, with interoperability between the big vendors, just like they interoperate with email. Getting there, however, isn’t going to happen easily.

Breaking Vanish: A Story of Security Research in Action

Today, seven colleagues and I released a new paper, “Defeating Vanish with Low-Cost Sybil Attacks Against Large DHTs”. The paper’s authors are Scott Wolchok (Michigan), Owen Hofmann (Texas), Nadia Heninger (Princeton), me, Alex Halderman (Michigan), Christopher Rossbach (Texas), Brent Waters (Texas), and Emmett Witchel (Texas).

Our paper is the next chapter in an interesting story about the making, breaking, and possible fixing of security systems.

The story started with a system called Vanish, designed by a team at the University of Washington (Roxana Geambasu, Yoshi Kohno, Amit Levy, and Hank Levy). Vanish tries to provide “vanishing data objects” (VDOs) that can be created at any time but will only be usable within a short time window (typically eight hours) after their creation. This is an unusual kind of security guarantee: the VDO can be read by anybody who sees it in the first eight hours, but after that period expires the VDO is supposed to be unrecoverable.

Vanish uses a clever design to do this. It takes your data and encrypts it, using a fresh random encryption key. It then splits the key into shares, so that a quorum of shares (say, seven out of ten shares) is required to reconstruct the key. It takes the shares and stores them at random locations in a giant worldwide system called the Vuze DHT. The Vuze DHT throws away items after eight hours. After that the shares are gone, so the key cannot be reconstructed, so the VDO cannot be decrypted — at least in theory.
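
To make the key-splitting step concrete, here is a minimal Python sketch of threshold secret sharing. It uses Shamir's scheme over a prime field, which is an assumption chosen for illustration; the function names, the choice of prime, and the 128-bit key size are hypothetical and are not taken from the Vanish code. Encrypting the data and placing shares in the DHT are not modeled.

```python
import secrets

# A Mersenne prime comfortably larger than a 128-bit key (illustrative choice).
PRIME = 2**521 - 1


def split_key(key: int, threshold: int, num_shares: int) -> list[tuple[int, int]]:
    """Split `key` into points on a random polynomial of degree threshold - 1."""
    if not 0 <= key < PRIME:
        raise ValueError("key out of range")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # The constant term is the key; the other coefficients are random.
    coeffs = [key] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, evaluated mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_key(shares: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to recover the key."""
    key = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        key = (key + yi * num * pow(den, -1, PRIME)) % PRIME
    return key


key = secrets.randbits(128)
shares = split_key(key, threshold=7, num_shares=10)
assert recover_key(shares[:7]) == key   # any seven of the ten shares suffice
assert recover_key(shares[3:]) == key
```

With fewer than seven shares the interpolation still returns a number, but it is unrelated to the key. That is the property the design leans on: once enough shares have expired, what remains tells you nothing.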

What is this Vuze DHT? It’s a worldwide peer-to-peer network, containing a million or so computers, that was set up by Vuze, a company that uses the BitTorrent protocol to distribute (licensed) video content. Vuze needs a giant data store for its own purposes, to help peers find the videos they want, and this data store happens to be open so that Vanish can use it. The million-computer extent of the Vuze data store was important, because it gave the Vanish designers a big haystack in which to hide their needles.

Vanish debuted on July 20 with a splashy New York Times article. Reading the article, Alex Halderman and I realized that some of our past thinking about how to extract information from large distributed data structures might be applied to attack Vanish. Alex’s student Scott Wolchok grabbed the project and started doing experiments to see how much information could be extracted from the Vuze DHT. If we could monitor Vuze and continuously record almost all of its contents, then we could build a Wayback Machine for Vuze that would let us decrypt VDOs that were supposedly expired, thereby defeating Vanish’s security guarantees.
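
The "Wayback Machine" idea can be sketched as a toy simulation in Python. Everything here is hypothetical: `ToyStore` stands in for a store that forgets values after eight hours, and `Archive` stands in for an observer that copies what it sees. How an observer actually gets to see the contents of a real DHT is the hard part, and this sketch does not model it.

```python
EXPIRY_SECONDS = 8 * 60 * 60  # the eight-hour lifetime described above


class ToyStore:
    """Stand-in for the DHT: values disappear once they are eight hours old."""

    def __init__(self):
        self._items = {}  # location -> (value, time stored)

    def put(self, location, value, now):
        self._items[location] = (value, now)

    def get(self, location, now):
        entry = self._items.get(location)
        if entry is None or now - entry[1] >= EXPIRY_SECONDS:
            return None
        return entry[0]

    def live_items(self, now):
        """Everything an observer could see at time `now`."""
        return {loc: value for loc, (value, stored) in self._items.items()
                if now - stored < EXPIRY_SECONDS}


class Archive:
    """Hypothetical observer that copies whatever it sees and never forgets."""

    def __init__(self):
        self._seen = {}

    def crawl(self, store, now):
        self._seen.update(store.live_items(now))

    def lookup(self, location):
        return self._seen.get(location)


store, archive = ToyStore(), Archive()
store.put("loc-1", "share-1", now=0)
archive.crawl(store, now=1 * 60 * 60)            # observed while still live

nine_hours = 9 * 60 * 60
assert store.get("loc-1", now=nine_hours) is None    # expired in the store
assert archive.lookup("loc-1") == "share-1"          # but still in the archive
```

If the archive captures at least a quorum of a VDO's key shares before they expire, the key can be rebuilt later, so the expiry in the store no longer protects anything.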

Scott’s experiments progressed rapidly, and by early August we were pretty sure that we were close to demonstrating a break of Vanish. The Vanish authors were due to present their work in a few days, at the Usenix Security conference in Montreal, and we hoped to demonstrate a break by then. The question was whether Scott’s already heroic sleep-deprived experimental odyssey would reach its destination in time.

We didn’t want to ambush the Vanish authors with our break, so we took them aside at the conference and told them about our preliminary results. This led to some interesting discussions with the Vanish team about technical details of Vuze and Vanish, and about some alternative designs for Vuze and Vanish that might better resist attacks. We agreed to keep them up to date on any new results, so they could address the issue in their talk.

As it turned out, we didn’t establish a break before the Vanish team’s conference presentation, so they did not have to modify their presentation much, and Scott finally got to catch up on his sleep. Later, we realized that evidence to establish a break had actually been in our experimental logs before the Vanish talk, but we hadn’t been clever enough to spot it at the time. Science is hard.

Some time later, I ran into my ex-student Brent Waters, who is now on the faculty at the University of Texas. I mentioned to Brent that Scott, Alex and I had been studying attacks on Vanish and we thought we were pretty close to making an attack work. Amazingly, Brent and some Texas colleagues (Owen Hofmann, Christopher Rossbach, and Emmett Witchel) had also been studying Vanish and had independently devised attacks that were pretty similar to what Scott, Alex, and I had.

We decided that it made sense to join up with the Texas team, work together on finishing and testing the attacks, and then write a joint paper. Nadia Heninger at Princeton did some valuable modeling to help us understand our experimental results, so we added her to the team.

Today we are releasing our joint paper. It describes our attacks and demonstrates that the attacks do indeed defeat Vanish. We have a working system that can decrypt Vanishing data objects (made with the original version of Vanish) after they are supposedly unrecoverable.

Our paper also discusses what went wrong in the original Vanish design. The people who designed Vanish are smart and experienced, but they obviously made some kind of mistake in their original work that led them to believe that Vanish was secure — a belief that we now know is incorrect. Our paper talks about where we think the Vanish authors went wrong, and what security practitioners can learn from the Vanish experience so far.

Meanwhile, the Vanish authors went back to the drawing board and came up with a bunch of improvements to Vanish and Vuze that make our attacks much more expensive. They wrote their own paper about their experience with Vanish and their new modifications to it.

Where does this leave us?

For now, Vanish should be considered too risky to rely on. The standard for security is not “no currently demonstrated attacks”, it is “strong evidence that the system resists all reasonable attacks”. By updating Vanish to resist our attacks, the Vanish authors showed that their system is not a dead letter. But in my view they are still some distance from showing that Vanish is secure. Given the complexity of underlying technologies such as Vuze, I wouldn’t be surprised if more attacks turn out to be possible. The latest version of Vanish might turn out to be sound, or to be unsound, or the whole approach might turn out to be flawed. It’s too early to tell.

Vanish is an interesting approach to a real problem. Whether this approach will turn out to work is still an open question. It’s good to explore this question — and I’m glad that the Vanish authors and others are doing so. At this point, Vanish is of real scientific interest, but I wouldn’t rely on it to secure my data.

[Update (Sept. 30, 2009): I rewrote the paragraphs describing our discussions with the Vanish team at the conference. The original version may have given the wrong impression about our intentions.]