December 9, 2022

Abandoning the Envelope Analogy (What Your Mailman Knows Part 2)

Last time, I commented on NPR’s story about a mail carrier named Andrea in Seattle who can tell us something about the economic downturn by revealing private facts about the people she serves on her mail route. By critiquing the decision to run the story, I drew a few lessons about the way people value and weigh privacy. In Part 2 of this series, I want to tie this to NebuAd and Phorm.

It’s probably a sign of the deep level of monomania to which I’ve descended that as I listened to the story, I immediately started drawing connections between Andrea and NebuAd/Phorm. Technology policy almost always boils down to a battle over analogies, and many in the ISP surveillance/deep packet inspection debate embrace the so-called envelope analogy. (See, e.g., the comments of David Reed to Congress about DPI, and see the FCC’s Comcast/BitTorrent order.) Just as mail carriers are prohibited from opening closed envelopes, the typical argument goes, so too should packet carriers be prohibited from looking “inside” the packets they deliver.

As I explain in my article, I’m not a fan of the envelope analogy. The NPR story gives me one more reason to dislike it: envelopes–the physical kind–don’t mark as clear a line of privacy as we may have thought. Although Andrea is restricted by law from peeking inside envelopes, every day her mail route is awash in “metadata” that reveal much more than the mere words scribbled on the envelopes themselves. By analyzing all of this metadata, Andrea has many ways of inferring what is inside the envelopes she delivers, and she feels pretty confident about her guesses.

There are metadata gleaned from the envelopes themselves: certified letters usually mean bad economic news; utility bills turn from white to yellow to red as a person slides toward insolvency. She also engages in traffic analysis–fewer credit card offers might herald the credit crunch. She picks up cues from the surroundings, too: more names on a mailbox might mean that a young man who can no longer make rent has moved in with grandma. Perhaps most importantly, she interacts with the human recipients of these envelopes, telling us about a cafe owner who jokes that he needs new credit card offers in order to pay his bills, and describing the people who watch her approach with “a real desperation in their eyes; when they see me their face falls; what am I going to bring today?”

So let’s stop using the envelope analogy, because it makes a comparison that doesn’t really fit. But I have a deeper objection to the use of the envelope analogy in the DPI/ISP surveillance debate: it states a problem rather than proposing a solution, and it assumes away all of the hard questions. Saying that there is an “inside” and an “outside” to a packet is the same thing as saying that we need to draw a line between permissible and impermissible scrutiny, but it offers no guidance about how or where to draw that line. The promise of the envelope analogy is that it is clear and easy to apply, but the solutions proposed to implement the analogy are rarely so clear.

What Your Mailman Knows (Part 1 of 2)

A few days ago, National Public Radio (NPR) tried to offer some lighter fare to break up the death march of gloomier stories about economic calamity. You can listen to the story online. The story’s reporter, Chana Joffe-Walt, followed a mail carrier named Andrea on her route around the streets of Seattle. The premise of the story is that Andrea can measure economic suffering along her mail route–and therefore in that mythical place, “Main Street”–by keeping tabs on the type of mail she delivers. I have two technology policy thoughts about this story, but because I have a lot to say, I will break this into two posts. In this post, I will share some general thoughts about privacy, and in the next post, I will tie this story to NebuAd and Phorm.

I was troubled by Andrea’s and Joffe-Walt’s cavalier approaches to privacy. In the course of the five-minute story, Andrea reveals a lot of private, personal information about the people on her route. Only once does Joffe-Walt even hint at the creepiness of peering into people’s private lives in this way, and even then only to embrace a form of McNealy’s “you have no privacy, get over it” declaration. In the first line of the story, Joffe-Walt says, “Okay before we can do this, I need to clear up one question: Yes, your mailman reads your postcards; she notices what magazines you get, which catalogs; she knows everything about you.” The last line of the story is simply, “The government is just starting on its $700 billion plan. As it moves forward, Wall Street economists will be watching Wall Street; Fed economists will be watching Wall Street; Andrea will be watching the mail.”

There are many privacy lessons I can draw from this: First, did the Postal Service approve Andrea’s participation in the interview? If it did, did it weigh the privacy impact? If not, why not?

More broadly speaking, I would bet that everyone who produced or authorized this story, from Andrea and Joffe-Walt to the Postal Service and NPR, engaged (if they thought about privacy at all) in a cost-benefit balancing, and they evidently made the same types of mistakes on both sides of that balance that people often make when they think about privacy.

First, what are the costs to privacy from this story? At first blush, they seem to be slight to non-existent because the reporter anonymized the data. Although most of the activity in the story appears to center on one city block in Seattle, we aren’t told which city block. This is a lot like AOL arguing that it had anonymized its search queries by replacing IP addresses with unique identifiers or like Phorm arguing that it protects privacy by forgetting that you visited and remembering instead only that you visited a travel-related website.

The NPR story exposes the flaw in this type of argument. Although a casual listener won’t be able to place the street toured by Andrea, it probably wouldn’t be very hard to pierce this cloak of privacy. In the story, we are told that the street is “three-quarters of a mile [north] of” Main Street. The particular block is “a wide residential block where section 8 housing butts against glassy, snazzy new chic condos that cost half-a-million dollars.” Across the block are a couple businesses including a cafe “across the way.” Does this describe more than a few possible locations in Seattle? [Insert joke about the number of cafes in Seattle here.]

It’s probably even easier for someone who lives in Seattle to pinpoint the location, particularly if it is near where they live or work. These listeners, thanks to NPR, now know that, in the Section 8 building, “a single mom with an affinity for black leather is getting an overdraft notice” and a “minister . . . [is] getting more late payment bills.” The owner of the cafe has been outed as somebody who pays his bills only by applying for new credit cards. If you lived or worked on this particular block, wouldn’t you have at least a hunch about the identities of the people tied to these potentially embarrassing facts?

Laboring under the mistaken belief that anonymization negated any costs to privacy, the creators of the story probably thought the costs were outweighed by the potential benefits. But these benefits seem to pale in comparison to the privacy risks, accurately understood. What does the listener gain by listening to this story? A small bit of anecdotal knowledge about the economic crisis? A reason to fear his mailman? The small thrill of voyeurism? A chance to think about the economic crisis while not seized by fear and dread? I’m not saying that these benefits are valueless, but I don’t think they were justified when held against the costs.

Opting In (or Out) is Hard to Do

Thanks to Ed and his fellow bloggers for welcoming me to the blog. I’m thrilled to have this opportunity, because as a law professor who writes about software as a regulator of behavior (most often through the substantive lenses of information privacy, computer crime, and criminal procedure), I often need to vet my theories and test my technical understanding with computer scientists and other techies, and this will be a great place to do it.

This past summer, I wrote an article (available for download online) about ISP surveillance, arguing that recent moves by NebuAd/Charter, Phorm, AT&T, and Comcast augur a coming wave of unprecedented, invasive deep-packet inspection. I won’t reargue the entire paper here (the thesis is no doubt much less surprising to the average Freedom to Tinker reader than to the average lawyer) but you can read two bloggy summaries I wrote here and here or listen to a summary I gave in a radio interview. (For summaries by others, see [1] [2] [3] [4]).

Two weeks ago, Verizon and AT&T told Congress that they would monitor users for marketing purposes only if those users had opted in. According to Verizon VP Tom Tauke, “[B]efore a company captures certain Internet-usage data for targeted or customized advertising purposes, it should obtain meaningful, affirmative consent from consumers.”

I applaud this announcement, but I’m curious how the ISPs will implement this promise. It seems like there are two architectural puzzles here: how does the user convey consent, and how does the provider distinguish between the packets of consenting and nonconsenting users? For an ISP, neither step is nearly as straightforward as it is for a web provider like Google, which can simply set and check cookies. For the first piece, I suppose a user can click a check box on a web-based form or respond to an e-mail, letting the ISP know he would like to opt in. These solutions seem clumsy, however, and ISPs probably want a system that is as seamless and easy to use as possible, to maximize the number of people opting in.

Once ISPs have a “white list” of users who have opted in, how do they turn this into on-the-fly, selective packet sniffing? Do they map white-listed users to IP addresses and add those addresses to a filter, and if so, is there a risk that things will get out of sync when DHCP leases are renewed? Can they use cookies, perhaps by first redirecting every HTTP session to an ISP-run web server using HTTP 301 status codes? (This seems to be the way Phorm implements opt-out, according to Richard Clayton’s illuminating analysis.) Do any of these solutions scale for an ISP with hundreds of thousands of users?
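To make the synchronization worry concrete, here is a minimal sketch of the IP-mapping approach. Everything in it is hypothetical–the class, method names, and addresses are my own invention, not how any real ISP builds this–but it shows the core hazard: the filter’s view of who holds an IP address is only as fresh as the lease events it hears about.

```python
# Hypothetical sketch of an IP-based opt-in whitelist. All names and
# addresses are illustrative; no real ISP system is being described.

class OptInFilter:
    """Tracks which accounts opted in and which IP each account currently leases."""

    def __init__(self):
        self.opted_in_accounts = set()  # accounts that gave affirmative consent
        self.ip_to_account = {}         # the filter's copy of the DHCP lease table

    def record_lease(self, ip, account):
        # Called whenever DHCP assigns or renews a lease. If this event is
        # missed or delayed, the filter inspects the wrong user's packets.
        self.ip_to_account[ip] = account

    def release_lease(self, ip):
        self.ip_to_account.pop(ip, None)

    def may_inspect(self, src_ip):
        # Fail closed: an IP with no known lease is treated as non-consenting.
        account = self.ip_to_account.get(src_ip)
        return account is not None and account in self.opted_in_accounts


f = OptInFilter()
f.opted_in_accounts.add("acct-42")
f.record_lease("10.0.0.7", "acct-42")
assert f.may_inspect("10.0.0.7")       # consenting user: inspection allowed

# The lease expires and DHCP hands the same address to a different customer.
# If the filter never hears about the reassignment, it keeps sniffing the
# new, non-consenting user. Here it does hear about it, so it stops:
f.record_lease("10.0.0.7", "acct-99")  # acct-99 never opted in
assert not f.may_inspect("10.0.0.7")
```

The fragility is the gap between the two `record_lease` calls: any window in which the DHCP server and the sniffing filter disagree is a window of unconsented surveillance, which is why the cookie-and-redirect approach, for all its clumsiness, at least ties consent to the session rather than to a recycled address.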

And are things any easier if the ISP adopts an opt-out system instead?