October 11, 2024

What are the Constitutional Limits on Online Tracking Regulations?

As the conceptual contours of Do Not Track are being worked out, an interesting question to consider is whether such a regulation—if promulgated—would survive a First Amendment challenge. Could Do Not Track be an unconstitutional restriction on the commercial speech of online tracking entities? The answer would of course depend on what restrictions a potential regulation would specify. However, it may also depend heavily on the outcome of a case currently in front of the Supreme Court—Sorrell v. IMS Health Inc.—that challenges the constitutionality of a Vermont medical privacy law.

The privacy law at issue would restrict pharmacies from selling prescription drug records to data mining companies for marketing purposes without the prescribing doctor’s consent. These drug records each contain extensive details about the doctor-patient relationship, including “the prescriber’s name and address, the name, dosage and quantity of the drug, the date and place the prescription is filled and the patient’s age and gender.” A doctor’s prescription record can be tracked very accurately over time, and while patient names are redacted, each patient is assigned a unique identifier so their prescription histories may also be tracked. Pharmacies have been selling these records to commercial data miners, who in turn aggregate the data and sell compilations to pharmaceutical companies, who then engage in direct marketing back to individual doctors using a practice known as “detailing.” Sound familiar yet? It’s essentially brick-and-mortar behavioral advertising for prescription drugs, and the Vermont law amounts to a Do Not Track choice mechanism for it.

The Second Circuit recently struck down the Vermont law on First Amendment grounds, ruling first that the law is a regulation of commercial speech and second that the law’s restrictions fall on the wrong side of the Central Hudson test—the four-step analysis used to determine the constitutionality of commercial speech restrictions. This ruling clashes explicitly with two previous decisions in the First Circuit, in Ayotte and Mills, which deemed that similar medical privacy laws in Maine and New Hampshire were constitutional. As such, the Supreme Court decided in January to take the case and resolve the disagreement, and the oral argument is set for April 26th.

I’m not a lawyer, but it seems like the outcome of Sorrell could have a wide-ranging impact on current and future information privacy laws, including possible Do Not Track regulations. Indeed, the petitioners recognize the potentially broad implications of their case. From the petition:

“Information technology has created new and unprecedented opportunities for data mining companies to obtain, monitor, transfer, and use personal information. Indeed, one of the defining traits of the so-called “Information Age” is this ability to amass information about individuals. Computers have made the flow of data concerning everything from personal purchasing habits to real estate records easier to collect than ever before.”

One central question in the case is whether a restriction on access to these data for marketing purposes is a restriction on legitimate commercial speech. The Second Circuit believes it is, reasoning that even “dry information” sold for profit—and already in the hands of a private actor—is entitled to First Amendment protection. In contrast, the First Circuit in Ayotte posited that the information being exchanged has “itself become a commodity,” not unlike beef jerky, so such restrictions are only a limitation on commercial conduct—not speech—and therefore do not implicate any First Amendment concerns.

A major factual difference here, as compared to online privacy and tracking, is that pharmacies are required by many state and federal laws to collect and maintain prescription drug records, so there may be more compelling reasons for the state to restrict access to this information.

In the case of online privacy, it could be argued that Internet users are voluntarily supplying information to the tracking servers, even though many users probably don’t intend to do this, nor do they expect that this is occurring. Judge Livingston, in her circuit dissent in Sorrell, notes that different considerations apply where the government is “prohibiting a speaker from conveying information that the speaker already possesses,” distinguishing that from situations where the government restricts access to the information itself. In applying this to online communications, at what point does the server “possess” the user’s data—when the packets are received and are sitting in a buffer or when the packets are re-assembled and the data permanently stored? Is there a constitutional difference between restrictions on collection versus restrictions on use? The Supreme Court in 1965 in Zemel v. Rusk stated that “the right to speak and publish does not carry with it the unrestrained right to gather information.” To what extent does this apply to government restrictions of online tracking?

The constitutionality of state and federal information privacy laws has historically and consistently been called into question, and things would be no different if—and it’s a big if—Congress grants the FTC authority over online tracking. When considering technical standards and what “tracking” means, it’s worth keeping in mind the possible constitutional challenges insofar as state action may be involved, as some desirable options to curb online tracking may only be possible within a voluntary or self-regulatory framework. Where that line is drawn will depend on how the Supreme Court comes down in Sorrell and how broadly it decides the case.

A public service rant: please fix your bibliography

Like many academics, I spend a lot of time reading and reviewing technical papers. I find myself continually surprised at the things that show up in the bibliography, so I thought it might be worth writing this down all in one place so that future conferences and whatnot might just hyperlink to this essay and say “Do That.”

Do not use BibTeX entries that are auto-generated from Citeseer, DBLP, the ACM Digital Library, or any other such thing. It’s stunning how many errors these contain. One glaring example: papers that appeared in the Symposium on Operating Systems Principles (SOSP) often turn out as citations to ACM Operating Systems Review. While that’s not incorrect, it’s also not the proper way to cite the paper. Another common error is that auto-generated citations inevitably have the wrong address, if they have it at all. (Hint: the ACM’s headquarters are in New York, but almost all of their conferences are elsewhere. If you have “New York” anywhere in your bib file, there’s a good chance it should be something else.)
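
To make this concrete, here’s a hypothetical before-and-after (the authors and title are made up for illustration; I believe the venue details for SOSP ’03 are right, but verify them as you would for any entry):

  % What the auto-generated entry tends to look like:
  @article{hypothetical03,
    author  = {A. Author and B. Coauthor},
    title   = {A Hypothetical Storage System},
    journal = {ACM SIGOPS Operating Systems Review},
    volume  = {37},
    number  = {5},
    address = {New York, NY},
    year    = {2003}
  }

  % What I'd rather see:
  @inproceedings{hypothetical03,
    author    = {A. Author and B. Coauthor},
    title     = {A Hypothetical Storage System},
    booktitle = {Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP '03)},
    address   = {Bolton Landing, NY},
    month     = oct,
    year      = {2003}
  }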

Leave out LNCS volume numbers and such for conferences. Many, many conferences have their proceedings appear as LNCS volumes. That’s nice, but it consumes unnecessary space in your bibliography. All I need to know is that we’re looking at CRYPTO ’86. I don’t need to know that it’s also LNCS vol. 263.

For most any paper, leave out the editors. I need to know who wrote the paper, not who was the program chair of the conference or editor of the journal.

For most any conference paper, leave out the publisher or organization. I don’t need to see Springer-Verlag, USENIX Association, or ACM Press. For journal papers, you need to use your discretion. Sometimes the name of the association is part of the journal name, so there’s no real need to repeat it. The only places where I regularly include organization names are technical reports, technical manuals and documentation, and published books.

Be consistent with how you cite any conference. For space reasons, you may wish to contract a conference name, only listing “SOSP ’03” rather than “Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP ’03)”. That’s fine, at least for big conferences like SOSP where everybody should have heard of it, but use the same contraction throughout your bibliography. If you say “SOSP ’03” in one place and “Proceedings of SOSP ’03” somewhere else, that’s really annoying. Top tip: if you’re space constrained, the easiest thing to nuke is the string “Proceedings of”.
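
One low-tech way to keep yourself honest here (a habit I’d suggest, not anything BibTeX requires) is to define the contraction once as an @string macro and reuse it everywhere:

  @string{sosp03 = {SOSP '03}}

  @inproceedings{hypothetical03,
    author    = {A. Author and B. Coauthor},
    title     = {A Hypothetical Storage System},
    booktitle = sosp03,
    year      = {2003}
  }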

Make sure you have the right author list, with the proper initials. When you use BibTeX and you plug in “D.S. Wallach”, what comes out is “D. Wallach” since there’s no whitespace before the “S”. It’s damn near impossible to catch these things in your source file by eye, so you should do a regular-expression search ([A-Z]\.[A-Z]\.) or proofread the resulting bibliography. I’ve sometimes seen citations to papers with co-authors missing entirely. Please double-check this sort of thing, often by visiting the authors’ home pages or conference home pages.
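
In BibTeX terms, the fix is nothing more than whitespace between the initials:

  % Wrong: BibTeX reads "D.S." as a single first name and abbreviates it to "D."
  author = {D.S. Wallach},

  % Right: separate the initials so both survive
  author = {D. S. Wallach},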

Be consistent with spelling out names versus using initials. Most bib styles just use initials rather than whole names. However, if you’re using a style that uses whole names, make sure that you’ve got the whole name for every citation in your bibliography. (Or switch styles.)

Always include a URL for blogs, Wikipedia articles, and newspaper articles (or, at least, newspaper articles since the dawn of the web). Stock BibTeX styles don’t know what URLs are, so the easiest solution is to use the “note” field. Make sure you put the URL in a \url{} command so it’s typeset sensibly (and becomes a clickable hyperlink in the resulting PDF if you’re also loading hyperref). I’m less confident I can advise you to always include a string like “Accessed on 11/08/2010”. But if you do it, do it consistently. Top tip: if you say \usepackage{url} and \urlstyle{sf} in your LaTeX header, you’ll get more compact URLs than the stock typewriter font. See also urlbst.
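
Putting those pieces together, here’s a sketch of what I mean (the entry and URL are placeholders, not a real reference):

  % In the LaTeX preamble: compact sans-serif URLs instead of typewriter
  \usepackage{url}
  \urlstyle{sf}

  % In the .bib file: stock styles have no url field, so hide it in the note
  @misc{hypotheticalblog10,
    author = {A. Author},
    title  = {A Public Service Rant: Please Fix Your Bibliography},
    year   = {2010},
    note   = {\url{http://www.example.com/fix-your-bibliography}, accessed on 11/08/2010}
  }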

Don’t use a citation just to point to a software project. If you need to give credit to a software package you used, just drop a footnote and put the URL there. You only need a citation when you’re citing an actual paper of some sort. However, if there’s a research paper or book that was written by the authors of the software you used, and that paper or book describes the software, then you should cite the paper/book, and possibly include the URL for the software in the citation.

BibTeX sometimes fails when given a long URL in the note field. This manifests itself as a %-character and a newline inserted in the generated bbl file. (Why? I have no idea.) I have a short Perl script that I always work into my Makefile that post-processes the bbl file to fix this. So should you.

Eliminate the string “to appear” from your bibliography. Somebody years from now will look back in time and find these sorts of markers amusing. Worse, you can easily forget you put that in your bib file. It’s odd reading a manuscript in 2011 that cites a paper “to appear” in 2009.

For any conference, include the address, month, and year. For the month, use the three-letter codes in your BibTeX (jan, feb, mar, apr, …) without quotation marks; the BibTeX style will deal with expanding those or using proper contractions. For the address, be consistent about how you write it. Don’t say “Berkeley, CA” in one place and “Berkeley, California” in another. Also, this may be my U.S.-centric bias showing through, but you don’t need to add “U.S.A.” after “Berkeley, CA”. For international addresses, however, you should include the country; the state/region is optional. “Paris, France” is an easy one. I’ll have to defer to my Canadian readers to chime in about whether it’s better to cite “Vancouver, B.C., Canada”, “Vancouver, Canada”, or “Vancouver, B.C.”
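
Inside an @inproceedings entry, that boils down to something like this (again a made-up entry; note the bare month macro and the consistent address style):

  @inproceedings{hypothetical11,
    author    = {A. Author},
    title     = {Another Hypothetical System},
    booktitle = {Proceedings of a Hypothetical Workshop on Privacy},
    address   = {Berkeley, CA},
    month     = apr,
    year      = {2011}
  }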

But not the page numbers. Back in the old days, I once got razzed by a journal editor for not including page numbers in all my citations. (And you think I’m pedantic!) Given how many conferences are ditching printed proceedings altogether, it’s acceptable to leave these out now, including for old references that you’re far more likely to dig up online than in the printed proceedings.

Double-check any author with accents in their name and try to get it right. BibTeX doesn’t seem to play nicely with Unicode characters, at least for me, so you have to use the LaTeX codes instead. I’m sure David Mazières appreciates it when you spell his name right.
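
For example, rather than pasting the Unicode character into the bib file, spell the accent out with the LaTeX code, wrapped in braces so styles that abbreviate or change case leave it intact:

  author = {David Mazi{\`e}res},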

Double-check the capitalization of your paper titles. I tend to use the BibTeX “abbrv” style, which forces lower case for every word in your paper title, excepting the first word. You then have to put curly braces around words that truly need capital letters, like BitTorrent or something. Some hand-written bib entries I’ve seen put curly braces around every word because they really, really want the entry to have lots of capital letters. Don’t do that. Use a different bib style if you want different behavior, but then make sure your resulting bibliography has consistent capitalization for every entry. I don’t particularly care whether you go with lots of capital letters or not, but please be consistent about it. Also, double-check that proper nouns are properly capitalized.
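
So, for a hypothetical title, brace only the words that must keep their capitals and let the style lowercase the rest:

  % abbrv renders this as "A hypothetical study of BitTorrent traffic"
  title = {A Hypothetical Study of {BitTorrent} Traffic},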

When you post your own papers online, post a bib entry next to them. This might encourage people to cite your paper properly. For your personal web page, you might like the Exhibit API, which can turn BibTeX entries into HTML, dynamically. (See Ben Adida’s page for one example.) If you’re setting up something for your whole lab or department, Drupal Scholar seems pretty good. (See my colleague Lydia Kavraki’s lab page for one example. I’m expecting we will adopt this across our entire CS department.)

And, last but not least, a citation is not a noun. When you cite a paper, it’s grammatically the same as making a parenthetical remark. If you need to refer to a paper as a noun, you need to use the author names (“Alice and Bob [23] showed that the halting problem is hard.”) or the name of the system (“The Chrome web browser [47,48] uses separate processes for each tab to improve fault isolation.”). If there are three or more authors, then you just use the first one with “et al.” (“Alice et al. [24] proved P is not equal to NP.” — note also the lack of italics for “et al.”) For the ACM journal style, there’s a \citeN command rather than the usual \cite, which is worth using. You can get similar functionality in any LaTeX paper style with additional packages such as natbib.
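
Here’s a minimal sketch using natbib to get the same effect in a generic LaTeX document (the citation key and refs.bib are placeholders):

  \documentclass{article}
  \usepackage[numbers]{natbib}   % numeric citations, with textual forms available
  \begin{document}
  % Parenthetical: the sentence must read correctly without the bracketed number
  The halting problem is hard \cite{alice23}.
  % Textual: \citet expands to the author names plus the number, e.g. "Alice and Bob [23]"
  \citet{alice23} showed that the halting problem is hard.
  \bibliographystyle{abbrvnat}
  \bibliography{refs}            % assumes refs.bib defines the alice23 entry
  \end{document}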

Obligatory caveat: A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines. – Ralph Waldo Emerson

Things overheard on the WiFi from my Android smartphone

Today in my undergraduate security class, we set up a sniffer so we could run Wireshark and Mallory to listen in on my Android smartphone. This blog piece summarizes what we found.

  • Google properly encrypts traffic to Gmail and Google Voice, but they don’t encrypt traffic to Google Calendar. An eavesdropper can definitely see your calendar transactions and can likely impersonate you to Google Calendar.
  • Twitter does everything in the clear, but then your tweets generally go out for all the world to see, so there isn’t really a privacy concern. Twitter uses OAuth signatures, which appear to make it difficult for a third party to create forged tweets.
  • Facebook does everything in the clear, much like Twitter. My Facebook account’s web settings specify full-time encrypted traffic, but this apparently isn’t honored or supported by Facebook’s Android app. Facebook isn’t doing anything like OAuth signatures, so it may be possible to inject bogus posts as well. Also notable: one of the requests we saw going from my phone to the Facebook server included an SQL statement in it. Could Facebook’s server have an SQL injection vulnerability? Maybe it was just FQL, which is ostensibly safe.
  • The free version of Angry Birds, which uses AdMob, appears to preserve your privacy. The requests going to the AdMob server didn’t have anything beyond the model of my phone. When I clicked an ad, it sent the (x,y) coordinates of my click and got a response saying to send me to a URL in the web browser.
  • Another game I tried, Galcon, had no network activity whatsoever. Good for them.
  • SoundHound and ShopSavvy transmit your fine GPS coordinates whenever you make a request to them. One of the students typed the coordinates into Google Maps, and it nailed me to the proper side of the building I was teaching in.

What options do Android users have, today, to protect themselves against eavesdroppers? Android does support several VPN configurations, which you could set up before you hit the road. That won’t stop the unnecessary transmission of your fine GPS coordinates, which, to my mind, neither SoundHound nor ShopSavvy has any business knowing. If that’s an issue for you, you could turn off your GPS altogether, but you’d have to turn it on again later when you want to use maps or whatever else. Ideally, I’d like the Market installer to give me the opportunity to revoke GPS privileges for apps like these.

Instructor note: live demos where you don’t know the outcome are always a dicey prospect. Set everything up and test it carefully in advance before class starts.

What an expert on seals has to say

During the New Jersey voting machines lawsuit, the State defendants tried first one set of security seals and then another in their vain attempts to show that the ROM chips containing vote-counting software could be protected against fraudulent replacement. After one or two rounds of this, Plaintiffs engaged Dr. Roger Johnston, an expert on physical security and tamper-indicating seals, to testify about New Jersey’s insecure use of seals.

In his day job, Roger is a scientist at the Argonne National Laboratory, working to secure (among other things) our nation’s shipments of nuclear materials. He has many years of experience in the scientific study of security seals and their use protocols, as well as physical security in general. In this trial he testified in his private capacity, pro bono.

He wrote an expert report in which he analyzed the State’s proposed use of seals to secure voting machines (what I am calling “Seal Regime #2” and “Seal Regime #3”). For some of these seals, he and his team of technicians have much slicker techniques to defeat them than I was able to come up with. Roger chooses not to describe the methods in detail, but he has prepared this report for the public.

What I found most instructive about Roger’s report (including in the version he has released publicly) is that he explains that you can’t get security just by looking at the individual seal. Instead, you must consider the entire seal use protocol:


Seal use protocols are the formal and informal procedures for choosing, procuring, transporting, storing, securing, assigning, installing, inspecting, removing, and destroying seals. Other components of a seal use protocol include procedures for securely keeping track of seal serial numbers, and the training provided to seal installers and inspectors. The procedures for how to inspect the object or container onto which seals are applied is another aspect of a seal use protocol. Seals and a tamper-detection program are no better than the seal use protocols that are in place.

He explains that inspecting seals for evidence of tampering is not at all straightforward. Inspection often requires removing the seal—for example, when you pull off an adhesive-tape seal that’s been tampered with, it behaves differently than one that’s undisturbed. A thorough inspection may involve comparing the seal with microphotographs of the same seal taken just after it was originally applied.

For each different seal that’s used, one can develop a training program for the seal inspectors. Because the state proposed to use four different kinds of seals, it would need four different sets of training materials. Training all the workers who would inspect the State’s 10,000 voting machines would be quite expensive. With all those seals, just the seal inspections themselves would cost over $100,000 per election.

His report also discusses “security culture.”


“Security culture” is the official and unofficial, formal and informal behaviors, attitudes, perceptions, strategies, rules, policies, and practices associated with security. There is a consensus among security experts that a healthy security culture is required for effective security….

A healthy security culture is one in which security is integrated into everyday work, management, planning, thinking, rules, policies, and risk management; where security is considered as a key issue at all employee levels (and not just an afterthought); where security is a proactive, rather than reactive activity; where security measures are carefully defined, and frequently reviewed and studied; where security experts are involved in choosing and reviewing security strategies, practices, and products; where the organization constantly seeks proactively to understand vulnerabilities and provide countermeasures; where input on potential security problems are eagerly considered from any quarter; and where wishful thinking and denial is deliberately avoided in regards to threats, risks, adversaries, vulnerabilities, and the insider threat….

Throughout his deposition … Mr. Giles [Director of the NJ Division of Elections] indicates that he believes good physical security requires a kind of band-aid approach, where serious security vulnerabilities can be covered over with ad hoc fixes or the equivalent of software patches. Nothing could be further from the truth.

Roger Johnston’s testimony about the importance of seal use protocols—as considered separately from the individual seals themselves—made a strong impression on the judge: in the remedy that the Court ordered, seal use protocols as defined by Dr. Johnston played a prominent role.

The trick to defeating tamper-indicating seals

In this post I’ll tell you the trick to defeating physical tamper-evident seals.

When I signed on as an expert witness in the New Jersey voting-machines lawsuit, voting machines in New Jersey used hardly any security seals. The primary issues were in my main areas of expertise: computer science and computer security.

Even so, when the state stuck a bunch of security seals on its voting machines in October 2008, I found that I could easily defeat them. I sent in a supplemental expert report to the Court, explaining how.

Soon after I sent in my report about how to defeat all the State’s new seals, in January 2009 the State told the Court that it was abandoning all those seals, and that it had new seals for the voting machines. As before, I obtained samples of these new seals, and went down to my basement to work on them.

In a day or two, I figured out how to defeat all those new seals.

  • The vinyl tamper-indicating tape can be defeated using packing tape, a razor blade, and (optionally) a heat gun.
  • The blue padlock seal can be defeated with a portable drill and a couple of jigs that I made from scrap metal.
  • The security screw cap can be defeated with a $5 cold chisel and a $10 long-nose pliers, each custom ground on my bench grinder.

For details and pictures, see “Seal Regime #3” in this paper.

The main trick is simply this: know that physical seals are, in general, easy to defeat. Once you know that, it’s just a matter of thinking about how to do it, and having a pile of samples on which to experiment. In fact, the techniques I describe in my paper are not the only way to defeat these seals, or the best way—not even close. These techniques are what an amateur could come up with. But these seal defeats were good enough to work just fine when I demonstrated them in the courtroom during my testimony, and they would almost certainly not be detected by the kinds of seal-inspection protocols that most states (including New Jersey) use for election equipment.

(In addition, the commenters on my previous post describe a very simple denial-of-service attack on elections: brazenly cut or peel all the seals in sight. Then what will the election officials do? In principle they should throw out the ballots or data covered by those seals. But then what? “Do-overs” of elections are rare and messy. I suspect the most common action in this case is not even to notice anything wrong; and the second most common is to notice it but say nothing. Nobody wants to rock the boat.)