April 20, 2014

Some Technical Clarifications About Do Not Track

When I last wrote here about Do Not Track in August, there were just a few rumblings about the possibility of a Do Not Track mechanism for online privacy. Fast forward four months, and Do Not Track has shot to the top of the privacy agenda among regulators in Washington. The FTC staff privacy report released in December endorsed the idea, and Congress was quick to hold a hearing on the issue earlier this month. Now, odds are quite good that some kind of Do Not Track legislation will be introduced early in this new congressional session.

While there isn’t yet a concrete proposal for Do Not Track on the table, much has already been written both in support of and against the idea in general, and it’s terrific to see the issue debated so widely. As I’ve been following along, I’ve noticed some technical confusion on a few points related to Do Not Track, and I’d like to address three of them here.

1. Do Not Track will most likely be based on an HTTP header.

Some people are still suggesting that Do Not Track will be some form of government-operated list or registry, perhaps of consumer names, device identifiers, tracking domains, or something else. This type of solution was suggested in an earlier conception of Do Not Track, and given its rhetorical likeness to the Do Not Call Registry, it’s a natural connection to make. But as I discussed in my earlier post (the details of which I won’t rehash here), a list mechanism is a relatively clumsy solution to this problem for a number of reasons.

A more elegant solution—and the one that many technologists seem to have coalesced around—is the use of a special HTTP header that simply tells the server whether the user is opting out of tracking for that Web request, i.e. the header can be set to either “on” or “off” for each request. If the header is “on,” the server would be responsible for honoring the user’s choice to not be tracked. Users would be able to control this choice through the preferences panel of the browser or the mobile platform.
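As a rough client-side illustration, the Python sketch below attaches the flag to an outgoing request. The header name `DNT`, the values `1` and `0` for “on” and “off,” and the helper `build_request` are assumptions made for this example; no concrete proposal has fixed them yet.

```python
from urllib.request import Request

# Assumed spelling of the header and its two values; not taken from any final proposal.
DNT_HEADER = "DNT"
DNT_ON = "1"
DNT_OFF = "0"


def build_request(url: str, do_not_track: bool) -> Request:
    """Build a request that carries the user's tracking preference."""
    req = Request(url)
    req.add_header(DNT_HEADER, DNT_ON if do_not_track else DNT_OFF)
    return req


if __name__ == "__main__":
    # The preference would normally come from the browser's settings panel.
    req = build_request("http://www.example.com/", do_not_track=True)
    # urllib normalizes the stored header name to "Dnt"; HTTP header names
    # are case-insensitive, so this does not change the meaning on the wire.
    print(req.header_items())
```

Nothing is sent over the network here; the sketch only shows that the preference rides along with each individual request.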

2. Do Not Track won’t require us to “re-engineer the Internet.”

It’s also been suggested that implementing Do Not Track in this way will require a substantial amount of additional work, possibly even rising to the level of “re-engineering the Internet.” This is decidedly false. The HTTP standard is an extensible one, and it “allows an open-ended set of… headers” to be defined for it. Indeed, custom HTTP headers are used in many Web applications today.
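To make the extensibility point concrete, here is a small Python sketch using the standard library's header parser. The header name `DNT` and the two reader functions are assumptions for illustration. The point is only that a handler written before the header existed keeps working unchanged when the extra header shows up.

```python
import io
from http.client import parse_headers

# A raw header block as a server might receive it. "DNT" is an assumed name.
RAW_HEADERS = (
    b"Host: www.example.com\r\n"
    b"User-Agent: ExampleBrowser/1.0\r\n"
    b"DNT: 1\r\n"
    b"\r\n"
)


def read_host(raw: bytes) -> str:
    """A 'legacy' handler: it only cares about Host and ignores the rest."""
    headers = parse_headers(io.BytesIO(raw))
    return headers["Host"]


def read_dnt(raw: bytes) -> bool:
    """An updated handler: it additionally looks for the opt-out flag."""
    headers = parse_headers(io.BytesIO(raw))
    return headers.get("DNT") == "1"


if __name__ == "__main__":
    print(read_host(RAW_HEADERS))
    print(read_dnt(RAW_HEADERS))
```

The generic parser accepts the unknown header without any changes, which is the sense in which no “re-engineering” is needed.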

How much work will it take to implement Do Not Track using the header? Generally speaking, not too much. On the client side, adding the ability to send the Do Not Track header is a relatively simple undertaking. For instance, it only took about 30 minutes of programming to add this functionality to a popular extension for the Firefox Web browser. Other plug-ins already exist. Implementing this functionality directly in the browser might take a little longer, but much of the work will be in designing a clear and easily understandable user interface for the option.

On the server side, adding code to detect the header is also a reasonably easy task: it takes just a few extra lines of code in most popular Web frameworks. It could take more substantial work to program how the server behaves when the header is “on,” but this work is often already necessary even in the absence of Do Not Track. Under industry self-regulation, compliant ad servers supposedly already handle the case where a user opts out of their behavioral advertising programs; the only difference now is that the opt-out signal comes from a header rather than a cookie. (Of course, the FTC could require stricter standards for what opting out means.)
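For a sense of scale, here is a minimal server-side sketch written as a plain WSGI application in Python. The header name `DNT` (which WSGI exposes as `HTTP_DNT`), the value `1`, and the `uid` cookie are all assumptions for illustration, and what “not tracking” should actually require is a policy question this sketch does not answer.

```python
def wants_no_tracking(environ: dict) -> bool:
    """Detect the opt-out flag. WSGI exposes a request header named DNT as HTTP_DNT."""
    return environ.get("HTTP_DNT") == "1"


def app(environ, start_response):
    """Toy ad server: sets an identifying cookie unless the user opted out."""
    headers = [("Content-Type", "text/plain")]
    if wants_no_tracking(environ):
        body = b"untracked response\n"
    else:
        # Hypothetical tracking cookie, used only to make the two paths differ.
        headers.append(("Set-Cookie", "uid=abc123"))
        body = b"tracked response\n"
    start_response("200 OK", headers)
    return [body]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Serves on localhost:8000 until interrupted.
    make_server("localhost", 8000, app).serve_forever()
```

The decision is made from the current request alone, so the server keeps no per-user record of who has opted out.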

Note also that contrary to some suggestions, the header mechanism doesn’t require consumers to identify who they are or otherwise authenticate to servers in order to gain tracking protection. Since the header is a simple on/off flag sent with every Web request, the server doesn’t need to maintain any persistent state about users or their devices’ opt-out preferences.

3. Microsoft’s new Tracking Protection feature isn’t the same as Do Not Track.

Last month, Microsoft announced that its next release of Internet Explorer will include a privacy feature called Tracking Protection. Mozilla is also reportedly considering a similar browser-based solution (although a later report makes it unclear whether they actually will). Browser vendors should be given credit for doing what they can from within their products to protect user privacy, but their efforts are distinct from the Do Not Track header proposal. Let me explain the major difference.

Browser-based features like Tracking Protection basically amount to blocking Web connections from known tracking domains that are compiled on a list. They don’t protect users from tracking by new domains (at least until they’re noticed and added to the tracking list) nor from “allowed” domains that are tracking users surreptitiously.

In contrast, the Do Not Track header compels servers to cooperate, to proactively refrain from any attempts to track the user. The header could be sent to all third-party domains, regardless of whether the domain is already known or whether it actually engages in tracking. With the header, users wouldn’t need to guess whether a domain should be blocked or not, and they wouldn’t have to risk either allowing tracking accidentally or blocking a useful feature.
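The difference between the two approaches can be sketched in a few lines of Python. The blocklist contents, the domain names, the header name `DNT`, and the two-label notion of “same site” are all simplifying assumptions for this example (real browsers would need a more careful definition of third party).

```python
from urllib.parse import urlsplit

# Hypothetical tracking protection list.
BLOCKLIST = {"tracker.example", "ads.example"}


def site_of(url: str) -> str:
    """Crude 'site' of a URL: the last two labels of its hostname."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def blocklist_decision(resource_url: str) -> str:
    """List-based defense: only domains already on the list are blocked."""
    return "block" if site_of(resource_url) in BLOCKLIST else "allow"


def header_decision(page_url: str, resource_url: str) -> dict:
    """Header-based approach: every third-party request carries the flag."""
    if site_of(page_url) != site_of(resource_url):
        return {"DNT": "1"}
    return {}


if __name__ == "__main__":
    page = "http://www.foo.com/"
    new_tracker = "http://new-tracker.example/pixel.gif"
    # A tracker not yet on the list gets through the blocklist untouched...
    print(blocklist_decision(new_tracker))
    # ...but still receives the opt-out signal with the request.
    print(header_decision(page, new_tracker))
```

Neither function says anything about whether the remote server honors the signal; that part depends on the server's cooperation and on enforcement.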

Tracking Protection and other similar browser-based defenses like Adblock Plus and NoScript are reasonable, but incomplete, interim solutions. They should be viewed as complementary to Do Not Track. For entities under FTC jurisdiction, Do Not Track could put an effective end to the tracking arms race between those entities and browser-based defenses, a race that browsers (and thus consumers) are losing now and will keep losing for the foreseeable future. For those entities outside FTC jurisdiction, blocking unwanted third parties is still a useful though leaky defense that maintains the status quo.

Information security experts like to preach “defense in depth” and it’s certainly vital in this case. Neither solution fully protects the user, so users really need both solutions to be available in order to gain more comprehensive protection. As such, the upcoming features in IE and Firefox should not be seen as a technical substitute for Do Not Track.

——

To reiterate: if the technology that implements Do Not Track ends up being an HTTP header, which I think it should be, it would be both technically feasible and relatively simple. It’s also distinct from recent browser announcements about privacy in that Do Not Track forces server cooperation, while browser-based defenses work alone to fend off tracking.

What other technical issues related to Do Not Track remain murky to readers? Feel free to leave comments here, or if you prefer on Twitter using the #dntrack tag and @harlanyu.

Comments

  1. dmc says:

    “In contrast, the Do Not Track header compels servers to cooperate, to proactively refrain from any attempts to track the user.” Not exactly…there is no technical compelling. We have to rely on the advertising companies to add “just a few extra lines of code in most popular Web frameworks”.

    Wouldn’t there also need to be legal penalties in place for violations, in order for this to work? (To encourage advertising companies to put in those lines of code.) Is this in the works?

    • harlanyu says:

      Of course, Do Not Track needs a regulatory framework with effective enforcement mechanisms. This is the ongoing policy debate in Washington: whether Congress should give the FTC authority to define and enforce DNT regulations, and what these regulations would look like.

  2. Barry says:

    It’s bad enough that you can’t run a web forum these days without having to worry about inadvertently violating COPPA. Adding more legal restrictions such as “do not track” only serves to make it that much more dangerous to establish a web presence without appreciably improving safety/privacy for web users.

    The client-side tracking-blocking technologies you mention are a lot more reliable, because you aren’t trusting the server to honor your preferences and because it transcends US legal boundaries. Regular updates to the blacklist would clearly be required, but there are already blacklist services for blocking spam and malware, so this shouldn’t be a huge problem.

    • harlanyu says:

      I agree that regulators need to be careful not to institute overburdensome regulations, and I think a reasonable framework is possible. For example, Do Not Track would probably only apply to third party entities, i.e. only those that can track you across a large number of websites. There might also be exceptions for small website and service operators.

      I’ll write more soon about why I think blocking is insufficient as a substitute for Do Not Track. I think it’s useful but not enough.

  3. Joshua Tauberer says:

    Hey, Harlan. I’m guessing that “persistence” here isn’t supposed to be the same as a persistent cookie? I’m not expecting do not track to prevent, e.g., websites from remembering your login. What exactly is do not track supposed to prevent?

    • harlanyu says:

      Hey Josh. DNT is different from not accepting cookies. Cookies are just one mechanism by which servers are able to track users online. There are lots of other methods (see: evercookie and browser fingerprinting) and it’s difficult if not impossible for users to consistently fend all of them off. As you mentioned, cookies are also often used to remember site logins, which makes it doubly difficult/confusing for users to manage which domains should and shouldn’t be allowed to set cookies. DNT can in some sense “separate” at the server-side the login and tracking functions of cookies.

      The purpose of DNT is to give users a simple mechanism to choose whether or not they will be tracked by third parties across many different websites. Users right now don’t have a meaningful way to express this choice, and DNT intends to solve that.

      • Anonymous says:

        Isn’t there a fairly simple cookie-blocking rule that would stop most tracking without interfering with logins, and doesn’t require a central blocklist to be maintained?

        And that’s: if the browser fetches a page from http://www.foo.com, say, then for the duration of that page load, only *.foo.com cookies can be set. If the page embeds an image from http://www.adserver.com and http://www.adserver.com tries to set a cookie, the cookie is ignored.

        That should pretty much kill all cross-domain tracking via cookies without hurting logins, which typically are at the same domain you’re visiting.

        The two other tracking methods I am aware of can be dealt with similarly.

        1. You login to a site and that site has embeds on lots of other sites (e.g. Facebook).

        The user should probably be able to opt out of those on a case-by-case basis. This can be handled the reverse of the above: if you visit http://www.foo.com and the browser is about to fetch a page element from a different domain and has a cookie *already set* for that domain that it’s about to send with the request, it sees if the user has decided to allow that domain to track them, deny, or neither yet. If neither, the user is prompted. If deny, the cookie is not deleted but is also not sent with the request. If allow, the cookie is sent with the request.

        2. “Browser fingerprinting”.

        This is a simple matter of spoofing. Browsers can randomize non-essential header contents and header sequence, or generate only the minimal headers necessary for the server to complete the request, or masquerade as a fixed mix of browsers.

        Ultimately, client-side solutions like these must surely be more reliable than telling the server “please, please don’t track me. Please?” and then hoping they have a heart.

        • SecurityGeek says:

          Are you sure that blocking cross-domain cookies in this way won’t break important aspects of the web? Won’t it break mashups? Will things like “Like” buttons continue to work? What about single sign-on and the like?

          Any time you consider these kinds of changes, you have to ask: What is the compatibility cost? If it breaks more than a tiny fraction of web sites, browser manufacturers are typically very reluctant to consider such changes, because it can cause their users to switch to other browsers and thus lose them market share.

          • Anonymous says:

            SecurityGeek wrote: “Are you sure that blocking cross-domain cookies in this way won’t break important aspects of the web? Won’t it break mashups? Will things like ‘Like’ buttons continue to work? What about single sign-on and the like?”

            “Like” buttons would rely on allowing third party cookies to be sent back, rather than set in the first place. I covered that.

            Single sign-on is evil. Big Brother Inside evil.

          • Andrew says:

            The huge majority of the web works fine when you block cross domain cookies.

            However, you can still be easily tracked by image beacons, Flash cookies, and browser fingerprinting.

  4. Jim Brock says:

    It’s hard to argue with the idea that consumers should have the choice of a universal opt-out that works as simply as a browser header. No doubt it’s even better when used in tandem with actual blocking. But here are my reservations about making this approach the centerpiece of a privacy framework for online tracking:

    1. At PrivacyChoice we’ve tested the opt-out processes for hundreds of tracking companies, and have seen up close a wide variation in how well those are maintained and honored by data companies, even among top-tier companies. Based on that experience, I’m very skeptical about back-end compliance based only on a header. Unfortunately, tracking companies will continue to underinvest in their privacy systems, and the do-not-track header may provide consumers with a false sense of control relative to true blocking.

    2. Given the momentum behind IE9’s approach (the most popular browser), and the ready availability of tracking protection lists based on our database, Adblock or many others, browser-based control is actually possible right now, without the need for a legislative mandate. The IE9 approach is truly different from what has come before because it can be implemented from a webpage, without a browser add-on, and in the context of the notice-and-choice experience already being embraced by the industry.

    3. Unlike tracking protection lists, a universal header does not allow for distinctions among tracking companies or purposes. This matters because, for example, retargeting by known brands may be desirable to you while general tracking by faceless ad networks is not. In that sense, a universal header actually provides less choice for consumers, and does not allow them to see the value-exchange underlying their choices. By enabling these dimensions in the consumer’s choice, there’s a better chance that marketers and websites will actually present the options in context (let us track, get free content); whereas they would be far less enthusiastic about presenting a generic and permanent do-not-track header approach, which for them, has no saving grace.

    • rp says:

      It would be really, really easy, once you’ve gotten that far, to extend a generic header with a binary choice into a more-specific header with lots of choices. The better shouldn’t be the enemy of the good.

  5. 327 says:

    I’m a bit concerned with the USA centric view this all has. Changing the browser’s behaviour to comply with USA law and regulations without considering the international perspective seems odd. As a plugin, sure, go for it. But as a central part of the browser itself? What if China passed a law stating that all web history should be sent to their government – would you consider making that an option in firefox?

    And in terms of effectiveness – policing the trackers and enforcing the regulations will be difficult enough among USA based companies, and clearly not even required for international ones. What’s to stop USA based advertisers simply moving operations offshore?

    • TechGeek says:

      I too am wondering the same thing. Do Not Track is only as effective as the enforcement of its requirements. But I don’t see how US agencies can compel international companies to abide by these restrictions. And if the US can’t compel non-US companies to follow the Do Not Track rules, doesn’t this just mean that all the tracking operations will move offshore?

      What happens when a US company embeds an iframe or an inline image in their page, which points to some other company, and the other company happens to be incorporated overseas and happens to track users in violation of the Do Not Track mandate? How will the FTC handle that?

      What if offshore companies track users, and then sell information about those users to US companies? Will the FTC forbid that? Will the FTC forbid US companies to embed ads from offshore companies that aren’t subject to US law? How is all this going to work, anyway?

      As a techy, the technical side seems easy: just adding an HTTP header is trivial. But I don’t understand how the enforcement/regulation side is going to work. Is this all a pipe dream? Is there even a hope of a solution that will be effective, given the international nature of the Internet?

      • rp says:

        But from a regulatory point of view it would be pretty simple: if you’re a US company, you’re responsible for the tracking behavior of the people you contract with. There are pretty clear rules about transborder data flow for other situations, and nothing (except a pile of lobbying) preventing them for this one. And of course, for anybody in a jurisdiction where there are actual privacy laws, the do-not-track header is about as clear an indicator of the purposes for which data may be used as you could ask for.

        The question is more about what teeth a regulation would actually have, e.g. whether there would be a private right of action.

  6. Jeff S. says:

    I’m worried about old software. Sending a specific HTTP header to indicate do-not-track status requires a new browser version.

    When we’re all running IE10 and using our shiny new iPhone 6, maybe this is less of an issue, but for those of us who can’t or won’t upgrade our software, we don’t have the capability to send this header, and therefore can’t indicate our do-not-track preference.

    Software lasts way longer than we expect. IE6 refuses to die. I will never upgrade my iPod touch beyond iOS 3.1.3. What about all those non-upgradable Android phones? Attrition is not the answer.

    Instead, the “do-not-track” header should be more properly called “tracking-ok” — the default state is to not track unless explicitly authorized by the user.