April 24, 2024

Bad Phorm on Privacy

Phorm, an online advertising company, has recently made deals with several British ISPs to gain unprecedented access to every single Web action taken by their customers. The deals will let Phorm track search terms, URLs and other keywords to create online behavior profiles of individual customers, which will then be used to provide better targeted ads. The company claims that “No private or personal information, or anything that can identify you, is ever stored – and that means your privacy is never at risk.” Although Phorm might have honest intentions, their privacy claims are, at best, misleading to customers.

Their privacy promise is that personally-identifiable information is never stored, but they make no promises about how the raw logs of search terms and URLs are used before they are deleted. It’s clear from Phorm’s online literature that they use this sensitive data for ad delivery purposes. In one example, they claim advertisers will be able to target ads directly at users whose traffic contains the keywords “Paris vacation”, whether as a search query or within the text of a visited webpage. Without even getting to the storage question, users will likely perceive Phorm’s access to and use of their behavioral data as a compromise of their personal privacy.

What Phorm does store permanently are two pieces of information about each user: (1) the “advertising categories” that the user is interested in and (2) a randomly-generated ID from the user’s browser cookie. Each raw online action is sorted into one or more categories, such as “travel” or “luxury cars”, that are defined by advertisers. The privacy worry is that as these categories become more specific, the behavioral profile of each user becomes ever more precise. Phorm seems to impose no limit on the specificity of these defined categories, so for all intents and purposes, these categories over time will become nearly identical to the search terms themselves. Indeed, they market their “finely tuned” service as analogous to the typical keyword search campaigns that advertisers are already used to. Phorm has a strong incentive to store arbitrarily specific interest categories about each user to provide optimally targeted ads, and thus boost the profits of their advertising business.
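To see why specificity matters, here is a minimal sketch (purely hypothetical, not Phorm’s actual system) of keyword-based category matching. The category names and keyword lists are invented for illustration; the point is that once an advertiser-defined category contains a full phrase as its keyword, the stored “category” is effectively the search query itself.

```python
# Hypothetical sketch of advertiser-defined interest categories.
# Broad categories map many different queries to one coarse label.
broad = {
    "travel": {"vacation", "flight", "hotel"},
    "luxury cars": {"porsche", "bmw", "maserati"},
}

# Highly specific categories: each keyword is now an entire query,
# so the stored profile mirrors the raw search term.
specific = {
    "paris vacation": {"paris vacation"},
}

def categorize(search_query, categories):
    """Return every category whose keywords appear in the query text."""
    q = search_query.lower()
    return sorted(name for name, keywords in categories.items()
                  if any(kw in q for kw in keywords))

print(categorize("cheap paris vacation deals", broad))     # coarse label
print(categorize("cheap paris vacation deals", specific))  # ~ the query itself
```

With the broad scheme the profile records only “travel”; with the specific scheme it records “paris vacation”, which is essentially the user’s query. Nothing in the mechanism itself prevents that drift.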

The second protection mechanism is a randomly-generated ID number stored in a browser cookie that Phorm uses to “anonymously” track a user as she browses the web. This ID number is stored with the list of interest categories collected for that user. Phorm should be given credit for recognizing this as more privacy-protecting than simply using the customer’s name or IP address as an identifier (something even Google has disappointingly failed to recognize). But past experience suggests these protections are unlikely to be enough. The storage of random user IDs mapped to keywords mirroring actual search queries is highly reminiscent of the AOL data fiasco from 2006, when AOL released “anonymized” search histories containing some 20 million search queries. It turned out to be easy to identify specific individuals by name based solely on their search histories.
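The scheme described above can be sketched in a few lines. This is a hypothetical illustration, not Phorm’s implementation: a random token replaces the user’s name or IP address, but the interest list it maps to can still single the user out, just as the “anonymized” AOL histories did.

```python
import secrets

def new_user_id():
    # Random 128-bit token stored in the browser cookie;
    # it contains no name, account number, or IP address.
    return secrets.token_hex(16)

# Server-side store: random cookie ID -> accumulated interest categories.
profiles = {}

uid = new_user_id()
# Invented example entries. As categories grow specific, a handful of them
# (a small town, a local employer, a niche hobby) can narrow the "anonymous"
# profile down to one person, with no name ever being stored.
profiles[uid] = [
    "travel",
    "real estate in a particular small town",
    "job listings at a particular local employer",
]
```

The anonymity here rests entirely on the *contents* of the profile staying generic; the identifier scheme itself provides no such guarantee.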

At the very least, the company’s employees will be able to access an AOL-like dataset about the ISP’s customers. Granted, determining whether a particular dataset is personally identifiable is a notoriously difficult problem and a subject of ongoing research. But it’s inaccurate for Phorm to claim that personally-identifiable information is not being stored and to promise users that their privacy is not at risk.