Last week, I wrote about the privacy concerns surrounding Phorm, an online advertising company that has teamed up with British ISPs to track user Web behavior from within their networks. New technical details about its Webwise system have since emerged, and it’s not just privacy that now seems to be at risk. The report exposes a system that actively degrades user experience and alters the interaction with content providers. Even more importantly, the Webwise system is a clear violation of the sacred end-to-end principle that guides the core architectural design of the Internet.
Phorm’s system does more than just passively gain “access to customers’ browsing records” as previously suggested. Instead, the company plans to install a network switch at each participating ISP that actively interferes with the user’s browsing session by injecting multiple URL redirections before the user can retrieve the requested content. Sparing you most of the nitty-gritty technical details, the switch intercepts the initial HTTP request to the content server to check whether a Webwise cookie, containing the user’s randomly assigned identifier (UID), exists in the browser. It then impersonates the requested server to trick the browser into accepting a spoofed cookie (which I will explain later) that contains the same UID. Only then will the switch forward the request and return the actual content to the user. Basically, this amounts to a big technical hack by Phorm to set the cookies that track users as they browse the Web.
In all, a user’s initial request is redirected three times for each domain that is contacted. Though this may not seem like much, this extra layer of indirection harms the user by degrading the overall browsing experience. It imposes an unnecessary delay that will likely be noticeable by users.
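To get a rough feel for the cost, here is a back-of-the-envelope sketch. It assumes each redirect costs one extra network round trip and that the three redirects happen once per domain contacted; the round-trip time and domain count are made-up illustrative numbers, not measurements of Webwise.

```python
def extra_delay_ms(domains_contacted: int, rtt_ms: float,
                   redirects_per_domain: int = 3) -> float:
    """Estimate added latency, assuming one extra round trip per redirect."""
    return domains_contacted * redirects_per_domain * rtt_ms


# Hypothetical page that pulls content from 5 domains over a 40 ms link.
delay = extra_delay_ms(domains_contacted=5, rtt_ms=40.0)
print(f"Estimated extra delay: {delay} ms")
```

Under these assumptions the redirects add 600 ms in total, though a browser that fetches from several domains in parallel would see less than the full sum.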
The spoofed cookie that Phorm stores on the user’s browser during this process is also a highly questionable practice. Generally speaking, a cookie is specific to a particular domain and the browser typically ensures that a cookie can only be read and written by the domain it belongs to. For example, data in a yahoo.com cookie is only sent when you contact a yahoo.com server, and only a yahoo.com server can put data into that cookie.
But since Phorm controls the switch at the ISP, it can bypass this usual guarantee by impersonating the server to add cookies for other domains. To continue the example, the switch (1) intercepts the user’s request, (2) pretends to be a yahoo.com server, and (3) injects a new yahoo.com cookie that contains the Phorm UID. The browser, believing the cookie to actually be from yahoo.com, happily accepts and stores it. This cookie is used later by Phorm to identify the user whenever the user visits any page on yahoo.com.
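The three steps above can be sketched with a toy model. This is only an illustration of why the browser cannot tell the difference over plain HTTP, not Phorm's actual implementation; the class, the function, and the cookie name `webwise_uid` are all hypothetical.

```python
from collections import defaultdict


class ToyCookieJar:
    """Toy browser cookie store keyed by domain (no paths, expiry, or subdomains)."""

    def __init__(self):
        self._cookies = defaultdict(dict)

    def store_from_response(self, contacted_domain, name, value):
        # Over plain HTTP the browser cannot verify who really answered,
        # so it files the cookie under the domain it believes it contacted.
        self._cookies[contacted_domain][name] = value

    def cookies_for_request(self, domain):
        # Only cookies filed under the requested domain are sent with a request.
        return dict(self._cookies.get(domain, {}))


def intercepting_switch(jar, requested_domain, uid):
    """Hypothetical in-network box that answers in place of the real server."""
    jar.store_from_response(requested_domain, "webwise_uid", uid)


jar = ToyCookieJar()
jar.store_from_response("yahoo.com", "session", "abc")  # genuine yahoo.com cookie
intercepting_switch(jar, "yahoo.com", "UID-1234")       # spoofed yahoo.com cookie

print(jar.cookies_for_request("yahoo.com"))    # both cookies look like yahoo.com's
print(jar.cookies_for_request("example.org"))  # other domains see neither
```

In this model the spoofed cookie sits right alongside the genuine one, and nothing in the jar records that it came from somewhere else.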
Cookie spoofing is problematic because it can change the interaction between the user and the content-providing site. Suppose a site’s privacy policy promises the user that it does not use tracking cookies. But because of Phorm’s spoofing, the browser will store a cookie that (to the user) looks exactly like a tracking cookie from the site. Now, the switch typically strips out this tracking cookie before it reaches the site, but if the user moves to a non-Phorm ISP (say at work), the cookie will actually reach the site in violation of its stated privacy policy. The cookie can also cause other problems, such as a cookie collision if the site cookie inadvertently has the same name as the Phorm cookie.
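Both failure modes can be shown in a few lines. This is a simplified sketch that assumes the switch removes the tracking cookie purely by name; the cookie name `webwise_uid` and the sample values are hypothetical.

```python
def strip_tracking_cookie(cookies: dict, tracking_name: str) -> dict:
    """Drop the tracking cookie before the request is forwarded to the site."""
    return {k: v for k, v in cookies.items() if k != tracking_name}


browser_cookies = {"lang": "en", "webwise_uid": "UID-1234"}

# On a Phorm ISP the switch sits in the path and strips the cookie.
seen_by_site_via_phorm_isp = strip_tracking_cookie(browser_cookies, "webwise_uid")

# On any other ISP there is no switch, so the site receives everything.
seen_by_site_via_other_isp = dict(browser_cookies)

# Collision: if the site happens to use the same cookie name itself,
# name-based stripping would remove the site's own cookie too.
site_own_cookies = {"webwise_uid": "site-own-value"}
after_collision = strip_tracking_cookie(site_own_cookies, "webwise_uid")

print(seen_by_site_via_phorm_isp)
print(seen_by_site_via_other_isp)
print(after_collision)
```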
Disruptive activities inside the network often create these sorts of unexpected problems for both users and websites, which is why computer scientists are skeptical of ideas that violate the end-to-end principle. For the uninitiated, the principle, in short, states that system functionality should almost always be implemented at the end hosts of the network, with a few justifiable exceptions. For instance, almost all security functionality (such as data encryption and decryption) is done by end users and only rarely by machines inside the network.
The Webwise system has no business being inside the network and has no role in transporting packets from one end of the network to the other. The technical Internet community has been worried for years about the slow erosion of the end-to-end principle, particularly by ISPs who are looking to further monetize their networks. This principle is the one upon which the Internet is built and one which the ISPs must uphold. Phorm’s system, nearly in production, is a cogent realization of this erosion, and ISPs should keep Phorm outside the gate.
IANAL, and I’m certainly not up-to-speed on UK copyright law, but…
Post-Berne the aggregation and storage of my browsing pattern should automatically be a copyright work *owned by me*. Neither opt-in nor opt-out nor ISP licenses should be able to change this fact (in the US, copyright transfer rules are more stringent than real-estate transfer rules, as we’ve seen with SCO vs. Novell, and this certainly is not a work-for-hire!).
So isn’t Phorm’s action (creating and distributing derivatives of a recording of my actions) felony copyright infringement for profit? And isn’t Phorm’s entire business tied up in that?
I’m not responsible for this website but I just stumbled on it and thought it deserved a link here: http://www.badphorm.co.uk/
Eddie, “In other matters, if this business model does not immediately crash, it opens a door to even more intrusive data gathering. While Phorm may be on the up-and-up today, what happens when they go public/are taken over/are bought out and the new management would like to store more of the data that they have access to today sitting in the middle of the connection?”
Phorm are the re-spun 121Media: “it distributed a program called PeopleOnPage, which was classified as spyware by F-Secure.”
http://en.wikipedia.org/wiki/Phorm#Company_history
They are also responsible for the PeopleOnPage rootkit.
Alexander is working on a paper outlining the old BT trials and the legal arguments. It should be finished soon, but you can read his incomplete PDF now:
http://www.paladine.org.uk/phorm_paper.pdf
and follow and contribute to the talk regarding Phorm in the Cable Forum thread:
http://www.cableforum.co.uk/board/12/33628733-virgin-media-phorm-webwise-adverts-updated-page-201.html#post34526512
While I’m here, don’t forget NebuAd. They also have offices in the UK and are just waiting on the outcome of Phorm’s plans to see where the legal points lead.
FYI, NebuAd are also a so called ex-adware company re-badged and re-branded aka Claria_Corporation, formerly Gator Corporation
and many more.
http://en.wikipedia.org/wiki/Claria_Corporation
Just to refresh the old memory regarding Gator and make note of NebuAd’s pedigree:
http://www.stopscum.com/archives/gator_claria_vista_marketing_services_and_behaviorlinkcom_some_new_names_but_the_same_old_spyware_parent.html
In a way this is a good forward step for the crypto fans. Once the trend of mangling HTTP connections gets going, web site providers will get annoyed by the fact that their users don’t get to see the page in its original form. The more heavy-handed the mangling, the more annoyed both the users and the content providers will become.
What is the logical response from a web site provider when they get annoyed about mangling? Switch the site over to SSL and redirect the users to HTTPS connections. Gradually more and more sites will enable crypto until that becomes the new normal thing.
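As a minimal sketch of that response, a site could compute the HTTPS equivalent of each plain-HTTP URL and answer with a redirect to it. The helper name below is hypothetical, and a real deployment would normally do this in the web server's configuration rather than in application code.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def https_redirect_target(url: str) -> Optional[str]:
    """Return the HTTPS version of a plain-HTTP URL, or None if no redirect is needed."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None
    # Keep host, path, query, and fragment; only the scheme changes.
    # (An explicit port such as :80 in the URL would be carried over unchanged.)
    return urlunsplit(("https", parts.netloc, parts.path, parts.query, parts.fragment))


print(https_redirect_target("http://example.com/page?x=1"))
print(https_redirect_target("https://example.com/"))
```

Note that the very first plain-HTTP request still crosses the network unencrypted, so a box in the middle could interfere with the redirect itself.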
Something technologically similar is happening in (at least) two German GPRS networks. A bit of JavaScript is injected, and images are by default reduced in size and quality. This may be a reasonable thing to do on GPRS, but it also poses the changing-provider problem. The JavaScript is located at the IP address 1.2.3.4 (really, literally!), and when you re-enter a page with a different provider you get connection errors.
I am now using an SSL tunnel to my home proxy to avoid that stuff, but not everyone can do that.
Also, the name ‘phorm’ almost looks like an April Fools’ joke, being so close to the thing that propels all new technology into the marketplace.
@Tel:
> So do you also feel morality bound to randomly buy the products that you see advertised?
A more accurate comparison would be: Do you also feel morally bound to watch television commercials (or listen to radio commercials)? (Or perhaps “not edit the commercials out of TV/radio recorded shows.”)
In other matters, if this business model does not immediately crash, it opens a door to even more intrusive data gathering. While Phorm may be on the up-and-up today, what happens when they go public/are taken over/are bought out and the new management would like to store more of the data that they have access to today sitting in the middle of the connection? For example, *today* they don’t overwrite others’ ads. Who’s to say that they will always refrain from doing so, if offered enough money from someone who wants to displace their competitor’s ads?
@Brian: Good point, session hijacking is definitely possible. Phorm seems to be aware that webmail sites are more sensitive than others. The Clayton paper says that they have a webmail blacklist of “more than 25 sites”, which is a rather weak attempt. It also doesn’t mention what they do for other login-based non-webmail sites.
The most ludicrous thing is that there is already a way to do something similar at the application layer, where it belongs, instead of at the physical network layer. It’s called SOCKS. The ISPs could just set up web proxies that feed data to Phorm, and that their installation software will enable by default. Users could opt out by configuring their browsers not to use their ISP’s proxy, or by manually setting up their net connection instead of using their ISP’s for-dummies installer disk.
Why use a network-level switch? It doesn’t make much sense … unless they want to (eventually, if not immediately) do more, such as monitor or meddle with non-HTTP traffic.
And no doubt all of Phorm’s machines will be utterly and absolutely secure, running no code that the company’s programmer didn’t put there, leaking no information to the rest of the world, and all of the code will function perfectly without holes under high load.
I can already see some pretty likely ways for third parties (including at least the sites targeted and anyone else serving content to the same page) to determine whether a particular user has opted into or out of phorm, and what ads are being served to them (which in turn reveals the kind of browsing-history information phorm claims to keep secret).
@ Derek
So do you also feel morality bound to randomly buy the products that you see advertised? After all, you should have some grasp of the fact that many advertisers depend on the revenue stream created by selling something in order to keep buying advertising space on the sites you want to support.
Ahhh, so it’s amazingly easy to detect, and turns a normal link into a complete dog. Well no problems then, the first ISP to implement this will be publicly flayed by the marketplace and that will be the end of that.
Would that all intrusions into our privacy were so self regulating.
This is crazy! I could comment on so much of this but others have done so well enough, but the one thing I wanted to throw out there at the moment is the possibility of sites deliberately injecting Phorm cookie data to make it look like the user is somehow doing bad things, or to generally mislead the tracking system and cause it to feed them weird advertisements (maybe of a sexual nature or something else). Not sure how easy it is if it’s just a tracking ID, but certainly sites could mess with it any way they wanted and probably not be at all liable for any damage. There’s even the remote possibility of exploiting phorm via users’ cookies if malicious data could be injected. That last one is a bit more far-fetched but still, this whole thing totally breaks the whole idea of domain-based cookies. I can only imagine what fun/nastiness could ensue.
My question is what about “Web 2.0” services like Gmail, Yahoo, Facebook, etc. that encrypt usernames and passwords but keep track of users’ session information via cookies.
These cookie files are not encrypted (assuming default non-SSL connection), which means that anyone who is monitoring the network traffic and intercepts one of those cookie files can log in as the victim.
This is exactly what Phorm is doing. While I’m guessing their business model doesn’t intend to grab the keys to users’ accounts, it does. Un-kewl.
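To make the exposure concrete, here is a small sketch of what any box on the path can read from an unencrypted request. The request text, host, and cookie names are invented for illustration; nothing here is specific to Phorm's equipment.

```python
from http.cookies import SimpleCookie

# A made-up plain-HTTP request as it would appear on the wire.
raw_request = (
    "GET /inbox HTTP/1.1\r\n"
    "Host: mail.example.com\r\n"
    "Cookie: session_id=s3cr3t; theme=dark\r\n"
    "\r\n"
)


def cookies_visible_on_path(request_text: str) -> dict:
    """Parse the Cookie header exactly as an on-path observer could."""
    for line in request_text.split("\r\n"):
        name, _, value = line.partition(":")
        if name.strip().lower() == "cookie":
            jar = SimpleCookie()
            jar.load(value.strip())
            return {key: morsel.value for key, morsel in jar.items()}
    return {}


print(cookies_visible_on_path(raw_request))
```

Whoever holds that session cookie value can present it to the site in their own request, which is why cleartext session cookies are the sensitive part here.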
A Fischer: FIPR says that Phorm could be violating the Regulation of Investigatory Powers Act in the UK, but I don’t know of a similar law in the US.
Derek: They don’t overrun existing ad space with their own ads. They act like a traditional ad server and their ads only appear on participating websites. The choice of which ad to show is informed by the collected behavioral data.
Aside from all the obvious points you’ve brought up here – this strikes me as being similar in intent [at least] to how in Canada [from whence I hail], the CRTC mandates that local broadcasters and cable companies insert local commercials into live TV. The idea is that local/national advertisers get their exposure, while local broadcasters get their own revenue stream from advertising – while, of course, the customers watching TV get their programme along with ads that should be more relevant to them and where they live.
I’m wondering how far off we are before some [clearly not-so] clever policymaker with the CRTC or other countries’ like bodies picks up on this [or other methods similar in concept and/or intent] as a way to divert advertising revenue to locals, possibly with a big pat-on-the-back from local/national government for doing so. Or if, perhaps, it’s already happening elsewhere with my being unaware of it.
I personally tend to block a great deal of the more intrusive forms of web advertising, but choose to leave alone the less in-your-face “content” because I have some grasp of the fact that many site owners rely on the revenue stream created by ads in order to keep their sites running. Someone has to pay for the bandwidth, storage, et al – and this idea [assuming it overruns all available ad space with its own] takes a big ole’ dump all over the entire concept, potentially leaving site owners in the cold. Maybe someone will wake up when a lot of the smaller ad-supported sites start dying off as a result.
This is awful. That’s way, way, way too many moving parts between the browser and the server. It’s one of those “what could possibly go wrong?” situations, with a dozen obvious gotchas and probably a dozen more that will be discovered once the system is deployed.
Egads, way worse than I thought.
So is this any different than a “man in the middle” attack? Is this even legal in the US?