November 25, 2017

Archives for September 2017

I never signed up for this! Privacy implications of email tracking

In this post I discuss a new paper that will appear at PETS 2018, authored by myself, Jeffrey Han, and Arvind Narayanan.

What happens when you open an email and allow it to display embedded images and pixels? You may expect the sender to learn that you’ve read the email, and which device you used to read it. But in a new paper we find that privacy risks of email tracking extend far beyond senders knowing when emails are viewed. Opening an email can trigger requests to tens of third parties, and many of these requests contain your email address. This allows those third parties to track you across the web and connect your online activities to your email address, rather than just to a pseudonymous cookie.

Illustrative example. Consider an email from the deals website LivingSocial (see details of the example email). When the email is opened, the email client will make requests to 24 third parties across 29 third-party domains.[1] A total of 10 third parties receive an MD5 hash of the user’s email address, including major data brokers Datalogix and Acxiom. Nearly all of the third parties (22 of the 24) set or receive cookies with their requests. In a webmail client the cookies are the same browser cookies used to track users on the web, and indeed many major web trackers (including domains belonging to Google, comScore, Adobe, and AOL) are loaded when the email is opened. While this example email has a large number of trackers relative to the average email in our corpus, the majority of emails (70%) embed at least one tracker.
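To make the idea of a "leaked" address concrete, here is a minimal Python sketch that checks whether a request URL carries an email address, either in plain text or as an MD5 hash, in one of its query parameters. It is only an illustration and not the detection method used in the paper; the function names, the example address, and the `tracker.example` URL are all hypothetical.

```python
import hashlib
from urllib.parse import urlsplit, parse_qsl


def email_tokens(email):
    """Return the normalized address and its MD5 hex digest."""
    normalized = email.strip().lower()
    md5_hex = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    return {"plain": normalized, "md5": md5_hex}


def find_email_leaks(url, email):
    """Return (parameter, encoding) pairs whose value contains the address.

    Only query-string parameters are inspected, and only two encodings
    (plain text and MD5) are tried; this is an assumption of the sketch.
    """
    tokens = email_tokens(email)
    leaks = []
    # parse_qsl percent-decodes values, so "%40" is compared as "@".
    for name, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        for encoding, token in tokens.items():
            if token in value.lower():
                leaks.append((name, encoding))
    return leaks


# Hypothetical usage: a pixel URL that embeds the hashed address.
address = "alice@example.com"
hashed = email_tokens(address)["md5"]
pixel_url = f"https://tracker.example/pixel.gif?uid={hashed}&c=1"
print(find_email_leaks(pixel_url, address))
```

A real measurement would need to consider more encodings and more places where an identifier can hide (for example, the URL path), so this should be read as a starting point only.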

How it works. Email tracking is possible because modern graphical email clients allow rendering a subset of HTML. JavaScript is invariably stripped, but embedded images and stylesheets are allowed. These are downloaded and rendered by the email client when the user views the email.[2] Crucially, many email clients (and, in the case of webmail, almost all web browsers) send third-party cookies with these requests. The email address is leaked by being encoded as a parameter into these third-party URLs.

Diagram showing the process of tracking with email address

When the user opens the email, a tracking pixel from “tracker.com” is loaded. The user’s email address is included as a parameter within the pixel’s URL. The email client here is a web browser, so it automatically sends the tracking cookies for “tracker.com” along with the request. This allows the tracker to create a link between the user’s cookie and her email address. Later, when the user browses a news website, the browser sends the same cookie, and thus the new activity can be connected back to the email address. Email addresses are generally unique and persistent identifiers. So email-based tracking can be used for targeting online ads based on offline activity (say, to shoppers who used a loyalty card linked to an email address) and for linking different devices belonging to the same user.
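The linking step described above can be sketched in a few lines of Python. This is a toy, in-memory model of the tracker's side of the exchange, written under simplifying assumptions: the `Tracker` class, the `email` parameter name, the cookie value, and the example URLs are all hypothetical, and real trackers may receive a hashed address rather than a plain one.

```python
from urllib.parse import urlsplit, parse_qsl


class Tracker:
    """Toy model of a third party that sees both pixel and web requests."""

    def __init__(self):
        self.email_by_cookie = {}   # cookie id -> email address
        self.visits_by_cookie = {}  # cookie id -> list of pages seen

    def handle_pixel_request(self, cookie_id, url):
        """Pixel load from an opened email: link the cookie to the address."""
        params = dict(parse_qsl(urlsplit(url).query))
        if "email" in params:
            self.email_by_cookie[cookie_id] = params["email"]

    def handle_web_request(self, cookie_id, page):
        """Later request from a website that embeds the same tracker."""
        self.visits_by_cookie.setdefault(cookie_id, []).append(page)

    def profile(self, cookie_id):
        """Return the email (if known) and the browsing history for a cookie."""
        email = self.email_by_cookie.get(cookie_id)
        visits = self.visits_by_cookie.get(cookie_id, [])
        return email, visits


# Hypothetical usage: the same cookie is sent with both requests.
tracker = Tracker()
tracker.handle_pixel_request(
    "cookie-123", "https://tracker.com/pixel.gif?email=user%40example.com"
)
tracker.handle_web_request("cookie-123", "news.example/story")
print(tracker.profile("cookie-123"))
```

The point of the sketch is that the cookie alone is pseudonymous; it is the single pixel request carrying the address that turns the browsing history into one tied to a persistent identifier.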

[Read more…]

What our students found when they tried to break their bubbles

This is the second part of a two-part series about a class project on online filter bubbles. In this post, we focus on the results. You can read more about our pedagogical approach and how we carried out the project here.

By Janet Xu and Matthew J. Salganik

This past spring, we taught an undergraduate class on social networks at Princeton University which involved a multi-week, student-led collective class project about algorithmic filter bubbles on Facebook. We wanted to expose students to the process of doing real research, and filter bubbles seemed like an attractive topic because they are interesting, important, and tricky to study. The project—which we called Breaking Your Bubble—had three steps: measuring your bubble, breaking your bubble, and studying the effects. In short, all 130 undergraduates in the class measured their Facebook News Feed for four weeks—recording the slant (liberal, neutral, or conservative) of the political posts that they saw. Then, starting in the second week of the project, students implemented procedures they had developed in order to change their News Feeds, with the goal of achieving a “balanced diet” that matched the baseline distribution of what is being shared on Facebook. Students also came up with public opinion questions for a big class survey, which they took at both the beginning and the end of the project. You can read more about what exactly we did, how it worked, and what we’d do differently next time here.

Though our primary goal was to teach students about doing research, we also learned some surprising things about the Facebook News Feed from the aggregated student results.

[Read more…]

Breaking your bubble

This is the first part of a two-part series about a class project on online filter bubbles. In this post, we talk about our pedagogical approach and how we carried out the project. To read more about the results of the project, go to Part Two.

By Janet Xu and Matthew J. Salganik

The 2016 US presidential election dramatically increased public attention to online filter bubbles and their impacts on society. These online filter bubbles—roughly, personalized algorithms that over-expose people to information that is consistent with their prior beliefs—are interesting, important, and tricky to study. These three characteristics made online filter bubbles an ideal topic for our undergraduate social network class. In this post, we will describe a multi-week, student-led project on algorithmic filter bubbles that we ran with 130 students. We’ll describe what we did, how it worked, and what we’d do differently next time. You can read about what we learned from the results — which turned out to be pretty surprising — here.

[Read more…]