October 16, 2018

No boundaries for Facebook data: third-party trackers abuse Facebook Login

by Steven Englehardt [0], Gunes Acar, and Arvind Narayanan

So far in the No boundaries series, we’ve uncovered how web trackers exfiltrate identifying information from web pages, browser password managers, and form inputs.

Today we report yet another type of surreptitious data collection by third-party scripts that we discovered: the exfiltration of personal identifiers from websites through “login with Facebook” and other such social login APIs. Specifically, we found two types of vulnerabilities [1]:

  • seven third parties abuse websites’ access to Facebook user data
  • one third party uses its own Facebook “application” to track users around the web.

 

Vulnerability 1: Third parties piggyback on Facebook access granted to websites

Diagram of third-party script accessing Facebook API

When a user clicks “Login with Facebook”, they will be prompted to allow the website they’re visiting to access some of their Facebook profile information [2]. Even after Facebook’s recent moves to lock down the feature, websites can request the user’s email address and  “public profile” (name, age range, gender, locale, and profile photo) without triggering a manual review by Facebook. Once the user allows access, any third-party Javascript embedded in the page, such as tracker.com in the figure above, can also retrieve the user’s Facebook information as if they were the first party [3].

[Read more…]

No boundaries for credentials: New password leaks to Mixpanel and Session Replay Companies

In this installment of the “No Boundaries” series we show how wholesale collection of user interactions by third-party analytics and session replay scripts cause inadvertent collection of passwords.
By Steve Englehardt, Gunes Acar and Arvind Narayanan

Following the recent report that Mixpanel, a popular analytics provider, had been inadvertently collecting passwords that users typed into websites, we took a deeper look [1]. While Mixpanel characterized it as a “bug, plain and simple” — one that it had fixed — we found that:

  • Mixpanel continues to grab passwords on some sites, even with the patched version of its code.
  • The problem is not limited to Mixpanel; also affected are session replay scripts, which we revealed earlier to be scooping up various other types of sensitive information.
  • There is no foolproof way for these third party scripts to prevent password collection, given their intended functionality. In some cases, password collection happens due to extremely subtle interactions between code from different entities.

Overall, we think that the approach of third-party scripts collecting the entirety of web pages or form inputs, and attempting to filter out sensitive information is incompatible with user security and privacy.

Password leaks are not limited to Mixpanel

In our research we found password leaks to four different third-party analytics providers across a number of websites. The sources are numerous: several variants of a “Show Password” feature added by site owners, an unexpected interaction with an unrelated third-party script, unanticipated changes to page structure by browser extensions, and even a bug in privacy tools of one of the analytics libraries. However, the underlying cause is the same: wholesale collection of user input data, with protection provided by set of blacklist-based heuristics to filter password fields. We argue that this heuristic approach is bound to fail, and provide a list of examples in which it does.

This summary is provided not as an exhaustive list of all possible vulnerabilities, but rather as examples of how things can go wrong. A detailed description of each vulnerability and the vendor response is available in the Appendix.

[Read more…]

Website operators are in the dark about privacy violations by third-party scripts

by Steven Englehardt, Gunes Acar, and Arvind Narayanan.

Recently we revealed that “session replay” scripts on websites record everything you do, like someone looking over your shoulder, and send it to third-party servers. This en-masse data exfiltration inevitably scoops up sensitive, personal information — in real time, as you type it. We released the data behind our findings, including a list of 8,000 sites on which we observed session-replay scripts recording user data.

As one case study of these 8,000 sites, we found health conditions and prescription data being exfiltrated from walgreens.com. These are considered Protected Health Information under HIPAA. The number of affected sites is immense; contacting all of them and quantifying the severity of the privacy problems is beyond our means. We encourage you to check out our data release and hold your favorite websites accountable.

Student data exfiltration on Gradescope

As one example, a pair of researchers at UC San Diego read our study and then noticed that Gradescope, a website they used for grading assignments, embeds FullStory, one of the session replay scripts we analyzed. We investigated, and sure enough, we found that student names and emails, student grades, and instructor comments on students were being sent to FullStory’s servers. This is considered Student Data under FERPA (US educational privacy law). Ironically, Princeton’s own Information Security course was also affected. We notified Gradescope of our findings, and they removed FullStory from their website within a few hours.
[Read more…]