In this installment of the “No Boundaries” series we show how wholesale collection of user interactions by third-party analytics and session replay scripts cause inadvertent collection of passwords.
By Steve Englehardt, Gunes Acar and Arvind Narayanan
Following the recent report that Mixpanel, a popular analytics provider, had been inadvertently collecting passwords that users typed into websites, we took a deeper look [1]. While Mixpanel characterized it as a “bug, plain and simple” — one that it had fixed — we found that:
- Mixpanel continues to grab passwords on some sites, even with the patched version of its code.
- The problem is not limited to Mixpanel; also affected are session replay scripts, which we revealed earlier to be scooping up various other types of sensitive information.
- There is no foolproof way for these third party scripts to prevent password collection, given their intended functionality. In some cases, password collection happens due to extremely subtle interactions between code from different entities.
Overall, we think that the approach of third-party scripts collecting the entirety of web pages or form inputs, and attempting to filter out sensitive information is incompatible with user security and privacy.
Password leaks are not limited to Mixpanel
In our research we found password leaks to four different third-party analytics providers across a number of websites. The sources are numerous: several variants of a “Show Password” feature added by site owners, an unexpected interaction with an unrelated third-party script, unanticipated changes to page structure by browser extensions, and even a bug in privacy tools of one of the analytics libraries. However, the underlying cause is the same: wholesale collection of user input data, with protection provided by set of blacklist-based heuristics to filter password fields. We argue that this heuristic approach is bound to fail, and provide a list of examples in which it does.
This summary is provided not as an exhaustive list of all possible vulnerabilities, but rather as examples of how things can go wrong. A detailed description of each vulnerability and the vendor response is available in the Appendix.