June 29, 2016


A Peek at A/B Testing in the Wild

[Dillon Reisman was previously an undergraduate at Princeton when he worked on a neat study of the surveillance implications of cookies. Now he’s working with the WebTAP project again in a research + engineering role. — Arvind Narayanan]

In 2014, Facebook revealed that it had manipulated users’ news feeds for the sake of a psychology study of users’ emotions. We happen to know about this particular experiment because it was the subject of a publicly released academic paper, but websites do “A/B testing” every day that is completely opaque to the end user. Of course, A/B testing is often innocuous (say, to find a pleasing color scheme), but the point remains that the user rarely has any way of knowing how their browsing experience is being modified, or why.

By testing websites over time and under a variety of conditions, we could hope to discover how users’ browsing experience is manipulated in not-so-obvious ways. But one third-party service actually makes A/B testing and user tracking human-readable, no reverse-engineering or experimentation necessary: the widely used A/B testing provider Optimizely. Jonathan Mayer had told us it would be an interesting target of study.* The JavaScript that Optimizely’s clients embed on their websites exposes, in easily parsable form, how each client segments users and what experiments it runs on them. In other words, if example.com uses Optimizely, the entire logic example.com uses for A/B testing is revealed to every visitor of example.com.

That means that the data collected by our large-scale web crawler OpenWPM contains the details of all the experiments that are being run across the web using Optimizely. In this post I’ll show you some interesting things we found by analyzing this data. We’ve also built a Chrome extension, Pessimizely, that you can download so you too can see a website’s Optimizely experiments. When a website uses Optimizely, the extension will alert you and attempt to highlight any elements on the page that may be subject to an experiment. If you visit nytimes.com, it will also show you alternative news headlines when you hover over a title. I suggest you give it a try!

 

The New York Times website, with headlines that may be subject to an experiment highlighted by Pessimizely.

 

The Optimizely Scripts

Our OpenWPM web crawler collects and stores the JavaScript embedded on every page it visits. This makes it straightforward to query for every page that uses Optimizely and to grab and analyze the code those pages receive from Optimizely. Once the scripts were collected, we investigated them through regular-expression matching and manual analysis.
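
As a sketch of that query step: OpenWPM stores crawl data in SQLite, so pulling out candidate Optimizely scripts is a single query. The table and column names below are simplified stand-ins for illustration, not OpenWPM’s exact schema.

```python
import sqlite3

# A sketch of querying a crawl database for Optimizely scripts.
# "javascript_files" and its columns are illustrative names only.
def find_optimizely_scripts(db_path):
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT page_url, script_url FROM javascript_files "
            "WHERE script_url LIKE '%optimizely%' "
            "OR script_content LIKE '%optimizely%'"
        ).fetchall()
    finally:
        conn.close()
    return rows
```

From there, regex matching and manual inspection run over the returned script bodies.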


  "4495903114": {
      "code": …
      "name": "100000004129417_1452199599 
               [A.] New York to Appoint Civilian to Monitor Police Surveillance -- 
               [B.] Sued Over Spying on Muslims, New York Police Get Oversight",
      "variation_ids": ["4479602534","4479602535"],
      "urls": [{
        "match": "simple",
        "value": "http://www.nytimes.com"
      }],
      "enabled_variation_ids": ["4479602534","4479602535"]
    },

An example of an experiment from nytimes.com that is A/B testing two variations of a headline in a link to an article.

From a crawl of the top 100k sites in January 2016, we found and studied 3,306 different websites that use Optimizely. The Optimizely script for each site contains a data object that defines:

  1. How the website owner wants to divide users into “audiences,” based on any number of parameters like location, cookies, or user-agent.
  2. Experiments that the users might experience, and what audiences should be targeted with what experiments.

The Optimizely script reads from the data object, executes a JavaScript payload, and sets cookies depending on whether the user is in an experimental condition. The site owner populates the data object through Optimizely’s web interface; who on a website’s development team can access that interface, and what they can do there, is a question for the site owner. The developer also helpfully provides names for their user audiences and experiments.
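
To illustrate how readable this data object is, here is a small Python sketch that summarizes experiments shaped like the snippet above. The field names (“name”, “variation_ids”, “urls”) come from that snippet; the summarizing function itself is our own.

```python
import json

# Summarize an Optimizely-style experiments mapping (experiment id -> definition).
# Field names follow the example snippet; the summary format is ours.
def summarize_experiments(experiments):
    summary = []
    for exp_id, exp in experiments.items():
        summary.append({
            "id": exp_id,
            "name": exp.get("name", ""),
            "num_variations": len(exp.get("variation_ids", [])),
            "target_urls": [u.get("value") for u in exp.get("urls", [])],
        })
    return summary

experiments = json.loads("""{
  "4495903114": {
    "name": "100000004129417_1452199599 [A.] New York to Appoint Civilian to Monitor Police Surveillance -- [B.] Sued Over Spying on Muslims, New York Police Get Oversight",
    "variation_ids": ["4479602534", "4479602535"],
    "urls": [{"match": "simple", "value": "http://www.nytimes.com"}]
  }
}""")
```

Every visitor of the site can run exactly this kind of analysis on the script they are served.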

In total, we found 51,471 experiments on the 3,306 websites in our dataset that use Optimizely. On average, each website has approximately 15.2 experiments, and each experiment has about 2.4 possible variations. We have only scratched the surface of the interesting things sites use A/B testing for; here I’ll share a couple of the more interesting examples:

 

News publishers test the headlines users see, with differences that impact the tone of the article

A widespread use of Optimizely among news publishers is “headline testing.” To use an actual recent example from nytimes.com, a link to an article headlined:

“Turkey’s Prime Minister Quits in Rift With President”

…to a different user might appear as…

“Premier to Quit Amid Turkey’s Authoritarian Turn.”

The second headline suggests a much less neutral take on the news than the first. That sort of difference can color a user’s perception of the article before they’ve read a single word. We found other examples of politically sensitive headlines being changed, like the following from pjmedia.com:

“Judge Rules Sandy Hook Families Can Proceed with Lawsuit Against Remington”

…could appear to some users as…

“Second Amendment Under Assault by Sandy Hook Judge.”

While editorial concerns might inform how news publishers change headlines, it’s clear that a major motivation behind headline testing is the need to drive clicks. A third variation we found for the Sandy Hook headline above is the much vaguer sounding “Huge Development in Sandy Hook Gun Case.” The Wrap, an entertainment news outlet, experimented with replacing “Disney, Paramount Had Zero LGBT Characters in Movies Last Year” with the more obviously “click-baity” headline “See Which 2 Major Studios Had Zero LGBT Characters in 2015 Movies.”

We were able to identify 17 different news websites in our crawl that have done some form of headline testing in the past. This is most likely an undercount: most of these 17 websites use Optimizely’s integrations with other third-party platforms like Parse.ly and WordPress for their headline testing, which makes them easier to identify. The New York Times website, by contrast, implements its own headline testing code.
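
As the nytimes.com snippet earlier in this post shows, headline-test experiment names often embed the variant headlines directly, in the form “<id>_<timestamp> [A.] first headline -- [B.] second headline”. That makes the variants easy to pull out; the regex below is our own heuristic for that observed format, not code from any site.

```python
import re

# Headline-test experiment names observed in our data embed the variants,
# e.g. "<article id>_<timestamp> [A.] Headline one -- [B.] Headline two".
def extract_headline_variants(name):
    parts = re.split(r'\s*\[[A-Z]\.\]\s*', name)
    # parts[0] is the id/timestamp prefix; later parts are the headlines,
    # each possibly carrying a trailing " --" separator to strip off.
    return [p.rstrip().rstrip('-').rstrip() for p in parts[1:]]
```

This is essentially what our Pessimizely extension does when it shows alternative headlines on hover.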

Another limitation of what we’ve found so far is that the crawls that we analyzed only visit the homepage of each site. The OpenWPM crawler could be configured, however, to browse links from within a site’s homepage and collect data from those pages. A broader study of the practices of news publishers could use the tool to drill down deeper into news sites and study their headlines over time.

 

Websites identify and categorize users based on money and affluence

Many websites target users based on IP address and geolocation. But when IP and geolocation data are combined with notions of money, the results are surprising. The website of a popular fitness tracker targets users originating from a list of six hard-coded IP addresses labelled “IP addresses Spending more than $1000.” Two of the IP addresses appear to belong to larger enterprise customers: a medical research institute and a prominent news magazine. Three belong to unidentified Comcast customers. These big-spending IP addresses were targeted in the past with an experiment that presented the user with a button suggesting they either “learn more” about a particular product or “buy now.”

Connectify, a large vendor of networking software, uses geolocation on a coarser level — they label visitors from the US, Australia, UK, Canada, Netherlands, Switzerland, Denmark, and New Zealand as coming from “Countries that are Likely to Pay.”
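
Under the hood, this kind of targeting reduces to simple membership tests. Below is a deliberately simplified Python sketch of the logic; the condition structure and the example IP addresses (drawn from the RFC 5737 documentation range) are our own illustrations, not Optimizely’s actual schema, though the audience labels are the real ones quoted above.

```python
# Hypothetical sketch of audience matching. Real Optimizely condition
# syntax differs, but the logic reduces to membership tests like these.
BIG_SPENDER_IPS = {"203.0.113.5", "203.0.113.6"}  # illustrative addresses
LIKELY_TO_PAY = {"US", "AU", "GB", "CA", "NL", "CH", "DK", "NZ"}

def audiences_for(visitor_ip, country_code):
    audiences = []
    if visitor_ip in BIG_SPENDER_IPS:
        audiences.append("IP addresses Spending more than $1000")
    if country_code in LIKELY_TO_PAY:
        audiences.append("Countries that are Likely to Pay")
    return audiences
```

Once a visitor lands in an audience, the script simply runs whatever experiments target that audience.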

Non-profit websites also experiment with money. charity: water (charitywater.org) and the Human Rights Campaign (hrc.org) both have experiments defined to change the default donation amount a user might see in a pre-filled text box.

 

Web developers use third-party tools for more than just their intended use

A developer following the path of least resistance might use Optimizely to do other parts of their job simply because it is the easiest tool available. Some of the more exceptional “experiments” deployed by websites are simple bug-fixes, described with titles like, “[HOTFIX][Core Commerce] Fix broken sign in link on empty cart,” or “Fix- Footer links errors 404.” Other experiments betray the haphazard nature of web development, with titles like “delete me,” “Please Delete this Experiment too,” or “#Bugfix.”

We might see these unusual uses because Optimizely allows developers to edit and roll out new code with little engineering overhead. With the inclusion of one third-party script, a developer can leverage the Optimizely web interface to do a task that might otherwise take more time or careful testing. This is one example of how third parties have evolved to become integral to the functionality and development of the web, raising security and privacy concerns.

 

The need for transparency

Much of the web is curated by inscrutable algorithms running on servers, and a concerted research effort is needed to shed light on the less-visible practices of websites. Thanks to the Optimizely platform we can at least peek into that secret world.

We believe, however, that transparency should be the default on the web — not the accidental product of one third party’s engineering decisions. Privacy policies are a start, but they generally cover a website’s data collection and third-party usage only at a coarse level. The New York Times Privacy Policy, for instance, does not even suggest that headline testing is something the company might do, despite how drastically it could alter your consumption of the news. If websites had to publish more information about which third parties they use and how they use them, regulators could use that information to better protect consumers on the web. Considering the potentially harmful effects of how websites might use third parties, more transparency and oversight is essential.

 

@dillonthehuman


* This was a conversation a year ago, when Jonathan was a grad student at Stanford.


The Princeton Web Census: a 1-million-site measurement and analysis of web privacy

Web privacy measurement — observing websites and services to detect, characterize, and quantify privacy-impacting behaviors — has repeatedly forced companies to improve their privacy practices due to public pressure, press coverage, and regulatory action. In previous blog posts I’ve analyzed why our 2014 collaboration with KU Leuven researchers studying canvas fingerprinting was successful, and discussed why repeated, large-scale measurement is necessary.

Today I’m pleased to release initial analysis results from our monthly, 1-million-site measurement. This is the largest and most detailed measurement of online tracking to date, including measurements for stateful (cookie-based) and stateless (fingerprinting-based) tracking, the effect of browser privacy tools, and “cookie syncing”.  These results represent a snapshot of web tracking, but the analysis is part of an effort to collect data on a monthly basis and analyze the evolution of web tracking and privacy over time.

Our measurement platform used for this study, OpenWPM, is already open source. Today, we’re making the datasets for this analysis available for download by the public. You can find download instructions on our study’s website.

New findings

We provide background information and summary of each of our main findings on our study’s website. The paper goes into even greater detail and provides the methodological details on the measurement and analysis of each finding. One of our more surprising findings was the discovery of two apparent attempts to use the HTML5 Audio API for fingerprinting.

The figure is a visualization of the audio processing executed on users’ browsers by third-party fingerprinting scripts. We found two different AudioNode configurations in use. In both configurations an audio signal is generated by an oscillator and the resulting signal is hashed to create an identifier. Initial testing shows that the techniques may have some limitations when used for fingerprinting, but further analysis is necessary. You can help us with that (and test your own device) by using our demonstration page here.
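
To make the technique concrete, here is a rough Python sketch of the idea: generate a deterministic “oscillator” signal, render it to samples, and hash the result into an identifier. The real scripts do this in the browser with the Web Audio API, where subtle device-specific differences in audio processing perturb the rendered samples and make the hash vary across machines; in pure Python the output is identical everywhere, so everything below is purely illustrative.

```python
import hashlib
import math

# Conceptual sketch of audio fingerprinting: a sine "oscillator" is
# rendered to samples, and the samples are hashed into an identifier.
def audio_fingerprint(sample_rate=44100, freq=1000.0, n_samples=4096):
    samples = [
        math.sin(2 * math.pi * freq * i / sample_rate)
        for i in range(n_samples)
    ]
    rendered = ",".join("%.8f" % s for s in samples)
    return hashlib.sha256(rendered.encode()).hexdigest()
```

In the browser, the interesting part is precisely that two devices running the same code can produce different hashes.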

See the paper for our analysis of a consolidated third-party ecosystem, the effects of third parties on HTTPS adoption, and the performance of tracking-protection tools. In addition to audio fingerprinting, we show that canvas fingerprinting is being used by more third parties but on fewer sites; that a WebRTC feature can be, and is being, used for tracking; and how the HTML canvas is being used to discover users’ fonts.

What’s next? We are exploring ways to share our data and analysis tools in a form that’s useful to a wider and less technical audience. As we continue to collect data, we will also perform longitudinal analyses of web tracking. In other ongoing research, we’re using the data we’ve collected to train machine-learning models to automatically detect tracking and fingerprinting.


Is Tesla Motors a Hidden Warrior for Consumer Digital Privacy?

Amid the privacy intrusions of modern digital life, few are as ubiquitous and alarming as those perpetrated by marketers. The economics of the entire industry are built on tools that exist in shadowy corners of the Internet and lurk about while we engage with information, products and even friends online, harvesting our data everywhere our mobile phones and browsers dare to go.

This digital marketing model, developed three decades ago and premised on the idea that it’s OK for third parties to gather our private data and use it in whatever way suits them, will grow into a $77 billion industry in the U.S. this year, up from $57 billion in 2014, according to Forrester Research.

Storm clouds are developing around the industry, however, and there are new questions being raised about the long-term viability of surreptitious data-gathering as a sustainable business model. Two factors are typically cited: Regulators in Europe have begun, and those in the U.S. are poised to begin, reining in the most intrusive of these marketing practices; and the growth of the mobile Internet, and the related reliance on apps rather than browsers for 85% of our mobile online activity, have made it more difficult to gather user data.

Then there is Tesla Motors and its advertising-averse marketing model, which does not use third-party data to raise awareness of and interest in its brand, drive desire for its products, or spur action by its customers. Instead, the electric carmaker relies on cultural branding, a concept popularized recently by Douglas Holt, formerly of the Harvard Business School, to do much of the marketing heavy lifting that brought it to the top of the electric vehicle market. And while Tesla is not the only brand engaging digital crowd culture and shunning third-party data-gathering, its success is causing the most consternation within the ranks of intrusion marketers.

[Read more…]


The Interconnection Measurement Project

Building on the March 11 release of the “Revealing Utilization at Internet Interconnection Points” working paper, today, CITP is excited to announce the launch of the Interconnection Measurement Project. This unprecedented initiative includes the launch of a project-specific website and the ongoing collection, analysis, and release of capacity and utilization data from ISP interconnection points. CITP’s Interconnection Measurement Project uses the same method that I detailed in the working paper and includes the participation of seven ISPs—Bright House Networks, Comcast, Cox, Mediacom, Midco, Suddenlink, and Time Warner Cable.

The project website—which we aim to update regularly—includes additional views of the data that are not included in the working paper. The visualizations are organized into three categories: (1) Aggregate Views; (2) Regional Views; and (3) Views by Interconnect. The Aggregate Views provide peak utilization, growth in capacity and usage, and the distribution of peak utilization across interconnects and across participating ISPs, on a monthly basis across the entire data set. The Regional Views provide monthly peak utilization by region and the distribution of peak utilization across interconnects by region. Finally, the Views by Interconnect provide daily per-link utilization statistics, as well as the distribution of peak utilization by link and by capacity, also on a monthly basis. The website visualizations also include an additional month of data (March 2016) beyond what the original working paper included. CITP plans to regularly update the visualizations with new data to provide a picture of how the Internet is evolving, and we will assess the project annually to ensure that the data, reports, and insights that we offer remain relevant.

The March data is consistent with the initial findings detailed in the working paper: that many interconnects have significant spare capacity, that this spare capacity exists both across ISPs in each region and in aggregate for any individual ISP, and that the aggregate utilization across interconnects is roughly 50 percent during peak periods.

The seven participating ISPs collectively account for about 50 percent of all US broadband subscribers. We at CITP hope that these ISPs are merely the pioneers of what may eventually become a much larger effort. As we continue to advance this field of research and deepen our understanding of traffic characteristics at interconnection points, we welcome the participation of even more ISPs as well as other network operators and edge providers in this important effort.


Apple Encryption Saga and Beyond: What U.S. Courts Can Learn from Canadian Caselaw

It has been said that privacy is “at risk of becoming a real human right.” The exponential increase of personal information in the hands of organizations, particularly sensitive data, creates a significant rise in the perils accompanying formerly negligible privacy incidents. At one time considered too intangible to merit even token compensation, risks of harm to privacy interests have become so ubiquitous in the past three years that they require special attention.

Legal and social changes have for their part also increased potential privacy liability for private and public entities when they promise – and fail – to guard our personal data (think Ashley Madison…). First among those changes has been the emergence of a “privacy culture” — a process bolstered by the trickle-down effect of Julia Angwin’s investigative series titled “What They Know,” and the heightened attention that the mainstream media now attaches to privacy incidents. Second, courts in various common law jurisdictions are beginning to recognize intangible privacy harms and have been increasingly willing to certify class action lawsuits for privacy infringements that previously would have been summarily dismissed without hesitation.

Prior to 2012, it was difficult to find examples of judicially recognized losses arising from privacy breaches. Since then, however, the legal environment in common law jurisdictions, and in Canada in particular, has changed dramatically. Claims related to privacy mishaps are now commonplace, and there has been an exponential multiplication in the number of matters involving inadvertent communication or improper disposal of personal data, portable devices, and cloud computing.
[Read more…]


The Defend Trade Secrets Act and Whistleblowers

As Freedom to Tinker readers know, I’ve been an active opponent of the federal Defend Trade Secrets Act (DTSA). Though my position on the DTSA remains unchanged, I was both surprised and pleased to see that the revised Defend Trade Secrets Act now includes a narrow, but potentially useful, provision intended to protect whistleblowers from trade secret misappropriation actions.

As attendees at yesterday’s wonderful CITP talk by Bart Gellman were fortunate to hear, whistleblowing remains a critical but imperfect tool of public access to the internal operations of our institutions, from corporations to government. Trade secrecy operates in the opposite direction, and has the robust ability to thwart regulation, limit public accountability, and criminalize whistleblowing. I’ve regularly called trade secrecy the most powerful intellectual property law (IP) tool of information control, as it prevents not just use of, but access to and even knowledge about the very existence of information. Indeed, it surpasses other IP law in that power by a wide margin. Thus, if the DTSA is moving forward, the inclusion of even a limited whistleblower exception in the DTSA is a good thing.

Nonetheless, it is very important to recognize what this provision won’t achieve. As written, the provision prevents liability under federal and state trade secret law for “the disclosure of a trade secret that … is made … in confidence to a Federal, State, or local government official, either directly or indirectly, or to an attorney; and … solely for the purpose of reporting or investigating a suspected violation of law; or … is made in a complaint or other document filed in a lawsuit or other proceeding, if such filing is made under seal.” Thus, as written, the provision does not appear to immunize sharing trade secret information with the press or the public at large. As Gellman’s work has shown, the press is often the first and only avenue for access to critical information about our public and private black boxes.

[Read more…]


Internet Voting? Really?

Recently I gave a TEDx talk—I spoke at the local Princeton University TEDx event. My topic was voting: America’s voting systems in the 19th and 20th centuries, and whether we should vote using the Internet. You can see the talk here:

 

Internet Voting? Really?

 


On distracted driving and required phone searches

A recent Arstechnica article discussed several U.S. states that are considering adding a “roadside textalyzer” that operates analogously to roadside Breathalyzer tests. In the same way that alcohol and drugs can impair a driver’s ability to navigate the road, so can paying attention to your phone rather than the world beyond. Many states “require” drivers to consent to Breathalyzer tests, where that “requirement” boils down to serious penalties if the driver declines. Vendors like Cellebrite are pushing for analogous requirements, for which they just happen to sell products.
[Read more…]


Gone In Six Characters: Short URLs Considered Harmful for Cloud Services

[This is a guest post by Vitaly Shmatikov, professor at Cornell Tech and once upon a time my adviser at the University of Texas at Austin. — Arvind Narayanan.]

TL;DR: short URLs produced by bit.ly, goo.gl, and similar services are so short that they can be scanned by brute force.  Our scan discovered a large number of Microsoft OneDrive accounts with private documents.  Many of these accounts are unlocked and allow anyone to inject malware that will be automatically downloaded to users’ devices.  We also discovered many driving directions that reveal sensitive information for identifiable individuals, including their visits to specialized medical facilities, prisons, and adult establishments.

URL shorteners such as bit.ly and goo.gl perform a straightforward task: they turn long URLs into short ones, consisting of a domain name followed by a 5-, 6-, or 7-character token.  This simple convenience feature turns out to have an unintended consequence.  The tokens are so short that the entire set of URLs can be scanned by brute force.  The actual, long URLs are thus effectively public and can be discovered by anyone with a little patience and a few machines at her disposal.
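
Some rough arithmetic shows why. Assuming tokens drawn from the 62-character alphanumeric alphabet (the exact alphabet varies by service, so treat this as a back-of-the-envelope sketch), the keyspace sizes are:

```python
# Back-of-the-envelope token-space arithmetic for short URLs,
# assuming tokens over [a-zA-Z0-9] (62 characters).
ALPHABET_SIZE = 62

def keyspace(token_length):
    return ALPHABET_SIZE ** token_length

for n in (5, 6, 7):
    print(n, keyspace(n))

# 62**6 is about 5.7e10 tokens: at 1,000 requests per second, a single
# machine would cover the six-character space in under two years, and a
# few hundred machines bring that down to days.
```

This is what “a little patience and a few machines at her disposal” means in numbers.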

Today, we are releasing our study, 18 months in the making, of what URL shortening means for the security and privacy of cloud services.  We did not perform a comprehensive scan of all short URLs (as our analysis shows, such a scan would have been within the capabilities of a more powerful adversary), but we sampled enough to discover interesting information and draw important conclusions.  Our study focused on two cloud services that directly integrate URL shortening: Microsoft OneDrive cloud storage (formerly known as SkyDrive) and Google Maps.  In both cases, whenever a user wants to share a link to a document, folder, or map with another user, the service offers to generate a short URL – which, as we show, unintentionally makes the original URL public.
[Read more…]


Why Making Johnny’s Key Management Transparent is So Challenging

In light of the ongoing debate about the importance of using end-to-end encryption to protect our data and communications, several tech companies have announced plans to increase the encryption in their services. However, this isn’t a new pledge: since 2014, Google and Yahoo have been working on a browser plugin to facilitate sending encrypted emails using their services. Yet in recent weeks, some have criticized the fact that only alpha releases of these tools exist, and have started asking why they’re still a work in progress.

One of the main challenges to building usable end-to-end encrypted communication tools is key management. Services such as Apple’s iMessage have made encrypted communication available to the masses with an excellent user experience because Apple manages a directory of public keys in a centralized server on behalf of their users. But this also means users have to trust that Apple’s key server won’t be compromised or compelled by hackers or nation-state actors to insert spurious keys to intercept and manipulate users’ encrypted messages. The alternative, and more secure, approach is to have the service provider delegate key management to the users so they aren’t vulnerable to a compromised centralized key server. This is how Google’s End-To-End works right now. But decentralized key management means users must “manually” verify each other’s keys to be sure that the keys they see for one another are valid, a process that several studies have shown to be cumbersome and error-prone for the vast majority of users. So users must make the choice between strong security and great usability.
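
For concreteness, “manual” verification typically means each user computes a short fingerprint (a hash) of the other’s public key, and the two compare the strings out of band, say by reading them aloud. A minimal sketch of that process, with the grouping format chosen purely for illustration:

```python
import hashlib
import hmac

# Sketch of manual key verification: each party fingerprints the other's
# public key, and the users compare fingerprints out of band.
def fingerprint(pubkey_bytes):
    digest = hashlib.sha256(pubkey_bytes).hexdigest()
    # Group the first 16 hex characters into 4-character chunks so a
    # human can read the fingerprint aloud (grouping format is ours).
    return " ".join(digest[i:i + 4] for i in range(0, 16, 4))

def keys_match(fp_seen_locally, fp_read_aloud):
    # Constant-time comparison to avoid leaking how much matched.
    return hmac.compare_digest(fp_seen_locally, fp_read_aloud)
```

The studies cited above suggest that even this small amount of ceremony is where most users stumble, which is exactly the gap CONIKS aims to close.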

In August 2015, we published our design for CONIKS, a key management system that addresses these usability and security issues. CONIKS makes the key management process transparent and publicly auditable. To evaluate the viability of CONIKS as a key management solution for existing secure communication services, we held design discussions with experts at Google, Yahoo, Apple and Open Whisper Systems, primarily over the course of 11 months (Nov ‘14 – Oct ‘15). From our conversations, we learned about the open technical challenges of deploying CONIKS in a real-world setting, and gained a better understanding of why implementing a transparent key management system isn’t a straightforward task.
[Read more…]