December 13, 2024

Watching You Watch: The Tracking Ecosystem of Over-the-Top TV Streaming Devices

By Hooman Mohajeri Moghaddam, Gunes Acar, Ben Burgess, Arunesh Mathur, Danny Y. Huang, Nick Feamster, Ed Felten, Prateek Mittal, and Arvind Narayanan

By 2020, an estimated one third of US households will have “cut the cord”, i.e., discontinued their multichannel TV subscriptions and switched to internet-connected streaming services. Over-the-Top (“OTT”) streaming devices such as Roku and Amazon Fire TV, which currently sell for between $30 and $100, are cheap alternatives to smart TVs for cord-cutters. Instead of charging more for the hardware or the membership, Roku and Amazon Fire TV monetize their platforms through advertisements, which rely on tracking users’ viewing habits.

Although tracking of users on the web and on mobile is well studied, tracking on smart TVs and OTT devices has remained unexplored. To address this gap, we conducted the first study of tracking on OTT platforms. In a paper that we will present at the ACM CCS 2019 conference, we found that: 

  • Major online trackers such as Google and Facebook are also highly prominent in the OTT ecosystem. However, OTT channels also contain niche and lesser known trackers such as adrise.tv and monarchads.com.
  • The information shared with tracker domains includes video titles (see Figure 1), channel names, permanent device identifiers and wireless SSIDs.
  • Countermeasures made available to users are ineffective at preventing tracking.
  • Roku had a vulnerability that allowed malicious web pages visited by Roku users to geolocate users, read device identifiers and install channels without their consent.
Figure 1. The AsianCrush channel on Roku sends the device ID and video title to the online video advertising platform spotxchange.com.

Method and Findings:

Similar to how Android or iOS supports third-party apps, Amazon and Roku support third-party applications known as channels, ranging from popular channels like Netflix and CNN to several obscure ones.

Automation is one of the main challenges of studying how these channels track users. Tools that automate interaction with web pages (such as Selenium) do not exist for OTT platforms. To address this challenge, we developed a system that can automatically download OTT channels and interact with them, all while intercepting the network traffic and performing best-effort TLS interception. We describe the different components of our tool in the Appendix. Using this crawler, we collected data from the top 1000 channels on both the Roku and the Amazon Fire TV channel stores.

The distribution of trackers by channel category and rank is shown in Figure 2. On Roku, channels in the “Games” category contact the most trackers: nine of the top ten channels (ordered by the number of trackers) are categorized as game channels. On the other hand, five of the ten Fire TV channels with the most trackers are “News” channels, where the top three channels contact close to 60 tracker domains each. Below we summarize our findings:

Figure 2. Distribution of trackers by channel ranks and channel categories.

Google and Facebook are among the most popular trackers

Google and Facebook domains (doubleclick.net, google-analytics.com, googlesyndication.com and facebook.com) are among the most prevalent trackers in the OTT channels on both platforms we studied. Google’s doubleclick.net appeared on 975 of the top 1000 Roku channels, while amazon-adsystem.com appeared on 687 of the top 1000 Amazon Fire TV channels.

Table 1. Most prevalent trackers on top 1000 channels on Roku (left) and Amazon (right).

User and device identifiers shared with trackers

Trackers have access to a wide range of device and user identifiers on OTT platforms. Some of these identifiers can be reset by users (e.g., Advertising IDs), while others are permanent (e.g., serial numbers, MAC addresses). To detect the identifiers shared with trackers, we followed the method described by Englehardt et al. to search for device and user identifiers in the network traffic of the top 1000 channels for each platform. This allowed us to detect leaks even when the identifiers were encoded or hashed. An overview of the leaked IDs on each platform is given in Table 2.
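The search for encoded or hashed identifiers can be illustrated with a minimal sketch. It is not our crawler's code: the set of encodings and hashes below (URL encoding, Base64, MD5, SHA-1, SHA-256), the function names, and the example serial number are all assumptions chosen for illustration, and only a single layer of encoding is applied, whereas a thorough search would also try nested combinations and case variants.

```python
import base64
import hashlib
from urllib.parse import quote


def candidate_tokens(identifier: str) -> dict[str, str]:
    """Return {token: label} for each encoded or hashed form of the identifier."""
    raw = identifier.encode("utf-8")
    forms = [
        ("plain", identifier),
        ("urlencoded", quote(identifier, safe="")),
        ("base64", base64.b64encode(raw).decode("ascii")),
    ]
    # Assumed hash set; a real search would likely cover more algorithms.
    for algo in ("md5", "sha1", "sha256"):
        forms.append((algo, hashlib.new(algo, raw).hexdigest()))

    tokens: dict[str, str] = {}
    for label, token in forms:
        # Some encodings leave the identifier unchanged; keep the first label.
        tokens.setdefault(token, label)
    return tokens


def find_leaks(identifier: str, payload: str) -> list[str]:
    """Return the labels of every form of the identifier found in the payload.

    `payload` stands for whatever text was captured for one request,
    for example the URL, headers and body concatenated.
    """
    return [
        label
        for token, label in candidate_tokens(identifier).items()
        if token in payload
    ]


if __name__ == "__main__":
    serial = "YH009E000001"  # hypothetical serial number
    hashed = hashlib.sha1(serial.encode("utf-8")).hexdigest()
    request_url = "http://tracker.example/pixel?device=" + hashed
    print(find_leaks(serial, request_url))
```

Matching here is case-sensitive substring search, so an uppercase hex digest would be missed; that is one of the simplifications of this sketch.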

Table 2. Overview of identifier and information leakage detected in the Roku (left) and the FireTV (right) crawls.

Channels share video titles with third-party trackers

Out of 100 randomly selected channels on Roku and Amazon, we found 9 channels on Roku (e.g., “CBS News” and “News 5 Cleveland WEWS”) and 14 channels on Fire TV (e.g., “NBC News” and “Travel Channel”) that leaked the title of the video to a tracking domain. On Roku, all video titles were leaked over unencrypted connections, exposing user video history to eavesdroppers. On Fire TV, only two channels (NBC News and WRAL) used an unencrypted connection when sending the title to tracking domains.

An overwhelming majority of channels use unencrypted connections

Out of the 1000 channels we studied on each platform, 794 channels on Roku and 762 on Amazon Fire TV had at least one unencrypted HTTP session, potentially exposing users’ information and identities to network adversaries.

Countermeasures

OTT platforms provide privacy options that purport to limit tracking on their devices: “Limit Ad Tracking” on Roku and “Disable Interest-based Ads” on Amazon Fire TV. Our measurements show that these privacy options fall short of preventing tracking. Turning on these options did not change the number of trackers contacted. Turning on “Limit Ad Tracking” on Roku reduced the number of advertising ID leaks from 390 to zero, but did not change the number of serial number leaks.

Roku Remote Control API Vulnerability

To investigate other ways OTT devices may compromise user privacy and security, we analyzed local API endpoints of Roku and Fire TV. OTT devices expose such interfaces to enable debugging, remote control, and home automation by mobile apps and other automation software. We discovered a vulnerability in Roku’s remote control API that allows an attacker to:

  • send commands to install/uninstall/launch channels and collect unique identifiers from Roku devices – even when the connected display is turned off.
  • geolocate Roku users via the SSID of the wireless network.
  • extract MAC address, serial number, and other unique identifiers to track users or respawn tracking identifiers (similar to evercookies).
  • get the list of installed channels and use it for profiling purposes.

We reported the vulnerability to Roku in December 2018. Roku addressed the issue and finalized rolling out their security fix by March 2019.
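For readers unfamiliar with this kind of local interface: on Roku, remote control apps talk to the device through the External Control Protocol (ECP), an HTTP service that the device exposes on the local network on port 8060. The sketch below only shows what a thin wrapper around a few documented ECP endpoints might look like. It builds request descriptions and parses a channel list, but sends nothing over the network. The class and function names, the IP address, and the sample XML are assumptions made for illustration, not code from our crawler.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

ECP_PORT = 8060  # port on which Roku devices serve the External Control Protocol


@dataclass(frozen=True)
class EcpRequest:
    """A description of one HTTP request to the device (hypothetical helper type)."""
    method: str
    url: str


def ecp_request(device_ip: str, action: str, argument: str = "") -> EcpRequest:
    """Build the request for one remote control action.

    `argument` is a channel ID for launch/install, or a key name for keypress.
    """
    routes = {
        "device_info": ("GET", "/query/device-info"),
        "list_channels": ("GET", "/query/apps"),
        "launch": ("POST", f"/launch/{argument}"),
        "install": ("POST", f"/install/{argument}"),
        "keypress": ("POST", f"/keypress/{argument}"),
    }
    method, path = routes[action]
    return EcpRequest(method, f"http://{device_ip}:{ECP_PORT}{path}")


def parse_channel_list(xml_text: str) -> dict[str, str]:
    """Turn a /query/apps response into {channel_id: channel_name}."""
    root = ET.fromstring(xml_text)
    return {app.get("id"): (app.text or "").strip() for app in root.iter("app")}


if __name__ == "__main__":
    # Made-up device address and channel data, for illustration only.
    print(ecp_request("192.168.1.50", "keypress", "Home"))
    sample = '<apps><app id="12345" type="appl" version="1.0">Example Channel</app></apps>'
    print(parse_channel_list(sample))
```

Because these requests are plain HTTP calls to a device on the local network, the privacy of the interface depends on which clients are allowed to reach it.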

Going forward

Our research shows that users, who are already being pervasively tracked on the web and mobile, face another set of privacy-intrusive tracking practices when using their OTT streaming platforms. A combination of technical and policy solutions can be considered when addressing these privacy and security issues. OTT platforms should offer better privacy controls, similar to the Incognito/Private Browsing Mode of modern web browsers. Insecure connections should be disincentivized by platform policies. For example, clear-text connections should be blocked unless an exception is requested by the channel. Regulators and policy makers should ensure that the privacy protections available for brick-and-mortar video rental services, such as the Video Privacy Protection Act (VPPA), are updated to cover emerging OTT platforms.

Appendix

Crawler architecture:

We set out to build a crawler to study tracking and privacy practices of OTT channels at scale. Our crawler installs a channel, launches it, and attempts to view a video on the channel, while collecting network traffic and attempting “best-effort” TLS interception. The crawler consists of a number of different hardware devices:

  • A desktop machine connected to the Internet acts as a wireless access point (AP).
  • An OTT stick connects to the Internet via the WiFi AP provided by the desktop machine. It also connects to a TV through an HDMI Capture and Split Card to sidestep the HDCP protections.

The desktop machine orchestrates our crawls and has the following software components:

  • Automatic interaction engine:
    • Remote Control API: OTT platforms provide an API to enable remote control apps to send commands such as switching or installing channels. We wrote our own wrappers for both Roku and Amazon Fire TV’s remote APIs.
    • Audio/Video processing: We process the audio from the OTT device on the desktop machine and use it to detect video playback, which guides our automatic interaction with channels. Video input is also saved as screenshots for post-processing and validation.
  • Network Capture: We collect network traffic of the OTT devices as pcap files and dump all DNS transactions in a Redis database.
  • TLS interception: We use mitmproxy to perform “best-effort” TLS interception. For each channel and each new TLS endpoint, we attempt to intercept the traffic using a self-signed certificate. If the interception fails, we add the endpoint to a no-intercept list to avoid further interception attempts. On Amazon Fire TV, we manage to root the device using a previously known vulnerability, and install mitmproxy’s self-signed certificate on the device certificate store. In addition, we use Frida to bypass certificate pinning.
Figure 3. Overview of our smart crawler.
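The “best-effort” interception logic described above can be sketched as simple bookkeeping: try to intercept each new TLS endpoint once per channel, and pass traffic through untouched after a failure. The sketch below is an assumption-laden illustration, not our implementation. It does not use mitmproxy's API, the class and function names are hypothetical, and the outcome of the TLS handshake is supplied by the caller instead of being observed on the wire.

```python
from dataclasses import dataclass, field


@dataclass
class InterceptPolicy:
    """Remembers, per channel, which TLS endpoints rejected our certificate."""
    no_intercept: dict[str, set[str]] = field(default_factory=dict)

    def should_intercept(self, channel: str, endpoint: str) -> bool:
        """Intercept unless this endpoint already failed for this channel."""
        return endpoint not in self.no_intercept.get(channel, set())

    def record_failure(self, channel: str, endpoint: str) -> None:
        """Add the endpoint to the channel's no-intercept list."""
        self.no_intercept.setdefault(channel, set()).add(endpoint)


def handle_connection(
    policy: InterceptPolicy, channel: str, endpoint: str, handshake_ok: bool
) -> str:
    """Decide what happens to one TLS connection.

    `handshake_ok` says whether the client accepted the intercepting
    certificate; in a real proxy this would come from the TLS handshake.
    """
    if not policy.should_intercept(channel, endpoint):
        return "passthrough"  # earlier attempt failed, leave traffic alone
    if handshake_ok:
        return "intercepted"
    policy.record_failure(channel, endpoint)
    return "failed"


if __name__ == "__main__":
    policy = InterceptPolicy()
    # Hypothetical channel and endpoint names.
    for attempt in range(2):
        print(handle_connection(policy, "example-channel", "api.example.com:443", False))
```

Keying the list by channel means an endpoint that fails for one channel is still tried for the next one, which matches the per-channel wording above. A single global list would be a reasonable alternative if endpoints behave the same way across channels.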

Comments

  1. James Cridland says

    Even watching DNS requests from devices is interesting. https://imgur.com/gallery/uGRfPSS – shows my Fetch box (an Australian service) which has been turned off all day. It seems awfully keen to chat to Netflix’s servers. It’s made over 450 calls to Netflix in the past day.

    My Telstra TV box, a rebadged Roku device, similarly – https://imgur.com/gallery/cf3v6tx – quite likes Netflix too. That hasn’t been turned on for a month or so: and I don’t remember using the Netflix integration on it for a number of months. It’s made over 200 calls in the past day to Netflix.

  2. How about isolating the ONLY connections required for video delivery so we can block the Roku’s internet access except for those required connections. You don’t list the domains we need to do that.

    I currently use the Roku blocklist circulating the web, but it doesn’t list 64 domains.

  3. Was there some reason AppleTV wasn’t included in the study?

    • That’s my question, too. Does Apple TV (the hardware device) provide inherently superior privacy protection? Do its security measures make it harder to compromise for the purpose of implementing the crawler?

      • We are planning to include a number of other platforms, including Apple TV, in our future work. Each platform comes with its own set of remote control APIs that has to be integrated into our crawler. Depending on how the API is implemented and what functionalities it provides, the integration process might differ. For instance, in Roku channel store, one can refer to channels with a unique channel ID and install or launch them using the channel ID, but in Amazon Fire TV, the user has to either side-load the APK or search for the channel on the device.

  4. The answer to “who will watch the watchers”.