September 24, 2018

Internet of Things in Context: Discovering Privacy Norms with Scalable Surveys

by Noah Apthorpe, Yan Shvartzshnaider, Arunesh Mathur, Nick Feamster

Privacy concerns surrounding disruptive technologies such as the Internet of Things (and, in particular, connected smart home devices) have been prevalent in public discourse, with privacy violations from these devices occurring frequently. As these new technologies challenge existing societal norms, determining the bounds of “acceptable” information handling practices requires rigorous study of user privacy expectations and normative opinions towards information transfer.

To better understand user attitudes and societal norms concerning data collection, we have developed a scalable survey method for empirically studying privacy in context.  This survey method uses (1) a formal theory of privacy called contextual integrity and (2) combinatorial testing at scale to discover privacy norms. In our work, we have applied the method to better understand norms concerning data collection in smart homes. The general method, however, can be adapted to arbitrary contexts with varying actors, information types, and communication conditions, paving the way for future studies informing the design of emerging technologies. The technique can provide meaningful insights about privacy norms for manufacturers, regulators, researchers and other stakeholders.  Our paper describing this research appears in the Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies.

Scalable CI Survey Method

Contextual integrity. The survey method applies the theory of contextual integrity (CI), which frames privacy in terms of the appropriateness of information flows in defined contexts. CI offers a framework to describe flows of information (attributes) about a subject from a sender to a receiver, under specific conditions (transmission principles).  Changing any of these parameters of an information flow could result in a violation of privacy.  For example, a flow of information about your web searches from your browser to Google may be appropriate, while the same information flowing from your browser to your ISP might be inappropriate.

Combinatorial construction of CI information flows. The survey method discovers privacy norms by asking users about the acceptability of a large number of information flows that we automatically construct using the CI framework. Because the CI framework effectively defines an information flow as a tuple (attributes, subject, sender, receiver, and transmission principle), we can automate the process of constructing information flows by defining a range of parameter values for each tuple and generating a large number of flows from combinations of parameter values.

Applying the Survey Method to Discover Smart Home Privacy Norms

We applied the survey method to 3,840 IoT-specific information flows involving a range of device types (e.g., thermostats, sleep monitors), information types (e.g., location, usage patterns), recipients (e.g., device manufacturers, ISPs) and transmission principles (e.g., for advertising, with consent). 1,731 Amazon Mechanical Turk workers rated the acceptability of these information flows on a 5-point scale from “completely unacceptable” to “completely acceptable”.

Trends in acceptability ratings across information flows indicate which context parameters are particularly relevant to privacy norms. For example, the following heatmap shows the average acceptability ratings of all information flows with pairwise combinations of recipients and transmission principles.

Average acceptability scores of information flows with given recipient/transmission principle pairs.

Average acceptability scores of information flows with given recipient/transmission principle pairs. For example, the top left box shows the average acceptability score of all information flows with the recipient “its owner’s immediate family” and the transmission principle “if its owner has given consent.” Higher (more blue) scores indicate that flows with the corresponding parameters are more acceptable, while lower (more red) scores indicate that the flows are less acceptable. Flows with the null transmission principle are controls with no specific condition on their occurrence. Empty locations correspond to less intuitive information flows that were excluded from the survey. Parameters are sorted by descending average acceptability score for all information flows containing that parameter.

These results provide several insights about IoT privacy, including the following:

  • Advertising and Indefinite Data Storage Generally Violate Privacy Norms. Respondents viewed information flows from IoT devices for advertising or for indefinite storage as especially unacceptable. Unfortunately, advertising and indefinite storage remain standard practice for many IoT devices and cloud services.
  • Transitive Flows May Violate Privacy Norms. Consider a device that sends its owner’s location to a smartphone, and the smartphone then sends the location to a manufacturer’s cloud server. This device initiates two information flows: (1) to the smartphone and (2) to the phone manufacturer. Although flow #1 may conform to user privacy norms, flow #2 may violate norms. Manufacturers of devices that connect to IoT hubs (often made by different companies), rather than directly to cloud services, should avoid having these devices send potentially sensitive information with greater frequency or precision than necessary.

Our paper expands on these findings, including more details on the survey method, additional results, analyses, and recommendations for manufacturers, researchers, and regulators.

We believe that the survey method we have developed is broadly applicable to studying societal privacy norms at scale and can thus better inform privacy-conscious design across a range of domains and technologies.