April 26, 2018

Announcing IoT Inspector: Studying Smart Home IoT Device Behavior

By Noah Apthorpe, Danny Y. Huang, Gunes Acar, Frank Li, Arvind Narayanan, Nick Feamster

An increasing number of home devices, from thermostats to light bulbs to garage door openers, are now Internet-connected. This “Internet of Things” (IoT) promises reduced energy consumption, more effective health management, and living spaces that react adaptively to users’ lifestyles. Unfortunately, recent IoT device hacks and personal data breaches have made security and privacy a focal point for IoT consumers, developers, and regulators.

Many IoT vulnerabilities sound like the plot of a science fiction dystopia. Internet-connected dolls allow strangers to spy on children remotely. Botnets of millions of security cameras and DVRs take down a global DNS service provider. Surgically implanted pacemakers are susceptible to remote takeover.

These security vulnerabilities, combined with the rapid evolution of IoT products, can leave consumers at risk, and in the dark about the risks they face when using these devices. For example, consumers may be unsure which companies receive personal information from IoT appliances, whether an IoT device has been hacked, or whether devices with always-on microphones listen to private conversations.

To shed light on the behavior of smart home IoT devices that consumers buy and install in their homes, we are announcing the IoT Inspector project.

Announcing IoT Inspector: Studying IoT Security and Privacy in Smart Homes

Today, at the Center for Information Technology Policy at Princeton, we are launching an ongoing initiative to study consumer IoT security and privacy, in an effort to understand the current state of smart home security and privacy in ways that ultimately help inform both technology and policy.

We have begun this effort by analyzing more than 50 home IoT devices ourselves. We are working on methods to help scale this analysis to more devices. If you have a particular device or type of device that you are concerned about, let us know. To learn more, visit the IoT Inspector website.

Our initial analyses have revealed several findings about home IoT security and privacy.


Ethics Education in Data Science

Data scientists in academia and industry are increasingly recognizing the importance of integrating ethics into data science curricula. Recently, a group of faculty and students gathered at New York University before the annual FAT* conference to discuss the promises and challenges of teaching data science ethics, and to learn from one another’s experiences in the classroom. This blog post is the first of two that will summarize the discussions at this workshop.

There is general agreement that data science ethics should be taught, but less consensus about what its goals should be or how they should be pursued. Because the field is so nascent, there is substantial room for innovative thinking about what data science ethics ought to mean. In some respects, its goal may be the creation of “future citizens” of data science who are invested in the welfare of their communities and the world, and understand the social and political role of data science therein. But there are other models, too: for example, an alternative goal is to equip aspiring data scientists with technical tools and organizational processes for doing data science work that aligns with social values (like privacy and fairness). The group worked to identify some of the biggest challenges in this field, and when possible, some ways to address these tensions.

One approach to data science ethics education is including a standalone ethics course in the program’s curriculum. Another option is embedding discussions of ethics into existing courses in a more integrated way. There are advantages and disadvantages to both options. Standalone ethics courses may attract a wider variety of students from different disciplines than technical classes alone, which provides potential for rich discussions. They allow professors to cover basic normative theories before diving into specific examples without having to skip the basic theories or worry that students covered them in other course modules. Independent courses about ethics do not necessarily require cooperation from multiple professors or departments, making them easier to organize. However, many worry that teaching ethics separately from technical topics may marginalize ethics and make students perceive it as unimportant. Further, standalone courses can either be elective or mandatory. If elective, they may attract a self-selecting group of students, potentially leaving out other students who could benefit from exposure to the material; mandatory ethics classes may be seen as displacing other technical training students want and need. Embedding ethics within existing CS courses may avoid some of these problems and can also elevate the discourse around ethical dilemmas by ensuring that students are well-versed in the specific technical aspects of the problems they discuss.

Beyond course structure, ethics courses can be challenging for data science faculty to teach effectively. Many students used to more technical course material are challenged by the types of learning and engagement required in ethics courses, which are often reading-heavy. And the “answers” in ethics courses are almost never clear-cut. The lack of clear answers or easily constructed rubrics can complicate grading, since both students and faculty in computer science may be used to grading based on more objective criteria. However, this problem is certainly not insurmountable – humanities departments have dealt with this for centuries, and dialogue with them may illuminate some solutions to this problem. Asking students to complete frequent but short assignments rather than occasional long ones may make grading easier, and also encourages students to think about ethical issues on a more regular basis.

Institutional hurdles can hinder a university’s ability to satisfactorily address questions of ethics in data science. A dearth of technical faculty may make it difficult to offer a standalone course on ethics. A smaller faculty may push a university towards incorporating ethics into existing CS courses rather than creating a new class. Even this, however, requires that professors have the time and knowledge to do so, which is not always the case.

The next blog post will enumerate topics discussed and assignments used in courses that discuss ethics in data science.

Thanks to Karen Levy and Kathy Pham for their edits on a draft of this post.

Routing Attacks on Internet Services

by Yixin Sun, Annie Edmundson, Henry Birge-Lee, Jennifer Rexford, and Prateek Mittal

[In this post, we discuss a recent thread of research that highlights the insecurity of Internet services due to the underlying insecurity of Internet routing. We hope that this thread facilitates important dialog in the networking, security, and Internet policy communities to drive change and adoption of secure mechanisms for Internet routing.]

The underlying infrastructure of the Internet comprises physical connections between more than 60,000 entities known as Autonomous Systems (such as AT&T and Verizon). Internet routing protocols such as the Border Gateway Protocol (BGP) govern how our communications are routed over a series of autonomous systems to form an end-to-end communication channel between a sender and receiver.

Unfortunately, Internet routing protocols were not designed with security in mind. The insecurity of BGP allows potential adversaries to manipulate how routing on the Internet occurs. For instance, see this recent real-world example of BGP attacks against Mastercard, Visa, and Symantec. The insecurity of BGP is well known, and a number of protocols have been designed to secure Internet routing. However, we are a long way from large-scale deployment of secure Internet routing protocols.

This status quo is unacceptable.

Historically, routing attacks have been viewed primarily as attacks on the availability of Internet applications. For example, an adversary can hijack Internet traffic towards a victim application server and cause unavailability (see YouTube’s 2008 hijack). A secondary perspective is that of confidentiality of unencrypted Internet communications. For example, an adversary can manipulate Internet routing to position itself on the communication path between a client and the application server and record unencrypted traffic: http://dyn.com/blog/mitm-internet-hijacking/

In this post, we argue that conventional wisdom significantly underestimates the vulnerabilities introduced by the insecurity of Internet routing. In particular, we discuss recent research results that exploit BGP insecurity to attack the Tor network, TLS encryption, and the Bitcoin network.

BGP attacks on anonymity systems/Tor: The Tor network is a deployed system for anonymous communication that aims to protect user identity (IP address) in online communications. The Tor network comprises over 7,000 relays, which together carry terabytes of traffic every day. Tor serves millions of users, including political dissidents, whistle-blowers, law-enforcement, intelligence agencies, journalists, businesses and ordinary citizens concerned about the privacy of their online communications.

Tor clients redirect their communications via a series of proxies for anonymous communication. Layered encryption is used such that each proxy only observes the identity of the previous hop and the next hop in the communication, and no single proxy observes the identities of both the client and the destination.

However, if an adversary can observe the traffic from the client to the Tor network, and from the Tor network to the destination, then it can leverage correlation between packet timing and sizes to infer the network identities of clients and servers (end-to-end timing analysis). Therefore, an adversary can first use BGP attacks to hijack or intercept Internet traffic towards the Tor network (Tor relays), and perform traffic analysis of encrypted communications to compromise user anonymity.

It is important to note that this timing analysis works even if the communication is encrypted. This illustrates an important point — the insecurity of Internet routing has important consequences for traffic-analysis attacks, which allow adversaries to infer sensitive information from communication meta-data (such as source IP, destination IP, packet size and packet timing), even if communication is encrypted.
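To make the idea of end-to-end timing analysis concrete, here is a minimal sketch. It is not the method from the RAPTOR paper: it simply bins packet timestamps into fixed windows and computes a Pearson correlation between the traffic seen on the client side and the traffic seen on the destination side. The traffic model (random packet gaps, a fixed delay plus small jitter through the network) and all function names are assumptions made for illustration, and the sketch uses packet timing only, not packet sizes.

```python
import math
import random


def packet_times(rng, duration, rate):
    """Synthetic packet timestamps (seconds) with exponential gaps (assumed model)."""
    times, t = [], rng.expovariate(rate)
    while t < duration:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def bin_counts(times, bin_width, n_bins):
    """Count packets per time window; an observer needs no payload access for this."""
    counts = [0] * n_bins
    for t in times:
        i = int(t / bin_width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def run_demo(seed=7, duration=120.0, rate=20.0, bin_width=0.5):
    """Return (correlation for the matching flow, correlation for an unrelated flow)."""
    rng = random.Random(seed)
    n_bins = int(duration / bin_width)
    client_side = packet_times(rng, duration, rate)
    # The same flow as seen leaving the anonymity network: delayed and jittered
    # (the 0.1 s delay and 0.02 s jitter are illustrative assumptions).
    exit_side = [t + 0.1 + rng.uniform(0.0, 0.02) for t in client_side]
    # A different user's flow with the same average rate.
    other_flow = packet_times(rng, duration, rate)
    observed = bin_counts(client_side, bin_width, n_bins)
    matched = pearson(observed, bin_counts(exit_side, bin_width, n_bins))
    unrelated = pearson(observed, bin_counts(other_flow, bin_width, n_bins))
    return matched, unrelated


if __name__ == "__main__":
    matched, unrelated = run_demo()
    print(f"matching flow: {matched:.2f}, unrelated flow: {unrelated:.2f}")
```

In this toy setting the matching flow should score far higher than the unrelated one, even though the observer never looks at packet contents. Real attacks have to cope with many concurrent flows and noisier timing, which this sketch does not model.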

We introduced the threat of “Routing Attacks on Privacy in Tor” (RAPTOR attacks) at USENIX Security in 2015. We demonstrated the feasibility of RAPTOR attacks on the Tor network by performing real-world Internet routing manipulation in a controlled and ethical manner.  Interested readers can see the technical paper and our project webpage for more details.

Routing attacks challenge conventional beliefs about security of anonymity systems, and also have broad applicability to low-latency anonymous communication (including systems beyond Tor, such as I2P). Our work also motivates the design of anonymity systems that successfully resist the threat of Internet routing manipulation. The Tor project is already implementing design changes (such as Tor proposal 247 and Tor proposal 271) that make it harder for an adversary to infer and manipulate the client’s entry point (proxy) into the Tor network. Our follow-up work on Counter-RAPTOR defenses (presented at the IEEE Security and Privacy Symposium in 2017) presents a monitoring framework to analyze routing updates for the Tor network, which is being integrated into the Tor metrics portal.

BGP attacks on TLS/Digital Certificates: The Transport Layer Security (TLS) protocol allows a client to establish a secure communication channel with a destination website using cryptographic key exchange protocols. To prevent man-in-the-middle attacks, clients using the TLS protocol need to authenticate the public key corresponding to the destination site, such as a web-server. Digital certificates issued by trusted Certificate Authorities (such as Let’s Encrypt) provide an authentic binding between a destination server and its public key, allowing a client to validate the destination server. Given the widespread use of TLS for secure Internet communications, the security of the digital certificate ecosystem is paramount.

We have shown that the process for obtaining digital certificates from trusted certificate authorities (called domain validation) is vulnerable to attack.

A domain owner can submit a Certificate Signing Request (CSR) to a trusted Certificate Authority to obtain a digital certificate. The Certificate Authority must verify that the party submitting the request actually has control over the domains that are covered by that CSR. This process is known as domain control verification and is a core part of the Public Key Infrastructure (PKI) used in the TLS protocol.

In our ongoing work, presented at the HotPETS workshop in 2017, we demonstrated the feasibility of exploiting BGP attacks to compromise the domain validation protocol. For example, HTTP domain verification is a common method of domain control verification that requires the domain owner to upload a string specified by the CA to a specific HTTP URL at the domain. The CA can then verify the domain via an HTTP GET request. However, an adversary can manipulate inter-domain routing via BGP attacks to intercept all traffic towards the victim web-server, and successfully obtain a fraudulent digital certificate by spoofing an HTTP response corresponding to the CA challenge message. We have performed real-world Internet routing manipulation in a controlled and ethical manner to demonstrate the feasibility of these attacks. See our attack demonstration video for a demo.

This attack has significant consequences for privacy of our online communications, as adversaries can bypass cryptographic protection offered by encryption using fraudulently obtained digital certificates. Our work is leading to deployment of suggested countermeasures (verification from multiple vantage points) at Let’s Encrypt. Please see the Let’s Encrypt deployment for more details.
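The intuition behind verification from multiple vantage points can be sketched in a few lines. This is a toy model and not Let’s Encrypt’s implementation: the vantage point names, the routing table, and the quorum rule are all hypothetical. The key assumption is that a BGP attack affects only some parts of the Internet, so some vantage points still reach the legitimate server, which does not serve the adversary’s challenge string.

```python
LEGIT = "legitimate-server"
ADVERSARY = "adversary-server"


def validate(token, vantage_points, route_to, content_at, quorum):
    """Toy HTTP domain validation.

    Each vantage point issues a (simulated) HTTP GET for the challenge URL.
    `route_to` says which server actually answers that vantage point, and
    `content_at` says what each server returns at the challenge URL.
    Validation passes if at least `quorum` vantage points see `token`.
    """
    successes = 0
    for vp in vantage_points:
        server = route_to[vp]
        if content_at.get(server) == token:
            successes += 1
    return successes >= quorum


# The adversary requested a certificate, so it knows the CA's challenge
# string and serves it; the legitimate server does not.
token = "ca-challenge-string"
content_at = {ADVERSARY: token, LEGIT: None}

# Hypothetical routing state during a localized hijack: only traffic from
# "vp-a" is diverted to the adversary.
route_to = {"vp-a": ADVERSARY, "vp-b": LEGIT, "vp-c": LEGIT}

# A CA validating from the single hijacked vantage point is fooled.
single = validate(token, ["vp-a"], route_to, content_at, quorum=1)

# A CA requiring agreement from all three vantage points is not.
multi = validate(token, ["vp-a", "vp-b", "vp-c"], route_to, content_at, quorum=3)

if __name__ == "__main__":
    print("single vantage point issues certificate:", single)
    print("multiple vantage points issue certificate:", multi)
```

Under these assumptions the single-vantage check succeeds for the adversary while the three-vantage check fails. An attack that diverts traffic from every vantage point would still succeed in this model, so the defense raises the bar rather than eliminating the problem.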

So far, we have discussed our research results from Princeton University. Below, we briefly discuss research from Laurent Vanbever’s group at ETHZ and Sharon Goldberg’s group at Boston University, which has shown that it is possible to use inter-domain routing manipulation for attacking Bitcoin and for bypassing legal protections.

BGP attacks on Crypto-currencies/Bitcoin: BGP manipulation can be used to perform two main types of attacks on crypto-currencies such as Bitcoin: (1) partitioning attacks, in which an adversary aims to disconnect a set of victim Bitcoin nodes from the network, or (2) delaying attacks, in which an adversary can slow down the propagation of data towards victim Bitcoin nodes. Both of these attacks result in potential economic loss to Bitcoin nodes.
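A partitioning attack can be pictured as cutting every connection that crosses between the victim nodes and the rest of the peer-to-peer network. The sketch below uses a made-up six-node topology and assumes the adversary’s routing manipulation lets it drop every such crossing connection; it is an illustration of the concept, not the analysis from the research described above.

```python
from collections import deque


def crossing_edges(graph, victims):
    """Connections with exactly one endpoint in the victim set."""
    return {
        frozenset((a, b))
        for a in victims
        for b in graph[a]
        if b not in victims
    }


def reachable(graph, start, dropped=frozenset()):
    """Nodes reachable from `start` by breadth-first search, skipping dropped links."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for peer in graph[node]:
            if frozenset((node, peer)) in dropped or peer in seen:
                continue
            seen.add(peer)
            queue.append(peer)
    return seen


# Hypothetical peer graph (symmetric adjacency lists).
graph = {
    "n1": ["n2", "n3"],
    "n2": ["n1", "n4"],
    "n3": ["n1", "n4", "n5"],
    "n4": ["n2", "n3", "n6"],
    "n5": ["n3", "n6"],
    "n6": ["n4", "n5"],
}
victims = {"n1", "n2"}

before = reachable(graph, "n1")
after = reachable(graph, "n1", dropped=crossing_edges(graph, victims))

if __name__ == "__main__":
    print("reachable before attack:", sorted(before))
    print("reachable after attack: ", sorted(after))
```

Before the attack, n1 can reach the whole network; once the crossing connections are dropped, it can reach only the other victim. A delaying attack would keep these links up but slow the data flowing over them, which this sketch does not model.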

BGP attacks for bypassing legal protections: Domestic communications between US citizens have legal protections against surveillance. However, adversaries can manipulate inter-domain routing such that the actual communication path involves a foreign country, which could invalidate the legal protections and allow large-scale surveillance of online communications.

Concluding Thoughts: The emergence of routing attacks on anonymity systems, Internet domain validation, and cryptocurrencies shows that conventional wisdom has significantly underestimated the attack surface introduced by the insecurity of Internet routing. It is imperative for critical Internet applications to be aware of the insecurity of Internet routing, and analyze the resulting security threats.

Given the vulnerabilities in Internet routing, applications should consider domain-specific defense mechanisms for enhancing user security and privacy. Examples include our Counter-RAPTOR analytics for Tor and our multiple vantage point defense for domain validation. We hope that our work, and the research discussed above, is an enabler for this vision.

While it is important to design and deploy application-specific defenses for protecting our systems against routing attacks that exploit current insecure Internet infrastructure, it is even more important to rethink the status quo of insecure routing protocols. Our ultimate goal ought to be to fundamentally eliminate the insecurity in today’s Internet routing protocols by moving towards the adoption of secure countermeasures. How do we drive this change?