March 26, 2019

Do Mobile News Alerts Undermine Media’s Role in Democracy? Madelyn Sanfilippo at CITP

Why do different people sometimes get different articles about the same event, sometimes from the same news provider? What might that mean for democracy?

Speaking at CITP today is Dr. Madelyn Rose Sanfilippo, a postdoctoral research associate here at CITP. Madelyn empirically studies the governance of sociotechnical systems, as well as outcomes, inequality, and consequences within these systems, through mixed-methods research design.

Today, Madelyn tells us about a large-scale project with Yafit Lev-Aretz to examine how push notifications and the personalized distribution and consumption of news might influence readers and democracy. The project is funded by the Tow Center for Digital Journalism at Columbia University and the Knight Foundation.

Why Do Push Notifications Matter for Democracy?

Americans’ trust in media has been diverging in recent years, even as society worries about the risks to democracy from echo chambers. Madelyn also tells us about changes in how Americans get their news.

Push notifications are one of those changes: news organizations send alerts to people’s computers and mobile phones about news they think is important. And we get a lot of them. In 2017, Tow Center researcher Pete Brown found that people get almost one push notification per minute on their phones, interrupting us with news.

In 2017, 85% of Americans were getting news via their mobile devices, and while it’s not clear how much of that came from push notifications, mobile phones tend to come with news apps that have push notifications enabled by default.

When Madelyn and Yafit started to analyze push notifications, they noticed something fascinating: the same publisher often pushes different headlines to different platforms. They also found that news publishers use less objective, more subjective and emotional language in those notifications.

Madelyn and Yafit especially wanted to know whether media outlets covered breaking news differently based on the political affiliation of their readers. Comparing notifications about disasters, gun violence, and terrorism, they found differences in the number of push notifications sent by more and less politically affiliated publishers. They also found differences in the machine-coded subjectivity and objectivity of how these publishers covered those stories.

Composite subjectivity of different sources (higher is more subjective)
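
As a rough illustration of what “machine-coded” subjectivity looks like (the talk doesn’t name the specific tool the researchers used), here is a minimal sketch using the open-source TextBlob library, whose sentiment analyzer reports a subjectivity score between 0 and 1:

```python
# A minimal sketch, assuming the open-source TextBlob library; the talk does not
# say which tool the researchers actually used for machine-coded subjectivity.
from textblob import TextBlob

notifications = [
    "Officials confirm the storm made landfall at 6 a.m. local time.",
    "This outrageous, shameful decision will leave you furious.",
]

for text in notifications:
    # TextBlob reports subjectivity on a 0.0 (objective) to 1.0 (subjective) scale.
    score = TextBlob(text).sentiment.subjectivity
    print(f"{score:.2f}  {text}")
```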

Do Push Notifications Create Political Filter Bubbles?

Finally, Madelyn and Yafit wanted to know if the personalization of push notifications shaped what people might be aware of. First, Madelyn explains to us that personalization takes multiple forms:

  • Curation: sometimes which articles we see is curated by personalized algorithms (as with Google News)
  • Content personalization: sometimes the content itself is personalized, so two people see very different text even though they’re reading the same article

Together, they found that location-based personalization is common. Madelyn tells us about three different notifications that NBC News sent to people the morning after the Democratic primary. Not only did national audiences get different notifications, but different cities received notifications that mentioned Democratic and Republican candidates differently. Beyond election coverage, Madelyn and her colleagues found that sports news is often location-personalized.

Behavioral Personalization

Madelyn tells us that many news publishers also personalize news articles based on information about their readers, including their reading behavior and surveys. They found that some news publishers personalize messages based on what they estimate to be a person’s reading level. They also found evidence that publishers tailor news based on personal information that readers never provided to the publisher.
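
The talk doesn’t say how publishers estimate reading level, but readability formulas such as Flesch-Kincaid are one common approach; the sketch below uses the open-source textstat library purely as an illustration, not as the publishers’ actual method:

```python
# Illustrative only: readability formulas are one way to "machine-code" reading
# level; the talk does not say which method publishers actually use.
import textstat

simple = "The storm hit the coast. Many homes lost power."
complex_ = ("Meteorologists attributed the storm's rapid intensification to "
            "anomalously warm sea-surface temperatures across the basin.")

for text in (simple, complex_):
    # Flesch-Kincaid grade approximates the U.S. school grade needed to read the text.
    print(round(textstat.flesch_kincaid_grade(text), 1), "->", text[:45], "...")
```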

Governing News Personalization

How can we ensure that news publishers are serving democracy in the decisions they make and the knowledge they contribute to society? At many publishers, decisions about how news personalization is structured are made by the business side of the organization.

Madelyn tells us about future research she hopes to do. She’s looking at the means available to news readers to manage these notifications as well as policy avenues for governing news personalization.

Madelyn also thanks her funders for supporting this collaboration with Yafit Lev-Aretz: the Knight Foundation and the Tow Center for Digital Journalism.

Disaster Information Flows: A Privacy Disaster?

By Madelyn R. Sanfilippo and Yan Shvartzshnaider

Last week, the test of the Presidential Alert system, which many objected to on partisan grounds, brought the Wireless Emergency Alert system (WEA) into renewed public scrutiny. WEA, which distributes mobile push notifications about emergencies, crises, natural disasters, and AMBER Alerts based on geographic relevance, became operational in 2012 through a public-private partnership between the FCC, FEMA, and various telecommunications companies. All customers of participating wireless providers are automatically enrolled, though it is possible to opt out of all but Presidential Alerts.
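
WEA messages themselves are delivered over cell broadcast rather than through an API the public can query, but much of the underlying alert data is openly available. As a minimal sketch, the National Weather Service publishes geographically targeted CAP alerts through its public API; the endpoint and field names below belong to api.weather.gov, not to WEA itself:

```python
# A sketch of pulling geographically targeted alert data from the National Weather
# Service's public API. WEA itself is delivered by cell broadcast, not a public API.
import requests

resp = requests.get(
    "https://api.weather.gov/alerts/active",
    params={"area": "HI"},  # two-letter state or territory code
    headers={"User-Agent": "example-contact@example.com"},  # NWS asks for contact info in the UA
    timeout=10,
)
resp.raise_for_status()

for feature in resp.json().get("features", []):
    props = feature["properties"]
    print(props.get("severity"), "-", props.get("headline"))
```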

Presidential Alerts were just one of a set of updates designed to address recent events that have connected this trusted communication channel to fears about fake news and misinformation, such as the January 2018 false alarm in which a ballistic missile warning was mistakenly sent to the state of Hawaii as a mobile emergency alert. The resulting chaos and outrage led the FCC to revise protocols for tests of the system, distribution, and emergency alert formats, among other improvements.

In updating WEA, three priorities are addressed: (1) routine “live code testing” to ensure the system functions and to minimize confusion; (2) incorporating additional and local participants into new and existing official channels; and (3) preventing misinformation and false alarms through authentication, a unified format, and overriding opt-out preferences when distributing Presidential Alerts. The objective is to provide trustworthy information during crises. Yet the specific changes have triggered concerns that allowing partisan officials to send alerts and mimicking format conventions like character limits undermine the stated objectives by facilitating imitation for disinformation rather than engendering confidence in official alerts, as stated in a legal complaint about Presidential Alerts.

With the increased scrutiny around these changes, additional concerns about privacy and surveillance in disaster information communication practices have arisen. WEA structures information flows from multiple federal agencies, along with agency-specific apps, based on aggregated personally identifiable information, including geo-location. These flows are governed by privacy regulations, including the Privacy Act of 1974, and by policies that focus on protecting against accidental or malicious disclosure of Personally Identifiable Information (PII) and Sensitive Personally Identifiable Information (SPII). Policies and regulations enumerate a list of trusted partners with which the data may be shared and from whom it may be gathered during emergencies.

The specific types of information that can be gathered about individuals by FEMA, despite the diversity of sources and contexts involved, are precisely defined, such as: name; social media account information; address or geo-location; job title; phone numbers, email addresses, or other contact information; date and time of post; and additional relevant details, including an individual’s physical condition.

Furthermore, the governance of information-sharing policies is less precise with regard to flows than with regard to types of information. This is particularly important because expectations change drastically when disaster hits. Everyday information flows are governed by established norms within a particular context, yet disasters change our priorities and norms as our survival instincts kick in. For example, our norms can oscillate between two extremes: in daily life, we do not want to be tracked, but during disasters many of us feel comfortable broadcasting our location and even medical condition to everyone in the area in order to be found and survive. Previous research has shown that users tend to be more willing to share information they normally wouldn’t with emergency services and other agencies involved in recovery.

While governance restricts disclosure of personally identifiable information without users’ explicit consent, disclosures are exempt from explicit consent if they fall under a “Routine Use” such as “Disaster Missions.” The “Routine Use” exclusion has broad implications, given its broad and permissive definitions, including: allowing “information sharing with external partners to allow them to provide benefits and services” (Routine Use H); allowing “FEMA to share information with external partners so FEMA can learn what our external partners have already provided to disaster survivors,” as well as disclosing “applicant information to a 3rd party” in order “To prevent a duplication of benefits” (Routine Use I); and requiring third parties to disclose personal information to FEMA about the assistance they have provided.
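
As a purely hypothetical sketch of the logic described above (the field and category names are illustrative, not FEMA’s actual schema), the consent-or-routine-use decision might be modeled like this:

```python
# Hypothetical sketch of the disclosure logic described above: PII may be shared
# without explicit consent only when the purpose falls under an enumerated
# "Routine Use." Field and category names are illustrative, not FEMA's schema.
from dataclasses import dataclass

ROUTINE_USES = {
    "provide_benefits_and_services",    # in the spirit of Routine Use H
    "prevent_duplication_of_benefits",  # in the spirit of Routine Use I
}

@dataclass
class SurvivorRecord:
    name: str
    contact: str
    geo_location: str
    physical_condition: str
    consented_to_sharing: bool = False

def may_disclose(record: SurvivorRecord, purpose: str) -> bool:
    """Disclosure is allowed with explicit consent or under a routine-use exemption."""
    return record.consented_to_sharing or purpose in ROUTINE_USES

record = SurvivorRecord("A. Survivor", "555-0100", "21.31,-157.86", "stable")
print(may_disclose(record, "provide_benefits_and_services"))  # True: routine use, no consent needed
print(may_disclose(record, "marketing"))                      # False: would require explicit consent
```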

The advent of the web, along with the popularity of social media, presents a unique opportunity for agencies like FEMA as they attempt to leverage new technologies and available user information in preparation and recovery efforts. Increasingly, emergency agencies rely on disaster information flows to and from various opt-in apps during crises, including Nextdoor, which allows calls for help when 911 is down; Life360, which is helpful in tracking evacuations; and apps from the Red Cross.

Additional categories of supplementary third party services and applications include:

Social networks: FEMA uses public data available on social media to support its operations. Twitter, Google, and Facebook are also investing further resources in disaster-specific features for users and emergency services. Apple and Google have likewise promoted various other emergency and disaster-response mobile apps during this ongoing hurricane season.

3rd party applications: Numerous and diverse third parties operate in this increasingly sociotechnical domain of FEMA partnerships around disaster communication, response, and recovery. Red Cross apps provide one of the most popular supplements to WEA notifications and FEMA’s apps, sharing critical response data with other emergency response organizations and agencies. Ostensibly this standardizes critical information flows between stakeholders. However, it highlights individual users’ privacy concessions and challenges the regulatory schema on the books, particularly given that users of many of these emergency apps who opt in to self-reporting are then tracked persistently until they opt out or uninstall, rather than until the end of the emergency (a contrast sketched in the code after this list).

IoT devices and drones: Increasingly, drones and IoT devices, working in concert with third-party applications, are being deployed to monitor disasters and complement FEMA’s and other agencies’ service in the field. The information flows between the stakeholders involved might not always align with users’ expectations.
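
Here is a small, hypothetical sketch of the contrast noted above between emergency-scoped and open-ended opt-ins; the class and field names are ours, not any particular app’s:

```python
# Hypothetical sketch: an opt-in scoped to the emergency expires on its own, while
# an open-ended opt-in keeps tracking the user until they opt out or uninstall.
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class LocationOptIn:
    user_id: str
    granted_at: datetime
    emergency_ends_at: Optional[datetime] = None  # None models "until opt-out"

    def is_active(self, now: datetime) -> bool:
        # Emergency-scoped consent lapses automatically; open-ended consent never does.
        return self.emergency_ends_at is None or now < self.emergency_ends_at

now = datetime.utcnow()
scoped = LocationOptIn("u1", now, emergency_ends_at=now + timedelta(days=3))
persistent = LocationOptIn("u2", now)

print(scoped.is_active(now + timedelta(days=7)))        # False: lapsed with the emergency
print(persistent.is_active(now + timedelta(days=365)))  # True: still tracked a year later
```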

In order to better balance pressing public safety concerns with long-term consequences, we need to understand how information flows around disasters work in practice. The following questions, structured through the contextual integrity framework, will be considered in our future work:

What do disaster information flows look like in practice? There are many diverse official and third party channels. Despite good intentions, few have thoroughly considered whether the information flows they facilitate conform to users’ privacy expectations, or if not, whether they might lead to a privacy disaster, pun intended. This is especially critical in crisis situations, during which safety concerns tend to overshadow individuals’ privacy preferences.

How do rules-in-use about information flows between stakeholders compare to governance on the books? There are loopholes in the requirements for partners of agencies like FEMA to fully disclose the information they communicate around disasters, including the PII and SPII used to personalize communications. Despite the restrictions imposed on gathering personal information and on routine uses, it is important to ask how broadly permissive social acceptance of reduced privacy under crisis conditions might conflict with actual understanding of information flows in practice.

Where do we store information, and for how long? The temporal aspects of privacy and the persistent location monitoring associated with emergency channels raise real questions about perceptions of appropriate information flows around disasters and emergencies.
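
For readers unfamiliar with contextual integrity, the framework describes an information flow by its sender, recipient, data subject, information type, and transmission principle, and asks whether that flow matches the norms of the context. A minimal sketch, with illustrative rather than authoritative norms, looks like this:

```python
# A minimal sketch of the contextual integrity framing: an information flow is a
# tuple of sender, recipient, data subject, information type, and transmission
# principle, checked against the norms of the context. The norms listed here are
# illustrative examples, not an authoritative inventory.
from dataclasses import dataclass

@dataclass(frozen=True)
class Flow:
    sender: str
    recipient: str
    subject: str
    info_type: str
    transmission_principle: str

DISASTER_NORMS = {
    Flow("survivor", "FEMA", "survivor", "geo-location", "with consent, for rescue"),
    Flow("FEMA", "trusted_partner", "survivor", "contact_info", "routine use: benefits"),
}

def conforms(flow: Flow) -> bool:
    return flow in DISASTER_NORMS

questionable = Flow("emergency_app", "advertiser", "survivor",
                    "geo-location", "persistent tracking after the emergency")
print(conforms(questionable))  # False: violates the (sketched) norms of the disaster context
```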

Building Respectful Products using Crypto: Lea Kissner at CITP

How can we build respect into products and systems? What role does cryptography play in respectful design?

Speaking today at CITP is Lea Kissner (@LeaKissner), global lead of Privacy Technology at Google. Lea has spent the last 11 years designing and building security and privacy for Google projects from the grittiest layers of infrastructure to the shiniest user features — and cleaning up when something goes awry. She earned a Ph.D. in cryptography at Carnegie Mellon and a B.S. in CS from UC Berkeley.

As head of privacy at Google, Lea crafts privacy reviews, defines what privacy means at Google, and leads a team that supports privacy across Google. Her team also creates tools and infrastructure that manage privacy across the company. If you’ve reviewed your privacy settings on Google, deleted your data, or shared any information with Google, Lea and her team have shaped your experience.

How does Lea think about privacy? When working to build products that respect users, Lea reminds us that it’s important for people to feel safe. This is a full-stack problem, all the way from humans and societies down to the level of hardware. Since societies vary widely, people have very different expectations around privacy and security, and not always in the ways you would anticipate. Lea talks about many assumptions that don’t hold globally: not all languages have a word for privacy, people don’t always have control over their physical devices, and they often operate in settings of conflict.

Lea next talks about the case of online harassment. She describes hate speech as a distributed denial of service attack: a way for attackers to suppress speech they don’t like. Many platforms enable this kind of harassment by allowing anyone to send messages to anyone, which makes mass harassment possible. Sometimes it’s possible for platforms to develop policies to manage these problems, but platforms are often unable to intervene in cases of conflicting values.

Lea tells us about one project she worked on during the Arab uprisings. When people’s faces appeared in videos of protests, those people sometimes faced substantial risks when videos became widely viewed. Lea’s team worked with YouTube to implement software that allowed content creators to blur the faces of people appearing in videos.

Next, Lea describes the ways that her team links research with practical benefits to people. Her team’s ethnographers study differences in situations and norms. These observations shape how her team designs systems. As they create more systems, they then create design patterns, then do user testing on those patterns. Research with humans is important at both ends of the work: when understanding the meaning and nature of the challenges, and when testing systems.

Finally, Lea argues that we need to make privacy and security easy for people to do. Right now, cryptography processes are hard for people to use, and hard for people to implement. Her team focuses on creating systems to minimize the number of things that humans need to do in order to stay secure.

How Cryptography Projects can Fail

Lea next tells us about common failures in privacy and security.

The first way to fail is to create your own cryptography system. That’s a dangerous thing to do, says Lea. Why do people do this? Some think they’re smart and know just enough to be dangerous. Some think it’s cool to roll their own. Some don’t understand how cryptography works. Sometimes it seems too expensive (in terms of computation and network) for them to use a third-party system. To make good crypto easier, Lea’s team has created Tink, a multi-language, cross-platform library that provides cryptographic APIs that are secure, easy to use correctly, and hard(er) to misuse.
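
For a sense of what Tink looks like in practice, here is a short sketch of authenticated encryption using its Python bindings; exact module and template names vary across Tink versions, so treat this as illustrative rather than canonical:

```python
# A sketch of authenticated encryption with Tink's Python bindings; API details
# vary across Tink versions.
import tink
from tink import aead

aead.register()

# Generate a keyset from a vetted template rather than wiring up primitives by hand.
keyset_handle = tink.new_keyset_handle(aead.aead_key_templates.AES256_GCM)
primitive = keyset_handle.primitive(aead.Aead)

ciphertext = primitive.encrypt(b"attack at dawn", b"associated-data")
assert primitive.decrypt(ciphertext, b"associated-data") == b"attack at dawn"
```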

Lea urges us, “Do me a solid. Don’t give people excuses to roll their own crypto.”

Another area where people fail is in privacy-preserving computation. Lea tells us the story of a feature within Google where people wanted to send messages to someone whose phone number they have. Simple, right? Lea unpacks how complex such features can be, how easy it is to enable privacy breaches, and how expensive it can be to offer privacy. She describes a system that stores a large number of phone numbers associated with user IDs. By storing information with encrypted user IDs, it’s possible to enable people to manage their privacy. When Lea’s team estimated the impact of this privacy feature, they realized that it would require more than all of Google’s total computational power. They’re still working on that one.
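
The talk doesn’t spell out Google’s actual design, but one common pattern for this kind of lookup is to avoid storing raw user IDs next to phone numbers, keying records instead to a keyed hash (HMAC) of the user ID under a secret the service controls. A hedged sketch of that idea:

```python
# Illustrative only: the talk does not spell out Google's design. One common
# pattern is to key phone-number records to an HMAC of the user ID under a
# service-held secret, so the table alone cannot be joined back to user IDs.
import hashlib
import hmac
import secrets

SERVICE_KEY = secrets.token_bytes(32)  # held by the service, not stored with the table

def pseudonymize(user_id: str) -> str:
    return hmac.new(SERVICE_KEY, user_id.encode(), hashlib.sha256).hexdigest()

# phone number -> pseudonymous ID
lookup_table = {"+1-609-555-0100": pseudonymize("user-42")}

# Resolving a number back to a user still requires the service key, which is where
# access control (and, per the talk, much of the real cost) ends up living.
print(lookup_table["+1-609-555-0100"] == pseudonymize("user-42"))  # True
```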

Privacy is easier to implement in structured analysis of databases such as advertising metrics, says Lea. Google has had more success adopting privacy practices in areas like advertising dashboards that don’t involve real-time user experiences.

Hardware failures are a major source of privacy and security failures. Lea tells us about the squirrels and sharks that have contributed to Amazon and Yahoo data failures by nibbling on cables. She then talks to us about sources of failures from software errors, as well as key errors. Lea tells us about Google’s Key Management Server, which knows about data objects and the keys that pertain to those objects. Keys in this service need to be accessed quickly and globally.

How do generalized key management servers fail? First, encrypted data compresses poorly. If a million people send each other the same image, a typical storage system can compress it efficiently, storing it only once. An encrypted storage system has to encrypt and store each image individually. Second, people who store information often like to index and search for information. Exact matches are easy, but if you need to retrieve a range of things from a period of time, you need an index, and to create an index, the software needs to know what’s inside the encrypted data. Sharding, backing up, and caching data is also very difficult when information is encrypted.
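
The compression point is easy to demonstrate. In the sketch below, random bytes stand in for ciphertext (real AEAD output is similarly incompressible), so the numbers are illustrative rather than a benchmark of any real system:

```python
# A small demonstration of the first failure mode above: identical plaintexts
# compress (and deduplicate) well, while each encrypted copy looks like unique
# random data. Random bytes stand in for ciphertext here.
import os
import zlib

image = b"\x89PNG" + b"cat picture bytes " * 4096        # stand-in for one shared image
plain_copies = image * 100                               # 100 users store the same image
cipher_copies = b"".join(os.urandom(len(image)) for _ in range(100))  # 100 distinct "ciphertexts"

print("plaintext copies compress to:", len(zlib.compress(plain_copies)), "bytes")
print("encrypted copies compress to:", len(zlib.compress(cipher_copies)), "bytes")
```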

Next, Lea tells us about the problem of key rotation. People need to be able to change their keys in any usable encryption system. When rotating keys, you need to decrypt every single object with the old key and re-encrypt it with a new one, and you can’t shut down an entire service while the re-encryption happens. Within a large organization like Google, key rotation should be regular, but it needs to be coordinated across a large number of people. Lea’s team tried something like this, but it ended up being too complex for the company’s needs. They then moved key management to the storage level, where keys can be managed and rotated independently of individual software teams.
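
To make the rotate-every-object loop concrete, here is a sketch using MultiFernet from the Python `cryptography` package; this is not Google’s key-management system, just an illustration of why rotation has to touch every stored object without taking the service down:

```python
# A sketch of the rotate-every-object loop described above, using MultiFernet from
# the Python `cryptography` package. Purely illustrative, not Google's system.
from cryptography.fernet import Fernet, MultiFernet

old_key = Fernet(Fernet.generate_key())
new_key = Fernet(Fernet.generate_key())

store = {
    "obj1": old_key.encrypt(b"user data 1"),
    "obj2": old_key.encrypt(b"user data 2"),
}

# New key listed first: rotate() decrypts with any known key and re-encrypts with it.
rotator = MultiFernet([new_key, old_key])
for name, token in store.items():
    store[name] = rotator.rotate(token)  # every object must be visited, without downtime

assert new_key.decrypt(store["obj1"]) == b"user data 1"
```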

What do we learn from this? Lea tells us that cryptography is a tool for turning things into key management problems. She encourages us to avoid rolling our own cryptography, to design scalable privacy-preserving systems, to plan for key management up front, and to evaluate the success of a design in the full stack, working from humans all the way down to the hardware.