September 29, 2022

Building Respectful Products using Crypto: Lea Kissner at CITP

How can we build respect into products and systems? What role does cryptography play in respectful design?

Speaking today at CITP is Lea Kissner (@LeaKissner), global lead of Privacy Technology at Google. Lea has spent the last 11 years designing and building security and privacy for Google projects from the grittiest layers of infrastructure to the shiniest user features — and cleaning up when something goes awry. She earned a Ph.D. in cryptography at Carnegie Mellon and a B.S. in CS from UC Berkeley.

As head of privacy at Google, Lea crafts privacy reviews, defines what privacy means at Google, and leads a team that supports privacy across Google. Her team also creates tools and infrastructure that manage privacy across the company. If you’ve reviewed your privacy on Google, deleted your data, or shared any information with Google, Lea and her team have shaped your experience.

How does Lea think about privacy? When working to build products that respect users, Lea reminds us that it’s important for people to feel safe. This is a full-stack problem, all the way from humans and societies down to the level of hardware. Since societies vary widely, people have very different expectations around privacy and security, but not in the ways you would anticipate. Lea talks about many assumptions that don’t apply globally: not all languages have a word for privacy, people don’t always have control over their physical devices, and they often operate in settings of conflict.

Lea next talks about the case of online harassment. She describes hate speech as a distributed denial of service attack, a way for attackers to suppress speech they don’t like. Many platforms enable this kind of harassment, allowing anyone to send messages to anyone and enabling mass harassment. Sometimes it’s possible for platforms to develop policies to manage these problems, but platforms are often unable to intervene in cases of conflicting values.

Lea tells us about one project she worked on during the Arab uprisings. When people’s faces appeared in videos of protests, those people sometimes faced substantial risks when videos became widely viewed. Lea’s team worked with YouTube to implement software that allowed content creators to blur the faces of people appearing in videos.

Next, Lea describes the ways that her team links research with practical benefits to people. Her team’s ethnographers study differences in situations and norms. These observations shape how her team designs systems. As they create more systems, they then create design patterns, then do user testing on those patterns. Research with humans is important at both ends of the work: when understanding the meaning and nature of the challenges, and when testing systems.

Finally, Lea argues that we need to make privacy and security easy for people to do. Right now, cryptography processes are hard for people to use, and hard for people to implement. Her team focuses on creating systems to minimize the number of things that humans need to do in order to stay secure.

How Cryptography Projects can Fail

Lea next tells us about common failures in privacy and security.

The first way to fail is to create your own cryptography system. That’s a dangerous thing to do, says Lea. Why do people do this? Some think they’re smart and know just enough to be dangerous. Some think it’s cool to roll their own. Some don’t understand how cryptography works. Sometimes it seems too expensive (in terms of computation and network) for them to use a third-party system. To make good crypto easier, Lea’s team has created Tink, a multi-language, cross-platform library that provides cryptographic APIs that are secure, easy to use correctly, and hard(er) to misuse.

Lea urges us, “Do me a solid. Don’t give people excuses to roll their own crypto.”

Another area where people fail is in privacy-preserving computation. Lea tells us the story of a feature within Google where people wanted to send messages to someone whose phone number they have. Simple, right? Lea unpacks how complex such features can be, how easy it is to enable privacy breaches, and how expensive it can be to offer privacy. She describes a system that stores a large number of phone numbers associated with user IDs. By storing information with encrypted user IDs, it’s possible to enable people to manage their privacy. When Lea’s team estimated the impact of this privacy feature, they realized that it would require more than all of Google’s total computational power. They’re still working on that one.

Privacy is easier to implement in structured analysis of databases such as advertising metrics, says Lea. Google has had more success adopting privacy practices in areas like advertising dashboards that don’t involve real-time user experiences.

Hardware failures are a major source of privacy and security failures. Lea tells us about the squirrels and sharks that have contributed to Amazon and Yahoo data failures by nibbling on cables. She then talks to us about sources of failures from software errors, as well as key errors. Lea tells us about Google’s Key Management Server, which knows about data objects and the keys that pertain to those objects. Keys in this service need to be accessed quickly and globally.

How do generalized key management servers fail? First, encrypted data compresses poorly. If a million people send each other the same image, a typical storage system can compress it efficiently, storing it only once. An encrypted storage system has to encrypt and store each image individually. Second, people who store information often like to index and search for information. Exact matches are easy, but if you need to retrieve a range of things from a period of time, you need an index, and to create an index, the software needs to know what’s inside the encrypted data. Sharding, backing up, and caching data is also very difficult when information is encrypted.
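The compression point can be illustrated with a short Python sketch. It assumes that the output of a sound cipher is statistically close to random bytes, so os.urandom stands in for ciphertext here; no real encryption is performed, and the sample text and sizes are made up for illustration.

```python
import os
import zlib

# Highly repetitive "plaintext", such as many copies of the same content.
plaintext = b"the same image bytes, sent again and again. " * 2000

# Stand-in for ciphertext: random bytes of the same length.
# Assumption: a well-designed cipher produces output that looks random.
ciphertext_like = os.urandom(len(plaintext))

plain_ratio = len(zlib.compress(plaintext)) / len(plaintext)
cipher_ratio = len(zlib.compress(ciphertext_like)) / len(ciphertext_like)

print(f"plaintext compresses to {plain_ratio:.1%} of its original size")
print(f"ciphertext-like data compresses to {cipher_ratio:.1%} of its original size")
```

The repetitive input shrinks to a small fraction of its size, while the random stand-in does not shrink at all, which is why a storage layer that only ever sees ciphertext cannot deduplicate or compress it.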

Next, Lea tells us about the problem of key rotation. People need to be able to change their keys in any usable encryption system. When rotating keys, for every single object, you need to decrypt it using the old key and then re-encrypt it using a new key. During this process, you can’t shut down an entire service in order to re-do the encryption. Within a large organization like Google, key rotation should be regular, but it needs to be coordinated across a large number of people. Lea’s team tried something like this, but it ended up being too complex for the company’s needs. After trying this, they moved key management to the storage level, where it would be possible to manage and rotate keys independently of software teams.

What do we learn from this? Lea tells us that cryptography is a tool for turning things into key management problems. She encourages us to avoid rolling our own cryptography, to design scalable privacy-preserving systems, plan for key management up front, and evaluate the success of a design in the full stack, working from humans all the way to the hardware.

PrivaCI Challenge: Context Matters

by  Yan Shvartzshnaider and Marshini Chetty

In this post, we describe the Privacy through Contextual Integrity (PrivaCI) challenge that took place as part of the symposium on applications of contextual integrity sponsored by the Center for Information Technology Policy and the Digital Life Initiative at Princeton University. We summarize the key takeaways from the discussion that unfolded.

We welcome your feedback on any of the aspects of the challenge, as we seek to improve the challenge to serve as a pedagogical and methodological tool to elicit discussion around privacy in a systematic and structured way.

See below the Additional Material and Resources section for links to learning more about the theory of Contextual Integrity and the challenge instruction web page.

What Is the PrivaCI Challenge?

The PrivaCI challenge is designed for evaluating information technologies and discussing legitimate responses. It puts into practice the approach formulated by the theory of Contextual Integrity for providing “a rigorous, substantive account of factors determining when people will perceive new information technologies and system as threats to privacy (Nissenbaum, H., 2009).”

In the symposium, we used the challenge to discuss and evaluate recent privacy-relevant events. The challenge included 8 teams and 4 contextual scenarios. Each team was presented with a use case/context scenario, which they then discussed using the theory of CI. This way each contextual scenario was discussed by a couple of teams.

 

PrivaCI challenge at the symposium on applications of Contextual Integrity

 

To facilitate a structured discussion we asked the group to fill in the following template:

Context Scenario: The template included a brief summary of a context scenario which in our case was based on one of the four privacy news related stories with a link to the original story.

Contextual Informational Norms and privacy expectations: During the discussion, the teams had to identify the relevant contextual information norms and privacy expectations and provide examples of information flows violating these norms.

Example of flows violating the norms: We asked that each flow be broken down into the relevant CI parameters, i.e., the actors involved (senders, receivers, subjects), the attributes, and the transmission principle.

Possible solutions: Finally, the teams were asked to think of possible solutions to the problem that incorporate previous or ongoing research projects of their teammates.
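The flow breakdown in the template can be sketched as a small Python data structure. This is only an illustration: the five fields follow the CI parameters named above, but the Norm class, its matching rule, and the example values are hypothetical and are not part of the challenge materials or the CI theory itself.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InformationFlow:
    sender: str
    recipient: str
    subject: str
    attribute: str
    transmission_principle: str

@dataclass(frozen=True)
class Norm:
    """Hypothetical encoding of a contextual norm for one attribute."""
    attribute: str
    allowed_recipients: frozenset
    allowed_principles: frozenset

    def permits(self, flow):
        # A flow is appropriate only if the recipient and the
        # transmission principle both match the norm.
        return (flow.attribute == self.attribute
                and flow.recipient in self.allowed_recipients
                and flow.transmission_principle in self.allowed_principles)

# Made-up norm for the ride-share scenario.
ride_norm = Norm(
    attribute="in-car video",
    allowed_recipients=frozenset({"secure offline storage"}),
    allowed_principles=frozenset({"with notice, for safety"}),
)

safety_recording = InformationFlow(
    "driver", "secure offline storage", "passenger",
    "in-car video", "with notice, for safety")
live_stream = InformationFlow(
    "driver", "public streaming audience", "passenger",
    "in-car video", "without notice")

print(ride_norm.permits(safety_recording))  # True
print(ride_norm.permits(live_stream))       # False
```

Writing a flow this way makes it visible which parameter changed: in the streaming case, both the recipient and the transmission principle differ from what the norm allows.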

What Were The Privacy-Related Scenarios Discussed?

We briefly summarize the four case studies/privacy-related scenarios and discuss some of the takeaways here from the group discussions.

  1. St. Louis Uber driver has put a video of hundreds of his passengers online without letting them know.
    https://www.stltoday.com/news/local/metro/st-louis-uber-driver-has-put-video-of-hundreds-of/article_9060fd2f-f683-5321-8c67-ebba5559c753.html
  2. “Saint Louis University will put 2,300 Echo Dots in student residences. The school has unveiled plans to provide all 2,300 student residences on campus (both dorms and apartments).”
    https://www.engadget.com/2018/08/16/saint-louis-university-to-install-2300-echo-dots/
  3. Google tracks users’ movements even if they change their settings to prevent it. https://apnews.com/828aefab64d4411bac257a07c1af0ecb
  4. Facebook asked large U.S. banks to share financial information on their customers.
    https://www.wsj.com/articles/facebook-to-banks-give-us-your-data-well-give-you-our-users-1533564049

 

Identifying Governing Norms

Much of the discussion focused on the relevant governing norms. For some groups, identifying norms was a relatively straightforward task. For example, in the Uber driver scenario, a group listed: “We do not expect to be filmed in private (?) spaces like Uber/Lyft vehicles.” In the Facebook case, one of the groups articulated a norm as “Financial information should only be shared between financial institutions and individuals, by default, AND Facebook is a social space where personal financial information is not shared.”

Other groups could not always identify norms that were violated. For example, in the “Google tracks your movements, like it or not” scenario, one of the teams could not formulate what norms were breached. Nevertheless, they felt uncomfortable with the overall notion of being tracked. Similarly, a group analyzing the scenario where “Facebook has asked large U.S. banks to share detailed financial information about their customers” found the notion of an information flow traversing between social and financial spheres unacceptable. Nevertheless, they were not sure about the governing norms.

The discussion that unfolded also considered whether norms usually correspond to “best” practice or due diligence. Facebook might even be able to claim that everything it did was legal and that no laws were breached in the process, but this by itself does not mean there was no violation of a norm.

We emphasized the fact that norms are not always grounded in law. An information flow can still violate a norm, despite being specified in a privacy policy or even if it is considered legal, or a “best” practice. Norms are influenced by many other factors. If we feel uneasy about an information flow, it probably violates some deeper norm that we might not be consciously aware of. This requires a deeper analysis.

Norms and privacy expectations vary among members of groups and across groups

The challenge showcases that norms and privacy expectations may vary. Members within a group, and across groups, had different privacy expectations for the same context scenario. For example, in the Uber scenario, some members of the group expected drivers to film their passengers for security purposes, while others did not expect to be filmed at all. In this case, we followed the CI decision heuristic which “recommends assessing [alternative flows’] respective merits as a function of their meaning and significance in relation to the aims, purposes, and values of the context.” It was interesting to see how, by explaining the values of a “violating” information flow, it was possible to get the members of the team to consider its validity in a certain context under very specific conditions. For example, it might be acceptable for a taxi driver to record their passengers onto a secure server (without Internet access) for safety reasons.

Contextual Integrity offers a framework to capture contextual information norms

The challenge revealed additional aspects regarding the way groups approach the norm identification task. Two separate teams listed the following statements as norms: “Consistency between presentation of service and actual functioning,” and “Privacy controls actually do something.” These outline general expectations and fall under the deceptive practices provisions of the Federal Trade Commission (FTC) Act; nevertheless, these expectations are difficult to capture and assess using the CI framework because they are not articulated in terms of appropriate information flows. This might also be a limitation of the task itself: due to time limitations, the groups were asked to articulate the norms in general sentences, rather than specify them using the five CI parameters.

Norm violating information flows

Once norms were identified, the groups were asked to specify possible information flows that violate them. It was encouraging to see that most teams were able to articulate the violating information flows correctly, i.e., by specifying the parameters that correspond to the flow. A team working on Google’s location tracking scenario could pinpoint the violating information flow: Google should not generate the flow without users’ awareness or consent, i.e., the flow can happen only under specific conditions. Similar violations were identified in other scenarios, for example in the case where an Uber driver was streaming live videos of his passengers onto an internet site. Here, too, the change in transmission principle and recipient prompted a feeling of privacy violation among the group.

Finally, we asked the groups to propose possible solutions to mitigate the problem. Most of the solutions included asking users for permission, notifying them, or designing an opt-in only system. The most critical takeaway from the discussion was that norms and users’ privacy expectations evolve as new information flows are introduced, and that the merits of those flows need to be discussed in terms of the functions they serve.

Summary

The PrivaCI Challenge was a success! It served as an icebreaker for the participants to get to know each other a little better and also offered a structured way to brainstorm and discuss specific cases. The goal of the challenge exercise was to introduce a systematic way of using the CI framework to evaluate a system in a given scenario. We believe similar challenges can be used as a methodology to introduce and discuss Contextual Integrity in an educational setting or even possibly during the design stage of a product to reveal possible privacy violations.

Additional material and resources

You can access the challenge description and the template here: http://privaci.info/ci_symposium/challenge

The symposium program is available here.

To learn more about the theory of Contextual Integrity and how it differs from other existing privacy frameworks we recommend reading “Privacy in Context: Technology, Policy, and the Integrity of Social Life” by Helen Nissenbaum.

To participate in the discussion on CI, follow @privaci_way on Twitter.
Visit the website: http://privaci.info
Join the privaci_research mailing list.

References

Nissenbaum, H., 2009. Privacy in context: Technology, policy, and the integrity of social life. Stanford University Press.

 

How can we scale private, smart contracts? Ed Felten on Arbitrum

Smart contracts are powerful virtual referees for holding money and carrying out agreed-on procedures in cases of disputes, but they can’t guarantee privacy and have strict scalability limitations. How can we improve on these constraints?

Here at the Center for IT Policy, it’s the first event of our weekly Tuesday lunch series. Speaking today is Professor Ed Felten, director of CITP. Ed served at the White House as the deputy U.S. chief technology officer from June 2015 to January 2017. Ed was also the first chief technologist for the Federal Trade Commission from January 2011 until September 2012.

What is cryptocurrency? Ed describes a situation where Alice wants to share money with Bob. She digitally signs a data structure indicating that coin C should be paid to Bob’s address, and she sends it to the Bitcoin network. The systems in the network then gossip to each other that Alice wants to pay Bob.

This brings us to the blockchain. The blockchain is a data structure in which each block includes information about transactions and a link to the previous block. Each block includes a cryptographic hash of the previous block, and if anyone accepts the block, they accept the rest of the chain. When Alice creates a transaction, it will be added to a block by a bitcoin miner. This miner then tries to get their block added to the blockchain; if the miner succeeds, then Alice’s transaction is accepted and it will be deemed to have happened. That’s how Bitcoin works: it keeps track of all previous transactions, and that’s how it keeps track of currency.
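The hash-linking idea can be sketched in a few lines of Python. This is a simplified illustration rather than Bitcoin’s actual block format: the field names and the JSON encoding are assumptions, and mining, signatures, and networking are left out entirely.

```python
import hashlib
import json

def block_hash(block):
    # Hash a canonical JSON encoding of the block's contents.
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def append_block(chain, transactions):
    # Each new block records the hash of the block before it.
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "transactions": transactions})

def chain_is_valid(chain):
    # Every block must point at the hash of its predecessor.
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

chain = []
append_block(chain, ["first block"])
append_block(chain, ["Alice pays coin C to Bob"])
append_block(chain, ["Bob pays coin C to Carol"])
print(chain_is_valid(chain))  # True

# Rewriting an old transaction breaks the link stored in the next block.
chain[1]["transactions"] = ["Alice pays coin C to Mallory"]
print(chain_is_valid(chain))  # False
```

Because each block commits to the one before it, accepting the newest block means accepting the whole history behind it.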

Smart contracts are another blockchain idea, but the name is a misnomer. Here’s how it works. If Alice and Bob want to make an agreement and have a protocol for carrying it out, they write down computer code that defines the behavior of a third party. One way to do it is to have a trusted third party carry out that protocol. A smart contract creates a virtual third party, writes code describing what it should do, and then instantiates it into the blockchain system. Then, if all goes well, the contract will behave according to its code, and it will act as the third party or referee in the agreement between Alice and Bob. Ed shows this with a pile of money because one thing it can do is to receive coins, own them, and do whatever with those coins that its code has defined. The contract, in this sense, is a trusted third party expressed in code.

What can smart contracts do? One option is escrow. Maybe Alice wants to buy books and doesn’t want to pay until she receives the books, but maybe the shop will only ship after payment. This is typically what an escrow agent does. In the optimistic case, the escrow agent receives the money and transfers the money to the shop once the books have been received. Smart contracts can play the role of the escrow agent. Why set this up in code? In theory, an escrow agent defined in code will be less likely to carry out fraud.
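As a rough sketch of the optimistic case, the escrow agent can be written as a small state machine. This is plain Python, not code for any real smart contract platform; the class, its state names, and the amounts are hypothetical, and dispute handling is deliberately left out.

```python
class Escrow:
    """Toy escrow referee: holds the buyer's payment until delivery is confirmed."""

    def __init__(self, buyer, seller, price):
        self.buyer, self.seller, self.price = buyer, seller, price
        self.state = "AWAITING_PAYMENT"
        self.paid_out = {buyer: 0, seller: 0}  # funds released by the escrow

    def deposit(self, sender, amount):
        # Only the buyer may pay, and only the agreed price.
        if (self.state != "AWAITING_PAYMENT"
                or sender != self.buyer or amount != self.price):
            raise ValueError("unexpected deposit")
        self.state = "AWAITING_DELIVERY"

    def confirm_delivery(self, sender):
        # Only the buyer may confirm, and only after paying.
        if self.state != "AWAITING_DELIVERY" or sender != self.buyer:
            raise ValueError("only the buyer can confirm, after paying")
        self.paid_out[self.seller] += self.price
        self.state = "COMPLETE"

escrow = Escrow(buyer="Alice", seller="Shop", price=30)
escrow.deposit("Alice", 30)       # Alice pays the escrow, not the shop
escrow.confirm_delivery("Alice")  # the books arrived
print(escrow.state, escrow.paid_out["Shop"])  # COMPLETE 30
```

The rules for when money moves are fixed in code before either party commits, which is the sense in which the contract acts as the referee.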

Smart contracts can also support sealed-bid auctions. Ed asks us to imagine that someone is selling naming rights to a cafeteria. Everyone submits bids secretly, the “envelopes” are opened at the end of the bidding, and whoever bid the most wins. Smart contracts can give people assurances that people will carry out key actions in the process by requiring them to provide a deposit, where they know what will happen with their money.
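One common way to build the “envelopes” is a hash-based commit-and-reveal scheme; the talk summary does not say which mechanism Ed had in mind, so treat this as an assumed technique. The sketch below is plain Python with hypothetical names, and it omits the deposits that would penalize bidders who refuse to reveal.

```python
import hashlib
import secrets

def commit(bid, nonce):
    # The commitment hides the bid until the nonce is revealed.
    return hashlib.sha256(f"{bid}:{nonce}".encode()).hexdigest()

class SealedBidAuction:
    def __init__(self):
        self.commitments = {}  # bidder -> commitment hash
        self.revealed = {}     # bidder -> bid amount

    def submit(self, bidder, commitment):
        self.commitments[bidder] = commitment

    def reveal(self, bidder, bid, nonce):
        # A reveal counts only if it matches the earlier commitment.
        if self.commitments.get(bidder) != commit(bid, nonce):
            raise ValueError("reveal does not match commitment")
        self.revealed[bidder] = bid

    def winner(self):
        return max(self.revealed, key=self.revealed.get)

auction = SealedBidAuction()
alice_nonce, bob_nonce = secrets.token_hex(16), secrets.token_hex(16)
auction.submit("Alice", commit(120, alice_nonce))
auction.submit("Bob", commit(150, bob_nonce))

# After bidding closes, everyone opens their envelope.
auction.reveal("Alice", 120, alice_nonce)
auction.reveal("Bob", 150, bob_nonce)
print(auction.winner())  # Bob
```

The random nonce matters: without it, anyone could guess a bid by hashing likely amounts and comparing against the published commitment.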

The most popular smart contract system is Ethereum, in which all contract code and data is public. Every miner emulates every execution step of every contract. That is slow, expensive, and doesn’t scale, so Ethereum requires people to pay what they call “gas” in exchange for computation and storage done by a contract. The high cost to the miners of emulating these steps translates to a high cost of gas.

Contract complexity on Ethereum is capped by a “global gas limit” – defining the maximum amount of contract work that the miners are able to do. Roughly speaking, Ed says, the total computational capacity of Ethereum is less than a tenth of a laptop. These scalability limitations make many protocols impossible, and blockchain space is very limited.

Ethereum also has privacy limitations: Bitcoin scripts and Ethereum code are all public. Not everyone wants the full details of every contract to be visible to everyone. In some cases, you might want something more like a traditional business contract, where the contract terms are normally only known to the parties.

Can we scale smart contracts? That’s what the Arbitrum team was trying to do. To make clear what the team is doing, Ed describes three areas where someone could do work. Rather than focus on the consensus level, the Arbitrum team focused on scaling the smart contracts.

  • Kalodner, H., Goldfeder, S., Chen, X., Weinberg, S. M., & Felten, E. W. (2018, August). Arbitrum: scalable, private smart contracts. In Proceedings of the 27th USENIX Conference on Security Symposium (pp. 1353-1370). USENIX Association.

How can you scale smart contracts? Ed’s team worked on an off-chain protocol. The work is performed out-of-band by the transacting parties. The computation and storage are done off-chain. All of these things need to be linked back to the chain.

Ed quickly summarizes approaches that have been taken, including SNARKs, Incentivized Verification (TrueBit), and State Channels. He goes more in-depth about TrueBit. In this system of incentivized verifiers, a group of “verifiers” volunteer to check computations. They are rewarded more if they find errors. Anyone can be a verifier, and the reward is split among them. If a computation checked by a verifier is incorrect, the verifier can give an efficient proof of incorrectness.

But there’s a participation dilemma to incentivized verifiers. Imagine a game-theory situation where there are N players, who can pay 1 to participate. Imagine that a participating verifier pretends to be more than one verifier (sybils). In that situation, if you have enough people wearing different sybil masks, people are disincentivized from being a verifier. The creators of TrueBit have shown that their system is “one-shot sybil proof.” As a result, if someone claims to be two people, they get two shares of the reward, but the shares are smaller, so that it would have been more profitable to claim (honestly) to be a single party.

Verification is a repeated game; in these cases, a verifier might sacrifice something in one situation in order to gain over the long term. In their paper, Ed and his collaborators shared a game-theoretic proof showing that every one-shot sybil-proof participation game allows a situation where one verifier can bully all other players into not participating by flooding the system with fake verifiers.

The limits of other approaches show why Ed and his collaborators created Arbitrum, which uses a combination of protocol design, incentives, and a virtual machine infrastructure to carry out scalable, trustworthy smart contracts. Arbitrum starts by assuming an underlying consensus layer, which they call a “verifier.”

The Arbitrum system is built around Managers, who manage a virtual machine that carries out computation and holds data. Arbitrum provides an “any-trust” guarantee; as long as at least one manager of a VM is honest, the VM will execute correctly according to its code.

Imagine that Bob and Alice are going to play a chess competition. They create code that holds the gold medal, receives alternating moves, verifies the validity of the game, and pays the winner. Bob and Alice put the code onto a VM. Who are the managers in this situation? Alice and Bob can be the managers, and so long as they can hold each other accountable, the contract will work.

How can managers in Arbitrum cooperate to advance the state of a VM? Managers have incentives to agree unanimously about what a VM will do. If they all agree and digitally sign the assertion, the system accepts their assertions, since the system assumes that at least one manager is acting honestly. What if managers dispute the claim? A manager can make an assertion and deposit some funds. Another manager can challenge the assertion and also deposit some funds. If there’s a challenge, the system referees the dispute and takes the deposit of the manager that was lying. When a challenge happens, the asserter divides their assertion in half and the challenger must identify which half of the process was incorrect. Eventually, the dispute is narrowed from a large process into a single instruction. The system can then check the one-instruction claim to find out who’s lying.
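The bisection step can be sketched in Python as follows. This is a simplified model under several assumptions, not Arbitrum’s implementation: the one-instruction step function is a made-up stand-in for a VM, the parties exchange full states rather than hashes, and deposits are reduced to naming the loser.

```python
def step(state):
    # Hypothetical one-instruction VM: any deterministic update works here.
    return (state * 3 + 1) % 1_000_003

def make_trace(start, n, bad_step=None):
    """States claimed after 0..n steps; optionally corrupt the result of one step."""
    trace = [start]
    for k in range(1, n + 1):
        nxt = step(trace[-1])
        if k == bad_step:
            nxt += 1  # the lie: a wrong result for step k
        trace.append(nxt)
    return trace

def resolve_dispute(asserted, challenger):
    """Bisect until one step remains, then let the referee check that step."""
    lo, hi = 0, len(asserted) - 1  # both sides agree on the starting state
    rounds = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if asserted[mid] == challenger[mid]:
            lo = mid  # challenger accepts the first half, disputes the second
        else:
            hi = mid  # challenger disputes the first half
        rounds += 1
    # The referee executes a single instruction, not the whole computation.
    asserter_was_honest = step(asserted[lo]) == asserted[hi]
    loser = "challenger" if asserter_was_honest else "asserter"
    return lo, rounds, loser

honest = make_trace(7, 1024)
lying = make_trace(7, 1024, bad_step=700)

print(resolve_dispute(lying, honest))  # (699, 10, 'asserter')
print(resolve_dispute(honest, lying))  # (699, 10, 'challenger')
```

A 1,024-step computation is narrowed to one instruction in 10 rounds, and whichever side was wrong about that instruction loses, regardless of who raised the dispute.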

By dividing the dispute down to a single instruction, Ed says it’s possible to decide the dispute efficiently in a way that minimizes privacy leaks. He then describes the data structure in Arbitrum that stores the state of a program as a tree of cryptographically-stored information. Conventional virtual machines store code and data in ways that require logarithmic time to verify instructions. First, Arbitrum stores data in fixed-size “tuples” that can be arranged in a tree structure. Second, application code manages the tree rather than the VM emulator. In the typical VM, a single instruction takes O(log n) to execute. In Arbitrum, it takes O(log n) instructions to execute something, but each instruction takes constant time. And because Arbitrum narrows down verification to a single instruction, resolving a dispute can take constant time.
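To see why bounded-size tuples keep a one-instruction check cheap, here is a minimal Python sketch of hashing a tree of tuples. The hashing scheme, the function names, and the bound of 8 slots are assumptions made for illustration; the talk summary does not specify Arbitrum’s actual encoding.

```python
import hashlib

MAX_TUPLE_SIZE = 8  # assumed bound for this sketch

def tuple_hash(slot_hashes):
    # A tuple's hash depends only on the hashes of its slots.
    return hashlib.sha256(b"tuple:" + b"".join(slot_hashes)).digest()

def value_hash(value):
    """Hash an integer directly, or a tuple from the hashes of its slots."""
    if isinstance(value, tuple):
        if len(value) > MAX_TUPLE_SIZE:
            raise ValueError("tuple too large")
        return tuple_hash([value_hash(v) for v in value])
    return hashlib.sha256(b"int:" + str(value).encode()).digest()

def check_read(tuple_digest, slot_hashes, index, claimed_slot_hash):
    """Check the claim 'slot `index` of this tuple holds this value'.

    The work is bounded by MAX_TUPLE_SIZE hashes, however large
    the tree underneath the tuple is.
    """
    return (len(slot_hashes) <= MAX_TUPLE_SIZE
            and tuple_hash(slot_hashes) == tuple_digest
            and slot_hashes[index] == claimed_slot_hash)

state = ((1, 2, 3), (4, (5, 6)), 7)
root = value_hash(state)
slots = [value_hash(v) for v in state]

print(check_read(root, slots, 1, value_hash((4, (5, 6)))))  # True
print(check_read(root, slots, 1, value_hash((4, (5, 7)))))  # False
```

The checker never sees the contents of the subtrees, only their hashes, so a single read can be verified in constant time and without revealing the rest of the state.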

The state of a VM is revealed only to the VM’s managers. For example, Alice and Bob would be the only people who need to know what moves were made in the chess game. The only things that appear on the chain are: saltable hashes of the VM state, the number and timing of the steps, and the messages/money sent and received by the VM.

The Arbitrum team has implemented this system with 6,800 lines of Go code, a VM emulator, assembler, and loader. They have an honest manager module that makes and defends assertions. Their proof of concept uses a centralized verifier for simplicity, but because the verifier is a pluggable module, it could easily be replaced with one that allows multiple verifiers. They also have an Arbitrum standard library.

How well does this scale? Ed describes an example contract, showing that at the high end, Arbitrum can work at roughly a million times the performance of Ethereum. Ed thinks it’s the only system that provides scalability, privacy, and programmable modules for writing smart contracts.

Questions

After the talk, I asked Ed if collaborations like this are common, ones that bring together game-theoretical mechanism design, cryptography, and algorithm/data structure design. Ed responded that most cryptocurrency work does combine these things. What makes Arbitrum unusual, Ed explained, is the way in which the research team re-designed the VM in a way that makes the protocol more scalable. It’s hard for people to keep all of those things in mind, and Ed says that it’s easy to get things wrong, which is why peer review is so important in cryptocurrency research.

The Rise of Artificial Intelligence: Brad Smith at Princeton University

What will artificial intelligence mean for society, jobs, and the economy?

Speaking today at Princeton University is Brad Smith, President and Chief Legal Officer of Microsoft. I was in the audience and live-blogged Brad’s talk.

CITP director Ed Felten introduces Brad’s lecture by saying that the tech industry is at a crossroads. With the rise of AI and big data, people have realized that the internet and technology are having a big, long-term effect on many people’s lives. At the same time, we’ve seen increased skepticism about technology and the role of the tech industry in society.

The good news, says Ed, is that plenty of people in the industry are up to the task of explaining what the industry does to cope with these problems in a productive way. What the industry needs now, says Ed, is what Brad offers: a thoughtful approach to the challenges that our society faces, one that acknowledges the role of tech companies, seeks constructive solutions, and takes responsibility in a way that works across society. If there’s one thing we could do to help the tech industry cope with these questions, says Ed, it would be to clone Brad.

Imagining Artificial Intelligence in Thirty Years

Brad opens by mentioning the new book by his team: The Future Computed: Artificial Intelligence and its Role in Society. While writing the book, they realized that it’s not helpful to think about change in the next year or two. Instead, we should be thinking about periods of ten to thirty years.

What was life like twenty years ago? In 1998, people often began their day without anything digital. They would put on a television, listen to the radio, and pull out a calendar. If you needed to call someone, you would use a landline phone to reach them. At that time, a common joke was about whether people could program their VCRs.

In 2018, the first thing that many people reach for is their phone. Even if you manage to keep your phone in another room, you’ll find yourself reaching for your phone or sitting down in front of your laptop. You now use those devices to find out what happened in the world and with your friends.

What will the world look like in 2038? By that time, Brad argues that we’ll be living with artificial intelligence. Digital assistants are already part of our lives, but they’ll be more common at that time. Rather than looking at lots of apps, we’ll have a digital assistant that will talk to us and tell us what the traffic will be like for us. Twenty years from now, you’ll probably have your digital assistant talking to you as you shave or put on your makeup in the morning.

What is Artificial Intelligence?

To understand what that will mean in our lives, we need to understand what artificial intelligence really is. Even today, computers can recognize people, and they can do more: they can make sense of someone’s emotions from their face. We’ve seen the same with the ability of computers to understand language, Brad says. Not only can computers recognize speech, they can also sift through knowledge, make sense of it, and reach conclusions.

In the world today, we read about AI and expect it all to arrive one day, says Brad. That’s not how it’s going to work- AI will become more and more part of our lives in pieces. He tells us about the BMW pedestrian alert, which allows cars to detect pedestrians, beep, signal to the driver, and apply its brakes. Brad also tells us about the Steno app, which records and transcribes. Microsoft now has a version of Skype that detects and auto-translates the conversation– something they’ve now integrated with Powerpoint. Spotify, Netflix, and iTunes all use artificial intelligence to deliver suggestions for the next TV show. None of these systems work with 100% perfection, but neither do human beings.  When asking about an AI system, we need to ask when computers will become as good as a human being.

What advances make AI real? Microsoft, Amazon, Google, and others build data centers that cover the area of many football fields. This enables companies to gather huge computational power and vast amounts of data. Because algorithms get better with more data, companies have an insatiable appetite for data.

The Challenges of Imagining the Future

All of this is exciting, says Brad, and could deliver huge promise for the world. But we can’t afford to look at this future with uncritical eyes. The world needs to make sense of the risks. As computers behave more like humans, what will that mean for real people? People like Stephen Hawking and Elon Musk are warning us about that future. For a long time, says Brad, I’ve admired futurists, but if a futurist gets something wrong, probably nobody remembers they got it wrong. We may be able to discern patterns, but nobody has a crystal ball.

Learning from The History of the Automobile

How can we think about what may be coming? The first option is to learn from history– not because it repeats itself but because it provides insights. To illustrate this, Brad starts by talking about the transition from horses to automobiles. He shows us a photo of Bertha Benz, whose dowry paid for her husband Karl’s new business. One morning in 1888, she got up and left her husband a note saying that she was taking the car and driving the kids 70 kilometers to visit her mother. Before the day was over, she had to repair the car, but by the end of the day, they had reached her mother’s house. This stunt convinced the world that the automobile would be important to the future.

Next, Brad shows us a photo of New York City in 1905, with streets full of horses and hardly any cars. Twenty years later, there were no horses on the streets. The horse population declined and jobs involved in supporting them disappeared. These direct economic effects weren’t as important as the indirect effects. Consumer credit wasn’t necessarily connected to the automobile, but it was an indirect outcome. Once people wanted to buy cars, they needed a way to finance the cars. Advertising also changed: when people were driving past billboards at speed, advertisers invented logos to make their companies more recognizable.

How Institutions Evolve to Meet Technology & Economic Changes

The effects of the automobile weren’t all good. As the population of horses declined, farmers got smart and grew less hay. They shifted their acreage to wheat and corn and the prices plummeted. Once the prices plummeted, farmers’ income plummeted. As the farmers fell behind on their loans, the rural banks tried to foreclose on them, leading to broad financial collapse. Many of the things we take for granted today come from that experience: the FDIC and insurance regulation, farm subsidies, and many other parts of our infrastructure. With AI, we need to be prepared for changes as substantial.

Understanding the Impact of AI on the Economy

Brad tells us another story about how offices worked. In the 1980s, you handed someone a hand-written document and someone would type it for you. Between the 1980s and today, two big changes happened. First, secretarial staff went on the decline and the professional IT staff was born. Second, people realized that everyone needed to understand how to use computers.

As we think about how work will change, we need to ask what jobs AI will replace. To answer this question, let’s think about what computers can do well: vision, speech, language, and knowledge. Jobs involving decision-making are already being done by computers (radiology, call centers, fast food orders, auto drivers). Jobs involving translation and learning will also become automated, including machinery inspection and the work of paralegals. Microsoft used to have multiple people whose job was to inspect fire extinguishers. Now the company has devices that automatically record data on their status, reducing the work involved in maintaining them.

Some jobs are less likely to be replaced by AI, says Brad: anything that requires human understanding and empathy. Nurses, social workers, therapists, and teachers are more likely to be people who will use AI than be replaced by it. This may lead people to take on jobs that they take more satisfaction in doing.

Some of the most exciting developments for AI in the next five years will be in the area of disability. Brad shows us a project called “Seeing AI,” which offers an app that describes a person’s surroundings using a phone camera. The app can read barcodes and identify food, identify currency bills, describe a scene, and read text in one’s surroundings. What’s exciting is what it can do for people. The project has already carried out 3 million tasks and it’s getting better and smarter as it goes. This system could be a game changer for people with blindness, says Brad.

Why Ethics Will Be a Growth Area for AI

What jobs will AI create? It’s easier to think about the jobs it will replace than what it will create. When young people in kindergarten today enter the workplace, he says, the majority of jobs will be ones that don’t yet exist. Some of the new jobs will be ones that support AI to work: computer science, data science, and ethics. “Ultimately, the question is not only what computers *can* do,” says Brad, “it’s what computers *should* do.” Under the ethics of AI, the fields of reliability/safety and privacy/security are well developed. Other important areas, such as research on fairness and inclusiveness, are less well developed. Two issues underlie all the rest. Transparency is important because people need to understand how these systems work.

AI Accountability and Transparency

Finally, one of the most important questions of our time is: “how do we ensure accountability of machines?” Will we ensure that machines will be accountable to people, and will those people be accountable to other people? Only with accountability will be able to

What would it mean to create a Hippocratic oath for AI developers? Brad asks: what does it take to train a new generation of people to work on AI with that kind of commitment and principle in mind? These aren’t just questions for people at big tech companies. As companies, governments, universities, and individuals take the building blocks of AI and use them, AI ethics are becoming important to every part of society.

Artificial Intelligence Policy

If we are to stay true to timeless values, says Brad, we need to ask whether we want only ethical people to behave ethically, or everyone to behave ethically. That’s what law does; AI will create new questions for public policy and the evolution of the law. That’s why skilling up for the future isn’t just about science, technology, engineering, and math: as computers behave more like humans, the social sciences and humanities will become even more important. That’s why diversity in the tech industry is also important, says Brad.

How AI is Transforming the Liberal Arts, Engineering, and Agriculture

Brad encourages us to think about disciplines that AI can make more impactful: AI is changing healthcare (cures for cancer), agriculture (precision farming), accessibility, and our environment. He concludes with two examples. First, Brad talks about the Princeton Geniza Lab, led by Marina Rustow, where researchers are using AI to analyze documents that have been scattered all around the world. Using AI, researchers are joining these digitized fragments. Engineering isn’t only for the engineers: everybody in the liberal arts can benefit from learning a little bit of computer science and data science, and every engineer is going to need some more liberal arts in their future. Brad also tells us about the AI for Earth project, which provides seed funds to researchers who work on the future of the planet. Projects include smart grids in Norway that make energy usage more efficient, a project by the Singaporean government to do smart climate control in buildings, and a project in Tasmania that supports precision farming, saving 30% on irrigation costs.

These examples give us a glimpse of what it means to prepare for an AI-powered future, says Brad. We’re also going to need to do more work: we may need a new social contract, because people are going to need to learn new skills, find new career pathways, create new labor rules and protections, and rethink the social safety net as these changes ripple throughout the economy.

Creating the Future of Artificial Intelligence

Where will AI take us? Brad encourages students to think about the needs of the world and what AI has to offer. It’s going to take a whole generation to think through what AI has to offer and create that future, and he encourages today’s students to seize that challenge.

How Tech is Failing Victims of Intimate Partner Violence: Thomas Ristenpart at CITP

What technology risks are faced by people who experience intimate partner violence? How is the security community failing them, and what questions might we need to ask to make progress on social and technical interventions?

Speaking Tuesday at CITP was Thomas Ristenpart (@TomRistenpart), an associate professor at Cornell Tech and a member of the Department of Computer Science at Cornell University. Before joining Cornell Tech in 2015, Thomas was an assistant professor at the University of Wisconsin-Madison. His research spans a wide range of computer security topics, including digital privacy and safety in intimate partner violence, alongside work on cloud computing security, confidentiality and privacy in machine learning, and topics in applied and theoretical cryptography.

Throughout this talk, I found myself overwhelmed by the scope of the challenges faced by so many people– and inspired by the way that Thomas and his collaborators have taken thorough, meaningful steps on this vital issue.

Understanding Intimate Partner Violence

Intimate partner violence (IPV) is a huge problem, says Thomas. 25% of women and 11% of men will experience rape, physical violence, and/or stalking by an intimate partner, according to the National Intimate Partner and Sexual Violence Survey. To put this in context for tech companies, this means that 360 million Facebook users and 252 million Android users will experience this kind of violence.

Prior research has shown that abusers are taking advantage of technology to harm victims in a wide range of ways, including spyware, harassment, and non-consensual photography. In a team with Nicki Dell, Diana Freed, Karen Levy, Damon McCoy, Rahul Chatterjee, Peri Doerfler, and Sam Havron, Thomas has been working with the New York City Mayor’s Office to Combat Domestic Violence (NYC CDV).

To start, the researchers spent a year doing qualitative research with people who experience domestic violence. The research that Thomas is sharing today draws from that work.

The research team worked with the New York City Family Justice Centers, which offer a range of services for domestic violence, sex trafficking, and elder abuse victims, from civil and legal services to access to shelters, counseling, and support from nonprofits. The centers were a crucial resource for the researchers, since they connect nonprofits, government actors, and survivors and victims. Over a series of year-long qualitative studies (see also this paper), researchers held 11 focus groups with 39 women, aged 18 to 65, who speak English and Spanish. Most of them are no longer working with the abusive partner. They also held semi-structured interviews with 50 professionals working on IPV: case managers, social workers, attorneys/paralegals, and police officers. Together, this research represents the largest and most demographically diverse study to date on IPV.

Common Technology Attacks in Intimate Partner Violence Situations

The researchers spotted a range of common themes across clients of the NYC CDV. Clients talked about stalkers who accessed their phones and social media, installed spyware, took compromising images through the spyware, and then impersonated them, using their accounts to send compromising, intimate images to employers, family, and friends. Abusers are taking advantage of every possible technology to create problems through many modes. Overall, the researchers identified four kinds of common attacks:

  • In ownership-based attacks, the abuser owns the account or device that the victim is using. This gives them immediate control over it. Often people will buy a device for someone else to gain a foothold in that person’s life and home.
  • In account/device compromise, someone compels, guesses, or otherwise compromises passwords.
  • Harmful messages or posts involve calling/texting/messaging the victim. This can also involve harassing a victim’s friends/family, and sometimes encouraging other people to harass that person by proxy.
  • Abusers also exposed private information: blackmailing someone by threat of exposure, sharing non-consensual intimate images, and creating fake profiles/advertisements for that person on other sites.

In many of these cases, abusers are re-purposing ordinary software for harmful ends. For example, abusers use two-factor authentication to prevent victims from accessing and recovering access to their own account.

Non-Technical Infrastructures Aren’t Helping Victims & Professionals with Technical Issues

Thomas tells us that despite these risks, they didn’t find a single technologist in the network of support for people facing intimate partner violence. So it’s not surprising that these services don’t have any best practices for evaluating technology risks. On top of that, victims overwhelmingly report having insufficient technology understanding to deal with tech abuse.

Abusers are typically considered to be “more tech-savvy” than victims, and professionals overwhelmingly report having insufficient technology understanding to help with tech abuse. Many of them just google as they go.

Thomas also points out that the intersection of technology and intimate partner violence raises important legal and policy issues. First, digital abuse is usually not recognized as a form of abuse that warrants a protection order. When someone goes to a family court, they have to convince a judge to get a protection order, and judges aren’t convinced by digital harassment, even though the protection order can legally restrict an abuser from sending messages. Second, when an abuser creates a fake account on a site like Tinder and creates “come rape me” style ads, the abuser is technically the legal owner of the account, so it can be difficult to take down the ads, especially for smaller websites that don’t respond to copyright takedown requests.

Technical Mechanisms are Failing Too: Context Undermines Existing Security Systems

Abusers aren’t the sophisticated cyber-operatives that people sometimes talk about at security conferences. Instead, researchers saw two classes of attackers: (a) UI-bound adversaries, adversarial but authenticated users who interact with the system via the normal user interface, and (b) spyware adversaries, who install or repurpose commodity software for surveillance of the victim. Neither of these requires technical sophistication.

Why are these so effective? Thomas says that the reason is that the threat models and assumptions in the security world don’t match the threats victims face. For example, many systems are designed to protect from a stranger on the internet who doesn’t know the victim personally and connects from elsewhere. With intimate partner violence, the attacker knows the victim personally, can guess passwords or compel disclosure, may connect from the victim’s computer or the same home, and may own the account or device that’s being used. The abuser is often an earner who pays for accounts and devices.

The same problems apply with fake accounts and detection of abusive content. Many fake social media profiles obviously belong to the abuser but survivors are rarely able to prove it. When abusers send hurtful, abusive messages, someone who lacks the context may not be able to detect it. Outside of the context of IPV, a picture of a gun might be just a picture of a gun, but in context, it can be very threatening.

Common Advice Also Fails Victims

Much of the common advice just won’t work. Sometimes people are urged to delete their account. You can’t just shut off contact with an abuser: you might be legally obligated to communicate (shared custody of children). You can’t get new devices because the abuser pays for phones, family plan, and/or children’s devices (which are a vector of surveillance). People can’t necessarily get off social media, because they need it to get access to their friends and family. On top of that, any of these actions could escalate abuse; victims are very worried about cutting off access or uninstalling spyware because they fear further violence from the abuser.

Many Makers of Spyware Promote their Software for Intimate Partner Surveillance

Next, Thomas tells us about intimate partner surveillance (IPS) from a new paper led by Diana Freed on How Intimate Partner Abusers Exploit Technology. Shelters and family justice centers have had cases where someone shows up with software on their phone that allowed the abuser to track them down, kick down a door, and endanger the victim. No one could name a single product that was used by abusers, partly because our ability to diagnose spyware from a technical perspective is limited. On the other hand, if you google “track my girlfriend,” you will find a host of companies that are peddling spyware.

To study the range of spyware systems, Thomas and his colleagues used “snowball” searching, using auto-complete to find related queries that other people were searching. From a set of roughly 27,000 URLs, they investigated 100 randomly sampled URLs. They found that 60% were related to intimate partner surveillance: how-to blogs, Q&A forums, news articles, app websites, and links to apps on the Google Play Store and the Apple App Store. Many of the professional-grade spyware providers provide apps directly through app stores, as well as “off-store” apps. They labeled a thousand of the apps they found and discovered that about 28% of them were potential IPS tools.
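As a rough illustration of what a 100-URL random sample can and cannot say about a set of 27,000, here is a minimal Python sketch. It is not the researchers’ code: the URL list is a hypothetical stand-in, and the Wilson score interval is just one common way to put an approximate 95% range around a sampled proportion.

```python
import math
import random


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a sampled proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Hypothetical stand-in for the roughly 27,000 URLs gathered by snowball search.
urls = [f"https://example.invalid/result/{i}" for i in range(27_000)]

# Draw 100 URLs uniformly at random, without replacement (fixed seed for repeatability).
rng = random.Random(0)
sample = rng.sample(urls, 100)

# The post reports that 60% of the sampled URLs were related to IPS.
low, high = wilson_interval(successes=60, n=100)
print(f"Plausible share of IPS-related URLs: {low:.0%} to {high:.0%}")
```

With a sample of 100, the interval spans roughly 50% to 69%, so the 60% figure is best read as “around half or more” rather than as a precise share of the full URL set.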

The researchers found overt tools for intimate partner surveillance, as well as systems for safety, theft-tracking, child tracking, and employee tracking that were repurposed for abuse. In many cases, it’s hard to point to a single piece of software and say that it’s bad. While apps sometimes purport to provide services to parents to track children, searches for intimate partner violence also surface paid ads for products that don’t directly claim to be for use against intimate partners. Ever since a ruling from the FTC, companies work to preserve plausible deniability.

In an audit study, the researchers emailed customer support for 11 apps (on-store and off-store) posing as an abuser. They received nine responses. Eight of them condoned intimate partner surveillance and gave the researchers advice on making the app hard to find. Only one indicated that it could be illegal.

Many of these systems have rich capabilities: location tracking, texts, call recordings, media contents, app usage, internet activity logs, keylogging, geographic tracking. All of the off-store systems have covert features to hide the fact that the app is installed. Even some of the Google Play Store apps have features to make the apps covert.

Early Steps for Supporting Victims: Detecting Spyware

What’s the current state of the art? Right now, practitioners tell people that if their battery runs unusually low, they may be a victim of spyware, which is not a very effective test. Do spyware removal tools work? They had high but not perfect detection rates for off-store intimate partner surveillance systems. However, they did a poor job at detecting on-store spyware tools.

 

Thomas recaps what they learned from this study: there’s a large ecosystem of spyware apps, the dual use of these apps creates a significant challenge, many developers are condoning intimate partner surveillance, and existing anti-spyware technologies do a poor job of detecting these tools.

Based on this work, Thomas and his collaborators are working with the NYC Mayor’s office and the National Network to End Domestic Violence to develop ways to detect spyware, create new surveys of technology risks, and find new kinds of interventions.

Thomas concludes with an appeal to companies and computer scientists that we pay more attention to the needs of the most vulnerable people affected by our work, volunteer for organizations that support victims, and develop new approaches to protect people in these all-too-common situations.