October 6, 2022

Archives for May 2017

What does it mean to ask for an “explainable” algorithm?

One of the standard critiques of using algorithms for decision-making about people, and especially for consequential decisions about access to housing, credit, education, and so on, is that the algorithms don’t provide an “explanation” for their results or the results aren’t “interpretable.”  This is a serious issue, but discussions of it are often frustrating. The reason, I think, is that different people mean different things when they ask for an explanation of an algorithm’s results.

Before unpacking the different flavors of explainability, let’s stop for a moment to consider that the alternative to algorithmic decisionmaking is human decisionmaking. And let’s consider the drawbacks of relying on the human brain, a mechanism that is notoriously complex, difficult to understand, and prone to bias. Surely an algorithm is more knowable than a brain. After all, with an algorithm it is possible to examine exactly which inputs factored in to a decision, and every detailed step of how these inputs were used to get to a final result. Brains are inherently less transparent, and no less biased. So why might we still complain that the algorithm does not provide an explanation?

We should also dispense with cases where the algorithm is just inaccurate–where a well-informed analyst can understand the algorithm but will see it as producing answers that are wrong. That is a problem, but it is not a problem of explainability.

So what are people asking for when they say they want an explanation? I can think of at least four types of explainability problems.

The first type of explainability problem is a claim of confidentiality. Somebody knows relevant information about how a decision was made, but they choose to withhold it because they claim it is a trade secret, or that disclosing it would undermine security somehow, or that they simply prefer not to reveal it. This is not a problem with the algorithm, it’s an institutional/legal problem.

The second type of explainability problem is complexity. Here everything about the algorithm is known, but somebody feels that the algorithm is so complex that they cannot understand it. It will always be possible to answer what-if questions, such as how the algorithm’s result would have been different had the person been one year older, or had an extra $1000 of annual income, or had one fewer prior misdemeanor conviction, or whatever. So complexity can only be a barrier to big-picture understanding, not to understanding which factors might have changed a particular person’s outcome.

The third type of explainability problem is unreasonableness. Here the workings of the algorithm are clear, and are justified by statistical evidence, but the result doesn’t seem to make sense. For example, imagine that an algorithm for making credit decisions considers the color of a person’s socks, and this is supported by unimpeachable scientific studies showing that sock color correlates with defaulting on credit, even when controlling for other factors. So the decision to factor in sock color may be justified on a rational basis, but many would find it unreasonable, even if it is not discriminatory in any way. Perhaps this is not a complaint about the algorithm but a complaint about the world–the algorithm is using a fact about the world, but nobody understands why the world is that way. What is difficult to explain in this case is not the algorithm, but the world that it is predicting.

The fourth type of explainability problem is injustice. Here the workings of the algorithm are understood but we think they are unfair, unjust, or morally wrong. In this case, when we say we have not received an explanation, what we really mean is that we have not received an adequate justification for the algorithm’s design.  The problem is not that nobody has explained how the algorithm works or how it arrived at the result it did. Instead, the problem is that it seems impossible to explain how the algorithm is consistent with law or ethics.

It seems useful, when discussing the explanation problem for algorithms, to distinguish these four cases–and any others that people might come up with–so that we can zero in on what the problem is. In the long run, all of these types of complaints are addressable–so that perhaps explainability is not a unique problem for algorithms but rather a set of commonsense principles that any system, algorithmic or not, must attend to.

Innovation in Network Measurement Can and Should Affect the Future of Internet Privacy

As most readers are likely aware, the Federal Communications Commission (FCC) issued a rule last fall governing how Internet service providers (ISPs) can gather and share data about consumers that was recently rolled back through the Congressional Review Act. The media stoked consumer fear with headlines such as “For Sale: Your Private Browsing History” and comments about how ISPs can now “sell your Web browsing history to advertisers“. We also saw promises from large ISPs such as Comcast promising not to do exactly that. What’s next is anyone’s guess, but technologists need not stand idly by.

Technologists can and should play an important role in this discussion in several ways.  In particular, conveying knowledge about the capabilities and uses of network monitoring, and developing both new monitoring technologies and privacy-preserving capabilities can and should shape this debate in three important ways: (1) Level-setting on the data collection capabilities of various parties; (2) Understanding and limiting the power of inference; and (3) Developing new monitoring technologies that help facilitate network operations and security while protecting consumer privacy.

1. Level-setting on data collection uses and capabilities. Before entering a debate about privacy, it helps to have a firm understanding of who can collect what types of data—both in theory and in practice, as well as the myriad ways that data might be used for good (and bad). For example, in practice, if anyone has your browsing history, your ISP is a less likely culprit than an online service provider such as Google—who operates a browser, and (perhaps more importantly) whose analytics scripts are on a large fraction of the Internet’s web pages. Your browsing is also likely being logged by many of the countless online trackers that keep track of your browsing history, often without your knowledge or consent. In contrast, the network monitoring technology that is available in routers and switches today makes it a lot more difficult to extract “browsing history”; that requires a technology commonly referred to as “deep packet inspection” (DPI), or complete capture of network traffic data, which is expensive to deploy, and even more costly when data storage and analysis is concerned. Most ISPs will tell you than DPI is deployed on only a small fraction of the links in their networks, and that fraction is going down as speeds are increasing; it’s expensive to collect and analyze all of that data.

ISPs do, of course, collect other types of traffic statistics, such as lookups to domain names via the Domain Name System (DNS) and coarse-grained traffic volume statistics via IPFIX. That data can, of course, be revealing. At the same time, ISPs will correctly point out that monitoring DNS and IPFIX is critical to securing and operating the network. DNS traffic, for example, is central to detecting denial of service attacks or infected devices. IPFIX statistics are critical for monitoring and mitigating network congestion. DNS is a quintessential example of data that is both incredibly sensitive (because it reveals the domains and websites we visit, among other things, and is typically unencrypted) and incredibly useful for detecting attacks, ranging from phishing to denial of service attacks.

The long line of security and traffic engineering research illustrates both the importance of data collection, as well as the limitations of current network monitoring capabilities in performing these tasks. Take, for example, research on botnet detection, which has shown the power of using DNS lookup data and IPFIX statistics for detecting compromise and intrusion. Or, the development of traffic engineering capabilities in the data center and in the wide area, which depend on the collection and analysis of IPFIX records and in some cases packet traces.

2. Understanding (and mitigating) the power of inference. While most of the focus in the privacy debate thus far concerns data collection (specifically, a focus on DPI, which is somewhat misguided per the discussion above), we would be wise to also consider what can be inferred from any data that is collected. For example, various aspects of “browsing history” could be evident from various datasets ranging from DNS to DPI, but as discussed above all of these datasets also have legitimate operational uses. Furthermore, “browsing history” is evident from a wide range of datasets that many parties are privy to without our consent, beyond just ISPs. Such inference capabilities are only going to increase with the proliferation of data-producing Internet-connected devices coupled with advances in machine learning. If prescriptive rules specify which some types of data can be collected, we risk over-prescribing rules, while failing to achieve the goal of protecting the higher-level information that we really want to protect.

While asking questions about collection is a fine place to start a discussion, we should be at least as concerned with how the data is usedwhat it can be used to infer, and who it is shared with.We likely should be asking: (1) What data do we think should be protected or private? (2) What types network data permits inference of that private data? (3) Who has access to that data and under what circumstances? Suppose that I am interested in protecting information about whether I am at home. My ISP could learn this information from my traffic patterns, simply based on the decline in traffic volume from individual devices, even if all of my web traffic were encrypted, and even if I used a virtual private network (VPN) for all of my traffic. Such inference will be increasingly possible as more devices in our homes connect to the Internet. But, online service providers could also come to know the same information without my consent, based on different data; Google, for example, would know that I’m browsing the web at my office, rather than at home, through the use of technologies such as cookies, browser fingerprinting, and other online device tracking mechanisms.

Past and ongoing research, such as the Web Transparency and Accountability Project, as well as the “What They Know” series from the Wall Street Journal, shed important light on what can be inferred from various digital data sources. The Upturn report last year was similarly illuminating with respect to ISP data. More recently, researchers at Princeton including Noah Apthorpe and Dillon Reisman have been developing techniques to mitigate the power of inference using various traffic shaping and camouflaging techniques to limit what an ISP can infer from traffic patterns coming from a home network.

3. Facilitating purpose-driven network measurement and data minimization. Part of the tension surrounding network measurement and privacy is that current network monitoring technology is very crude; in fact, this technology hasn’t changed considerably in nearly 30 years. It at once gathers too much data, and yet, for many purposes, it is still too little. Consider, for example, that with current network monitoring technology, an ISP (or content provider) have incredible difficulty determining a user’s quality of experience for a given application, such as video streaming, simply because the wrong kind of data is collected, at the wrong granularity. As a result, ISPs (and many other parties in the Internet ecosystem) adopt a post hoc “collect first, ask questions later” approach, simply because current network monitoring technology (1) is oriented towards offline processing on warehoused data; (2) does not make it easy to figure out what data is needed to answer a particular analysis question.

Instead, network data collection could be driven by the questions operators were asking; data could be collected if—and only if—it were pertinent to a specific question or network operations task, such as monitoring application performance or detecting attacks. For example, suppose that an operator could ask a query such as “tell me the average packet loss rate of all Netflix video streams for subscribers in Seattle”. Answering such a query with today’s tools is challenging: one would have to collect all packet traces and all DNS queries and somehow identify post hoc that these streams correspond to the application of interest. In short, it’s difficult, if not impossible, answer such an operational query today without large-scale collection and storage of (very sensitive) data—all to find what is essentially a needle in a haystack.

Over the past year, my Ph.D. student Arpit Gupta at Princeton has been leading the design and development of a system called Sonata that may ultimately resolve this dichotomy and give us the best of both worlds. Two emerging technologies—(1) in-band network measurement, as supported by Barefoot’s Tofino chipset; (2) scalable streaming analytics platforms such as Spark—make it possible to write a high-level query in advance and only collect the data that is needed to satisfy the query. Such technology allows a network operator to write a query in a high-level language (in this case, Scala), specifying only the question, but allowing the runtime to figure out the minimal set of raw data that is needed to satisfy the operator’s query.

Our goal in the design and implementation of Sonata was to satisfy the operational and scaling limitations of network measurement, but achieving such scalability also has data minimization effects that have positive benefits for privacy. Data that is collected can also be a liability; it may, for example, become the target of law enforcement requests or subpoenas, which parties such as ISPs, but also online providers such as Google are regularly subject to. Minimizing the collected data to only that which is pertinent to operational queries can also ultimately help reduce this risk.

Sonata is open source, and we welcome contributions and suggestions from the community about how we can better support specific types of network queries and tasks.

Summary. Network monitoring and analytics technology is moving at a rapid pace, in terms of its capabilities to help network operators answer important questions about performance and security, without coming at the cost of consumer privacy. Technologists should devote attention to developing new technologies that can help achieve the best of both worlds, and on helping educate policymakers about the capabilities (and limitations) of existing network monitoring technology. Policymakers should be aware that network monitoring technology continues to advance, and should focus discussion around protecting what can be inferred, rather than focusing only on who can collect a packet trace.

Multiple Intelligences, and Superintelligence

Superintelligent machines have long been a trope in science fiction. Recent advances in AI have made them a topic for nonfiction debate, and even planning. And that makes sense. Although the Singularity is not imminent–you can go ahead and buy that economy-size container of yogurt–it seems to me almost certain that machine intelligence will surpass ours eventually, and quite possibly within our lifetimes.

Arguments to the contrary don’t seem convincing. Kevin Kelly’s recent essay in Backchannel is a good example. His subtitle, “The AI Cargo Cult: The Myth of a Superhuman AI” implies that AI of superhuman intelligence will not occur. His argument centers on five “myths”:

  1. Artificial intelligence is already getting smarter than us, at an exponential rate.
  2. We’ll make AIs into a general purpose intelligence, like our own.
  3. We can make human intelligence in silicon.
  4. Intelligence can be expanded without limit.
  5. Once we have exploding superintelligence it can solve most of our problems.

He rebuts these “myths” with five “heresies” :

  1. Intelligence is not a single dimension, so “smarter than humans” is a meaningless concept.
  2. Humans do not have general purpose minds, and neither will AIs.
  3. Emulation of human thinking in other media will be constrained by cost.
  4. Dimensions of intelligence are not infinite.
  5. Intelligences are only one factor in progress.

This is all fine, but notice that even if all five “myths” are false, and all five “heresies” are true, superintelligence could still exist.  For example, superintelligence need not be “like our own” or “human” or “without limit”–it only needs to outperform us.

The most interesting item on Kelly’s lists is heresy #1, that intelligence is not a single dimension, so “smarter than humans” is a meaningless concept. This is really two claims, so let’s consider them one at a time.

First, is intelligence a single dimension, or are there different aspects or skills involved in intelligence?  This is an old debate in human psychology, on which I don’t have an informed opinion. But whatever the nature and mechanisms of human intelligence might be, we shouldn’t assume that machine intelligence will be the same.

So far, AI practice has mostly treated intelligence as multi-dimensional, building distinct solutions to different cognitive challenges. Perhaps this is fundamental, and machine intelligence will always be a bundle of different capabilities. Or perhaps there will be a future unification of some sort, to create a single capability that can outperform people on all or nearly all cognitive tasks. At this point it seems like an open question whether machine intelligence is inherently multi-dimensional.

The second part of Kelly’s claim is that, assuming intelligence is multi-dimensional, “smarter than humans” is a meaningless concept. This, to put it bluntly, is not correct.

To see why, consider that playing center field in baseball requires multi-dimensional skills: running, throwing, distinguishing balls from strikes, hitting accurately, hitting with power, and so on. Yet every single major league center fielder is vastly better than I am at playing center field, because they dominate me by far in every one of the component skills.

Like playing center field, intelligence may be multi-dimensional, and yet one entity can be more intelligent than another by being superior in every dimension.

What this suggests about the future of machine intelligence is that we may live for quite a while in a state where machines are better than us at some aspects of intelligence and we are better than them at others. Indeed, that is the case now, and has been for years.

If machine intelligence remains multi-dimensional, then machines will surpass our intelligence not at a single point in time, but gradually, and in more and more dimensions of intelligence.