February 24, 2018

Software-Defined Networking: What’s New, and What’s New For Tech Policy?

The Silicon Flatirons Conference on Regulating Computing and Code is taking place in Boulder. The annual conference addresses a range of issues at the intersection of technology and policy and provides an excellent look ahead to the tech policy issues on the horizon, particularly in telecommunications.

I was looking forward to yesterday’s panel on “The Triumph of Software and Software-Defined Networks”, which had some good discussion on the ongoing problem surrounding security and privacy of the Internet of Things (IoT); some of the topics raised echoed points made on a Silicon Flatirons panel last year. My colleague and CITP director Ed Felten made some lucid, astute points about the implications of the “infiltration” of software into all of our devices.

Unfortunately, though (despite the moderator’s best efforts!), the panel lacked any discussion of the forthcoming policy issues concerning Software-Defined Networking (SDN); I was concerned with some of the incorrect comments concerning SDN technology. Oddly, two panelists stated that Software Defined Networking has offered “nothing new”. Here’s one paper that explains some of the new concepts that came from SDN (including the origins of those ideas), and another that talks about what’s to come as machine learning and automated decision-making begin to drive more aspects of network management. Vint Cerf corrected some of this discussion, pointing out one example of a fundamentally new capability: the rise of programmable hardware. One of same panelists also said that SDN hasn’t seen any deployments in the wide-area Internet or at interconnection, a statement that has many counter-examples, including projects such as SDX (and the related multi-million dollar NSF program), Google’s Espresso and B4, and Facebook’s Edge Fabric to name just a few of the public examples.

Some attendees commented that the panel could have discussed how SDN, when coupled with automated decision-making (“AI” in the parlance du jour) presents both new opportunities and challenges for policy. This post attempts to bring some of the issues at the intersection of SDN and policy to light. I address two main questions:

  1. What are the new technologies around SDN that people working in tech policy might want to know about?;
  2. What are some interesting problems at the intersection of SDN and tech policy?

The first part of the post summarizes about 15 years of networking research in three paragraphs, in a form that policy and law scholars can hopefully digest; the second part of the post are some thoughts about new and interesting policy and legal questions—both opportunities and challenges—that these new technologies bring to bear.

SDN: What’s New in Technology?

Software-defined networking (SDN) describes a type of network design where a software program runs separately from the underlying hardware routers and switches can control how traffic is forwarded through the network. While in some sense, one might think of this concept as “nothing new” (after all, network operators have been pushing configuration to routers with Perl scripts for decades), SDN brings several new twists to the table:

  1. The control of a collection of network devices from a single software program, written in a high-level programming language. The notion that many devices can be controlled from a single “controller” creates the ability for coordinated decisions across the network, as opposed to the configuration of each router and switch essentially being configured (and acting) independently. When we first presented this idea for Internet routing in the mid-2000s, this was highly controversial, with some even claiming that this was “failed phone company thinking” (after all, the Internet is “decentralized”; this centralized controller nonsense could only come from the idiots working for the phone company!). Needless to say, the idea is a bit less controversial now. These ideas have taken hold both within the data center, in the wide area, and at interconnection points; technology such as the Software Defined Internet Exchange Point (SDX) makes it possible for networks to exchange traffic only for specific applications (e.g., video streaming), for example, or to route traffic for different application along different paths.
  2. The emergence of programmable hardware in network devices. Whereas conventional network devices relied on forwarding performed by fixed-function ASICs, the rise of companies such as Barefoot Networks have made it possible for network architects to customize forwarding behavior in the network. This capability is already being used for designing network architectures with new measurement and forwarding capabilities, including the ability to get detailed timing information of individual packets as they traverse each network hop, as well as to scale native multicast to millions of hosts in a data center.
  3. The rise of automated decision making in network management (“AI Meets Networking”). For years, network operators have applied machine learning to conventional network security and provisioning problems, including the automated detection of spam, botnets, phishing attacks, bullet-proof web hosting, and so forth. Operators can also use machine learning to help answer complex “what if” performance analysis questions, such as what would happen to web page load or search response time if a server was moved from one region to another, or if new network capacity was deployed. Much of this work, however, has involved developing systems that perform detection in an offline fashion (i.e., based on collected traces). Increasingly, with projects like Google Espresso and Facebook Edge Fabric, we’re starting to see systems that close the loop between measurement and control. It likely won’t be long before networks begin making these kinds of decisions based on even more complex inputs and inferences.

SDN: What’s New for Tech Policy?

The new capabilities that SDN offers presents a range of potentially challenging questions at the intersection of technology, policy, and law.  I’ve listed a few of these interesting possibilities below:

  • Service Level Agreements. A common contractual instrument for Internet Service Providers (ISPs) is the Service Level Agreement (SLA). SLAs typically involve guarantees about network performance: packet loss will never exceed a certain amount, latency will always be less than a certain amount, and so forth. SDN presents both new opportunities and challenges for Service Level Agreements. On the opportunity side, SDN creates the ability for operators to define much more complex traffic forwarding behavior—sending traffic along different paths according to destination, application, or even the conditions of individual links along and end-to-end path at a particular time.

    Yet, the opportunity to create these types of complex SLAs also presents challenges.When all routing and forwarding decisions become automated, and all interconnects look like Google Espresso, where an algorithm is effectively making decisions about where to forward traffic (perhaps based on a huge list of features ranging from application QoE to estimates of user attention, and incorporated into an inscrutable “deep learning” model), how does one go about making sure the SLA continues to be enforced?What new challenges and opportunities do the new capabilities of programmable measurement bring for SLAs? Some aspects of SLAs are notoriously difficult to enforce today.

    Not only will complex SLAs become easier to define, the rise of programmable measurement and advancements in network telemetry will also make SLAs easier for customers to validate. There are huge opportunities in the validation of SLAs, and once these become easier to audit, a whole new set of legal and policy questions will arise.

  • Network Neutrality. Although the Federal Communication Commission (FCC)’s release of the Restoring Internet Freedom order earlier this year effectively rescinds many of the “bright line rules” that we have come to associate with net neutrality (i.e., no blocking, no throttling, no paid prioritization), the order actually leaves in place many transparency requirements for ISPs, requiring ISPs to disclose any practices relevant to blocking, throttling, prioritization, congestion management, application-specific behavior, and security. As with SLA definition and enforcement, network neutrality—and particularly the transparency rule—may become more interesting as SDN makes it possible for network operators to automate network decision-making, as well as for consumers to audit some of the disclosures (or lack thereof) from ISPs. SDX allows networks to make decisions about interconnection, routing, and prioritization based on specific applications, which creates new traffic management capabilities that raise interesting questions in the context of net neutrality; which of these new capabilities would constitute an exception for “reasonable network management practices”, and which might be viewed as discriminatory?

    Additionally, the automation of network management may make it increasingly difficult for operators to figure out what is going on (or why?), and some forwarding decisions may be more difficult to understand or explain if they are driven by a complex feature set and fully automated. Figuring out what “transparency” even means in the context of a fully automated network is a rich area for exploration at the intersection of network technology and telecommunications policy. Even things seemingly as simple as “no blocking” get complicated when networks begin automating the mitigation of attack traffic, or when content platforms begin automating takedown requests. Does every single traffic flow that is blocked by a network intrusion detection system need to be disclosed? How can ISPs best disclose the decision-making process for each blocking decision, particularly when either (1) the algorithm or set of features may be difficult to explain or understand; (2) doing so might aid those who aim to circumvent these network defenses?

Virtualization: A Topic in Its Own Right. The panel moderator also asked a few times about policy and legal issues that arise as a result of virtualization. This is a fantastic question that deserves more attention. It’s worth pointing out the distinction between SDN (which separates network “control plane” software from “data plane” routers and devices) from virtualization (which involves creating virtual server and network topologies on a shared underlying physical network). In short, SDN enables virtualization, but the two are distinct technologies. Nonetheless, virtualization presents many interesting issues at the intersection of technology and policy in its own right. For one, the rise of Infrastructure as a Service (IaaS) providers such as Amazon Web Services, as well as other multi-tenant data centers, introduce questions such as how to enforce SLAs when isolation is imperfect, as well as how IaaS providers can be stewards of potentially private data that may be subject to takedown requests, subpoenas, and other actions by law enforcement and other third parties. The forthcoming Supreme Court case, Microsoft vs. United States, concerning law enforcement access to data stored abroad. Does the data actually live overseas, or this this merely a side effect of global, virtualized data centers? As virtualization is a distinct topic from SDN, the policy issues it presents warrant a separate (future) post.

Summary. In summary, SDN is far from old news, and the policy questions that these new technologies bring to bear are deeply complex and deserve a careful eye from experts at the intersection of policy, law, and technology. We should start these conversations now.

Making Sense of Child Protection Predictive Models: Tech-Soc Reading Group Feb 20

How are predictive models transforming how we think about child protection, and how should we think about the role of such systems in a democracy?

If you’re interested to ask these questions, join us at 2-3pm on Tuesday, Feb 20th at Sherrerd Hall room 306 for our opening Technology and Society Reading group meeting. The conversation is open to anyone who’s interested, and who’s willing to do the reading in advance.  (if you’re interested, come for our lunch talk that day as well, which focuses on privacy & safety concerning intimate partner violence)

In 2016, Allegheny County, which includes the city of Pittsburgh, became the first region in the US to use a predictive model to “identify families most in need of intervention” for their department of  Children, Youth and Families. These models create a risk score for children who might be at risk of violence and guide decisions about followup by government employees. In January 2018, three very different perspectives  were published widely about this system: a New York Times Magazine article, a chapter in Virginia Eubanks’s new book Automating Inequality (blog post here), and a new academic paper by the creators of the system that outlines the work they’ve done to make the system transparent and fair.

About the Technology and Society Reading Group

At the Center for Information Technology Policy, we bring together people with expertise in technology, engineering, policy, and the social sciences to ask those questions and develop new approaches together.  You can sign up for the mailing list here, where you will receive reading material for discussion before the event.

This semester, conversations will be organized by CITP fellows J. Nathan Matias and Ben Zevenbergen.

If you have ideas for a conversation you would like to see us discuss, please email Nathan () and Ben () with suggestions, and some references that you think might be interested

Time: every other Tuesday from 2pm-3pm, starting Feb 20th

How Data Science and Open Science are Transforming Research Ethics: Edward Freeland at CITP

How are data science and  open science movement transforming how researchers manage research ethics? And how are these changes influencing public trust in social research?


I’m here at the Center for IT Policy to hear a talk by Edward P. Freeland. Edward is the associate director of the Princeton University Survey Research Center and a lecturer at the Woodrow Wilson School of Public and International Affairs. Edward has been a member of Princeton’s Institutional Review Board since 2005 and currently serves as chair.

Edward starts out by telling us about about his family’s annual Christmas card. Every year, his family loses track of a few people, and he ends up having to try to track someone down. For several years, they sent the postcard to Ed’s wife’s cousin Billy to someone in Hartford CT, but it turns out that the address was not their cousin Billy but a retired neurosurgeon. To resolve this problem this year, Edward and his wife filled out more information about their family members into an app. Along the way, he learned just how much information about people is available on the internet. While technology makes it possible to keep track of family members more easily, some of that data might be more than people want to be known.

How does this relate to research ethics? Edward tells us about the principles that currently shape research ethics in the United States. These principles come from the 1978 Belmont Report, which was prompted in party by the Tuskeegee Syphilis Study, a horrifying medical study that ran for forty years. In the US, universities now have to do research focused on respect for persons, beneficence, and justice.

In practice, what do university ethics boards (IRBs) care about? Edward and his colleagues compiled a list of the issues that ethics boards into a single slide:

When it comes to privacy, what to university ethics boards care about? Federal regulations focus on any disclosure of the human subjects’ responses outside of the research and the risk that it would expose people to. In practice, the ethics board expects researchers to adopt procedural safeguards around who can access data and how it’s protected.

In the past, studies would basically conclude after the researchers publish the research. But the practice of research has been changing. Advocates of open science have worked to reduce fraud, prevent burying of unexpected results, enhance funder/taxpayer impact, strengthen, the integrity of scientific work, work through crowdsourcing or citizen science, and collaborate in new ways. Edward tells about the Open Science Collaboration, which tried in 2015 to replicate a hundred studies from across psychology, and who often failed to do so. Now others are trying to ask similar questions across other fields including cancer research.

In just a few years, the Center for Open Science has supported many researchers and journals to pre-register and publish the details of their research. Other organizations are also developing similar initiatives, such as clinicaltrials.gov.

Many in the open science movement suggest that researchers archive and share data, even after submitting a manuscript. Some people use a data sharing agreement to protect data used by others. Others prepare datafiles from their research for public use. But publishing data introduces privacy risks for participants in research. While US legislation HIPAA covers medical data, there aren’t authoritative norms or guidelines around sharing that data.

Many people turn to anonymization as a way to protect the information of people who participate in research. But does it really work? The landscape of data re-identification is changing from year to year, but the consensus is that anonymization doesn’t tend to work. As Matt Salganik points out in his book Bit By Bit, we should assume that all data are potentially identifiable and potentially sensitive. Where might we need to be concerned about potential problems?

  • People are sometimes recruited to join survey panels where they answer many questions over the years. Because this data is highly-dimensional, it may be very easy to re-identify people
  • Distributed anonymous workforces like Amazon Mechanical Turk also represent a privacy risk. The ID codes aren’t anonymous: you can google people’s IDs and find people’s comments on various Amazon products
  • Re-identification attacks, which draw together data from many sources to find someone, are becoming more common

Public Confidence in Science

How we treat people’s data affects public confidence in science– not only how people interpret what we learn, but also people’s likelihood to participate in research. Edward tells us that survey response rates have been dropping, even when surveys are conducted by the government. American society has always had a fringe movement of people who resisted government data collection. If those people gain access to the levers of power, they may be able to influence the government’s likelihood to collect data that could inform the public on important issues.

Edward tells us that very few people expect their data to be kept private and secure, according to research by Pew. When combined with declining trust in institutions, concerns about privacy may be one reason that fewer people are responding to surveys.

At the same time, many people are organizing to try to resist surveying by the US government. Some political and activist groups have been filming their interactions with survey collectors, harassing them, and claiming that researchers or the government have secret. As researchers try to uphold public trust by doing trustworthy, beneficial research, we need to be aware of the social and political forces that influence how people think about research.