June 19, 2018

New Jersey Takes Up Net Neutrality: A Summary, and My Experiences as a Witness

On Monday afternoon, I testified before the New Jersey State Assembly Committee on Science, Technology, and Innovation, which is chaired by Assemblyman Andrew Zwicker, who also happens to represent Princeton’s district.

On the committee agenda were three bills related to net neutrality.

Let’s quickly review the recent events. In December 2017, the Federal Communications Commission (FCC) rolled back the now-famous 2015 Open Internet Order, which required Internet service providers (ISPs) to abide by several so-called “bright line” rules: (1) no blocking lawful Internet traffic; (2) no throttling or degrading the performance of lawful Internet traffic; (3) no paid prioritization of one type of traffic over another; and (4) transparency about network management practices that may affect the forwarding of traffic.  In addition to these rules, the 2015 order also re-classified Internet service as a “Title II” telecommunications service—placing it under the jurisdiction of the FCC’s rulemaking authority—overturning the “Title I” information services classification that ISPs previously enjoyed.

The distinction between Title I and Title II classification is nuanced and complicated, as I’ve previously discussed. Re-classification of ISPs as a Title II service certainly comes with a host of complicated regulatory strings attached.  It also places the ISPs in a different regulatory regime than the content providers (e.g., Google, Facebook, Amazon, Netflix).

The rollback of the Open Internet Order reverted not only the classification of ISPs as a Title II service, but also the four “bright line” rules. In response, many states have recently been considering and enacting their own net neutrality legislation, including Washington, Oregon, California, and now New Jersey. Generally speaking, these state laws are far less complicated than the original FCC order: they typically re-instate the FCC’s bright-line rules, but entirely avoid the question of Title II classification.

Essentially, all three bills provide financial and other incentives for ISPs to abide by the bright line rules, by making adherence to those rules a condition for:

  1. securing any contract with the state government (which can often be a significant revenue source);
  2. gaining access to utility poles (which is necessary for deploying infrastructure);
  3. obtaining municipal consent (which is required to occupy a city’s right-of-way).

I testified at the hearing, and I also submitted written testimony, which you can read here. This was my first time testifying before a legislative committee; it was an interesting and rewarding experience.  Below, I’ll briefly summarize the hearing and my testimony (particularly in the context of the other witnesses), as well as my experience as a testifying witness (complete with some lessons learned).

My Testimony

Before I wrote my testimony, I thought hard about what a computer scientist with my expertise could bring to the table as a witness. I focused my testimony on three points:

  • No blocking and no throttling are technically simple to implement. One argument made by those opposed to the legislation is that rules on blocking and throttling could become exceedingly difficult to implement if each state has its own laws. In short, the argument is that state laws could create a complex regulatory “patchwork” that is burdensome to implement. If each state were considering its own version of the FCC’s several-hundred-page Open Internet Order, I might tend to agree. But the New Jersey bills are simple and concise: each is only a couple of pages long. The bills basically say “don’t block or throttle lawful content”, with clear carve-outs for illegal traffic, attack traffic, and so forth. My comments focused on the simplicity of implementation, and on the point that we need not fear a patchwork of laws if the default is a simple rule that prevents blocking or throttling. In my oral testimony, I added (mostly for color) that the Internet is already a patchwork of tens of thousands of independently operated networks spanning nearly every country, and that our protocols support carrying Internet traffic over a variety of physical media, from optical networks to wireless networks to carrier pigeon. I also took the opportunity to point out that ISPs are, in a relative sense, pretty good actors in this space right now, in contrast to some content providers, who have regularly blocked access to content either for anti-competitive reasons or as a condition for doing business in certain countries.
  • Prioritization can be useful for certain types of traffic, but it is distinct from paid prioritization. Some ISPs have recently argued that prohibiting paid prioritization would prohibit (among other things) the deployment of high-priority emergency services over the Internet. Of course, anyone who has taken an undergraduate networking course will have learned about prioritization (e.g., Weighted Fair Queueing), and how prioritization (and even shaping) can improve application performance by ensuring that interactive, delay-sensitive applications such as gaming are not queued behind lower-priority bulk transfers, such as a cloud backup (a toy scheduling sketch appears after this list). Yet, prioritization of certain classes of applications over others is a different matter from paid prioritization, whereby one customer might pay an ISP for higher priority over competing traffic. I discussed the differences at length. I also talked about how to think about prioritization and paid prioritization more generally: it’s not just about what a router does, but about who has access to what infrastructure. The bills address “prioritization” merely as a packet scheduling exercise—a router services one queue of packets at a faster rate than another queue. But there are plenty of other ways that some content can be made to “go faster” than other content; one example is deployment across a so-called Content Delivery Network (CDN)—a distributed network of content caches that are close to users. Some application or content providers may enjoy an unfair advantage (“priority”) over others merely by virtue of the infrastructure they have access to. Neither the repealed FCC rules nor the state bills say anything about this type of prioritization, which could be applied in anti-competitive ways. Finally, I talked about how prioritization is a bit of a red herring as long as there is spare capacity. Again, in an undergraduate networking course, we talk about resource allocation concepts such as max-min fairness, where every sender gets the capacity it requires as long as total capacity exceeds total demand. Thus, it is also important to ensure that ISPs and application providers continue to add capacity, both in their networks and at the interconnects between their networks.
  • Transparency is important for consumers, but figuring out exactly what ISPs should expose, in a way that’s meaningful to consumers and not unduly burdensome, is technically challenging. Consumers have a right to know about the service that they are purchasing from their ISP, as well as whether (and how well) that service can support different applications. Disclosure of network management practices and performance certainly makes good sense on the surface, but here the devil is in the details. An ISP could be very specific in its disclosure. It could say, for example, that it has deployed a token bucket filter with a certain size, fill rate, and drain rate, and detail the places in its network where such mechanisms are deployed (a toy token-bucket example also follows this list). This would constitute a disclosure of a network management practice, but it would be meaningless for consumers. On the other hand, other disclosures might be so vague as to be meaningless; a statement from the ISP that it might throttle certain types of high-volume traffic at times of high demand does little to help a consumer figure out how particular applications will perform. In this sense, paragraph 226 of the Restoring Internet Freedom Order, which talks about consumers’ need to understand how the network is delivering service for the applications that they care about, is spot on. There’s only one problem with that provision: technically, ISPs would have a hard time doing this without direct access to the client or server side of an application. In short: transparency is challenging. To be continued.
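
To make the packet-scheduling distinction concrete for non-networking readers, here is a minimal, purely illustrative sketch (not drawn from the bills or from any ISP’s implementation) of class-based weighted scheduling, a crude stand-in for mechanisms such as Weighted Fair Queueing: interactive traffic is drained ahead of bulk traffic according to per-class weights, regardless of who paid whom.

```python
# Toy illustration only (not any ISP's implementation): class-based weighted
# scheduling. Interactive traffic (e.g., gaming) is drained ahead of bulk
# traffic (e.g., cloud backup) whenever both classes have packets waiting.
from collections import deque

queues = {
    "interactive": deque(),  # delay-sensitive traffic
    "bulk": deque(),         # throughput-oriented traffic
}
# Weighted round-robin: serve up to 3 interactive packets per bulk packet.
weights = {"interactive": 3, "bulk": 1}

def enqueue(traffic_class, packet):
    queues[traffic_class].append(packet)

def schedule_round():
    """Drain packets for one scheduling round, honoring per-class weights."""
    sent = []
    for traffic_class, weight in weights.items():
        for _ in range(weight):
            if queues[traffic_class]:
                sent.append(queues[traffic_class].popleft())
    return sent

enqueue("bulk", "backup-chunk-1")
enqueue("interactive", "game-update-1")
enqueue("interactive", "game-update-2")
print(schedule_round())  # interactive packets go out first, then the bulk chunk
```

The weights here key on what the traffic needs (low delay versus high throughput), not on which customer paid the ISP; paid prioritization would amount to setting those weights based on payment rather than application class.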
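
And to illustrate why a literal disclosure of a “network management practice” can be technically accurate yet useless to a subscriber, here is a hypothetical token-bucket rate limiter (the parameters are made up): an ISP could disclose the bucket size and fill rate precisely and still tell consumers nothing about how their video calls will perform at peak times.

```python
import time

class TokenBucket:
    """Toy token-bucket rate limiter; the parameters below are hypothetical."""

    def __init__(self, fill_rate_tokens_per_sec, bucket_size):
        self.fill_rate = fill_rate_tokens_per_sec
        self.capacity = bucket_size
        self.tokens = bucket_size
        self.last = time.monotonic()

    def allow(self, packet_size_tokens=1):
        now = time.monotonic()
        # Refill tokens based on elapsed time, capped at the bucket size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        if self.tokens >= packet_size_tokens:
            self.tokens -= packet_size_tokens
            return True   # forward the packet
        return False      # drop (or delay) the packet

# An ISP could disclose "fill rate = 1000 tokens/s, bucket size = 5000 tokens"
# with complete accuracy, yet a subscriber still could not tell from that
# disclosure whether video calls or games will degrade at peak hours.
bucket = TokenBucket(fill_rate_tokens_per_sec=1000, bucket_size=5000)
print(bucket.allow(packet_size_tokens=1500))
```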

The Hearing and Vote

The hearing itself was interesting. Several witnesses testified in opposition to the bills, including Jon Leibowitz of Davis Polk (retained by Internet service providers) and a representative from US Telecom. The arguments against the bills were primarily legal and business-oriented. Essentially, the legal argument against the bills is that the states should leave this problem to the federal government. The arguments are (roughly) as follows: (1) the Restoring Internet Freedom Order pre-empts state-level net neutrality rules; (2) the Federal Trade Commission has this well in hand now that ISPs are back in Title I territory (and as a former commissioner, Leibowitz knows well the types of authority that the FTC has to bring such cases, as well as the many cases it has brought against Google, Facebook, and others); (3) state laws will create a patchwork of rules and introduce regulatory uncertainty, making it difficult for ISPs to operate efficiently and creating uncertainty for future investment.

The arguments in opposition to the bills were largely orthogonal to the points I made in my own testimony. In particular, I disclaimed any legal expertise on pre-emption. I was, however, able to comment on whether I thought the second and third arguments held water from a technical perspective. While the second point about FTC authority is mostly a legal question, I understood enough about the FTC Act, and the circumstances under which the FTC brings cases, to comment on whether the bills in question technically give consumers more power than they would otherwise have with just the FTC rules in place. My view was that they do, although this point is an interesting example of the muddy boundary between technology and the law: to really dive into the arguments here, it helps to know a bit about both. I was also able to comment on the “patchwork” assertion from a technical perspective, as I discussed above.

At the end of the hearing, there was a committee vote on all three bills. It was interesting to see both the voting process and the commentary that each committee member offered with their votes.  In the end, there were two abstentions, with the rest in favor. The members who abstained did so largely on the legal question concerning state pre-emption—perhaps foreshadowing the next round of legal battles.

Lessons Learned

Through this experience, I once again saw the value in having technologists at the table in these forums, where the laws that govern the future of the Internet are being written and decided on. I learned a couple of important lessons, which I’ve briefly summarized below.

My job was to bring technical clarity, not to advocate policy. As a witness, I was technically picking a side; in these settings, even when making technical points, one is typically doing so to serve one side of a policy or legal argument. Naturally, given my arguments, I registered as a witness in favor of the legislation.

However, and importantly: that doesn’t mean my job was to advocate policy.  As a technologist, my role as a witness was to explain to the lawmakers technical concepts that could help them make better sense of the various arguments from others in the room. I steered clear of rendering legal opinions, and where my comments did rely on legal frameworks, I made it clear that I was not an expert in those matters but was speaking on technical points within the context of the laws as I understood them.  Finally, when figuring out how to frame my testimony, I consulted many people: the lawmakers, my colleagues at Princeton, and even the ISPs themselves. In all cases, I asked these stakeholders about the topics I might focus on, as opposed to asking what, specifically, I should say. I thought hard about what a computer scientist could bring to the discussion, and about ensuring that what I said was technically accurate.

A simple technical explanation is of utmost importance. In such a committee hearing, advocates and lobbyists abound (on both sides); technologists are rare. I suspect I was the only technologist in the room. Most of the people in the room are there to make arguments that serve a particular stakeholder, and in doing so they may muddy the waters, either accidentally or intentionally. To advance their arguments, some people may even say things that are blatantly false (thankfully that didn’t happen on Monday, but I’ve seen it happen in similar forums). Perhaps surprisingly, such discourse can fly by completely unnoticed, because the people in the room—especially the decision-makers—don’t have as deep an understanding of the technology as the technologists.  Technologists need to be in the room, to shed light and to call foul—and, importantly, to do so using accessible language and examples that non-technical policy-makers can understand.

 

Software-Defined Networking: What’s New, and What’s New For Tech Policy?

The Silicon Flatirons Conference on Regulating Computing and Code is taking place in Boulder. The annual conference addresses a range of issues at the intersection of technology and policy and provides an excellent look ahead to the tech policy issues on the horizon, particularly in telecommunications.

I was looking forward to yesterday’s panel on “The Triumph of Software and Software-Defined Networks”, which had some good discussion on the ongoing problem surrounding security and privacy of the Internet of Things (IoT); some of the topics raised echoed points made on a Silicon Flatirons panel last year. My colleague and CITP director Ed Felten made some lucid, astute points about the implications of the “infiltration” of software into all of our devices.

Unfortunately, though (despite the moderator’s best efforts!), the panel lacked any discussion of the forthcoming policy issues concerning Software-Defined Networking (SDN), and I was concerned by some incorrect comments about SDN technology. Oddly, two panelists stated that Software-Defined Networking has offered “nothing new”. Here’s one paper that explains some of the new concepts that came from SDN (including the origins of those ideas), and another that talks about what’s to come as machine learning and automated decision-making begin to drive more aspects of network management. Vint Cerf corrected some of this discussion, pointing out one example of a fundamentally new capability: the rise of programmable hardware. One of the same panelists also said that SDN hasn’t seen any deployments in the wide-area Internet or at interconnection, a statement that has many counter-examples, including projects such as SDX (and the related multi-million dollar NSF program), Google’s Espresso and B4, and Facebook’s Edge Fabric, to name just a few of the public examples.

Some attendees commented that the panel could have discussed how SDN, when coupled with automated decision-making (“AI” in the parlance du jour) presents both new opportunities and challenges for policy. This post attempts to bring some of the issues at the intersection of SDN and policy to light. I address two main questions:

  1. What are the new technologies around SDN that people working in tech policy might want to know about?
  2. What are some interesting problems at the intersection of SDN and tech policy?

The first part of the post summarizes about 15 years of networking research in three paragraphs, in a form that policy and law scholars can hopefully digest; the second part offers some thoughts about new and interesting policy and legal questions—both opportunities and challenges—that these new technologies bring to bear.

SDN: What’s New in Technology?

Software-defined networking (SDN) describes a type of network design in which a software program, running separately from the underlying hardware routers and switches, controls how traffic is forwarded through the network. While in some sense one might think of this concept as “nothing new” (after all, network operators have been pushing configuration to routers with Perl scripts for decades), SDN brings several new twists to the table:

  1. The control of a collection of network devices from a single software program, written in a high-level programming language. The notion that many devices can be controlled from a single “controller” makes coordinated decisions across the network possible, as opposed to each router and switch being configured (and acting) essentially independently (a toy sketch of this controller pattern appears after this list). When we first presented this idea for Internet routing in the mid-2000s, it was highly controversial, with some even claiming that it was “failed phone company thinking” (after all, the Internet is “decentralized”; this centralized controller nonsense could only come from the idiots working for the phone company!). Needless to say, the idea is a bit less controversial now. These ideas have taken hold within the data center, in the wide area, and at interconnection points; technology such as the Software-Defined Internet Exchange Point (SDX) makes it possible for networks to exchange traffic only for specific applications (e.g., video streaming), for example, or to route traffic for different applications along different paths.
  2. The emergence of programmable hardware in network devices. Whereas conventional network devices relied on forwarding performed by fixed-function ASICs, the rise of companies such as Barefoot Networks has made it possible for network architects to customize forwarding behavior in the network. This capability is already being used to design network architectures with new measurement and forwarding capabilities, including the ability to get detailed timing information for individual packets as they traverse each network hop, as well as to scale native multicast to millions of hosts in a data center.
  3. The rise of automated decision making in network management (“AI Meets Networking”). For years, network operators have applied machine learning to conventional network security and provisioning problems, including the automated detection of spam, botnets, phishing attacks, bullet-proof web hosting, and so forth. Operators can also use machine learning to help answer complex “what if” performance questions, such as what would happen to web page load time or search response time if a server were moved from one region to another, or if new network capacity were deployed. Much of this work, however, has involved systems that perform detection offline (i.e., based on collected traces). Increasingly, with projects like Google’s Espresso and Facebook’s Edge Fabric, we’re starting to see systems that close the loop between measurement and control. It likely won’t be long before networks begin making these kinds of decisions based on even more complex inputs and inferences.
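
For readers without a networking background, the following toy sketch illustrates the controller pattern referenced in the list above. It is purely illustrative: the switch names, path table, and rule format are invented, and a real deployment would use a controller platform and a control protocol such as OpenFlow or P4Runtime rather than in-process method calls.

```python
# Purely illustrative controller logic. The shape of the idea is what matters:
# one program, a network-wide view, and forwarding rules pushed to many switches.

# A hypothetical network-wide view: which path each application class should take.
PATHS = {
    "video": ["sw1", "sw3", "sw5"],    # route streaming video over the fast path
    "default": ["sw1", "sw2", "sw4"],  # everything else takes the cheaper path
}

class Switch:
    def __init__(self, name):
        self.name = name
        self.rules = []

    def install_rule(self, match, action):
        # In a real deployment this would be a control-protocol message to hardware.
        self.rules.append((match, action))

class Controller:
    """A single program that programs every switch it manages."""

    def __init__(self, switches):
        self.switches = {sw.name: sw for sw in switches}

    def program_application_path(self, app, match):
        """Push forwarding rules for one application class along its chosen path."""
        path = PATHS.get(app, PATHS["default"])
        for hop, next_hop in zip(path, path[1:] + ["egress"]):
            self.switches[hop].install_rule(match, f"forward to {next_hop}")

switches = [Switch(name) for name in ("sw1", "sw2", "sw3", "sw4", "sw5")]
controller = Controller(switches)
controller.program_application_path("video", match="dst prefix of video CDN")
print(switches[0].rules)  # sw1 now forwards matching video traffic toward sw3
```

The point is structural: one program holds a network-wide view and programs many devices, which is what makes coordinated, application-aware decisions (such as SDX-style routing of video along a different path) possible.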

SDN: What’s New for Tech Policy?

The new capabilities that SDN offers present a range of potentially challenging questions at the intersection of technology, policy, and law.  I’ve listed a few of these interesting possibilities below:

  • Service Level Agreements. A common contractual instrument for Internet Service Providers (ISPs) is the Service Level Agreement (SLA). SLAs typically involve guarantees about network performance: packet loss will never exceed a certain amount, latency will always be less than a certain amount, and so forth. SDN presents both new opportunities and challenges for SLAs. On the opportunity side, SDN allows operators to define much more complex traffic forwarding behavior—sending traffic along different paths according to destination, application, or even the conditions of individual links along an end-to-end path at a particular time.

    Yet, the opportunity to create these types of complex SLAs also presents challenges. When all routing and forwarding decisions become automated, and all interconnects look like Google’s Espresso, where an algorithm is effectively making decisions about where to forward traffic (perhaps based on a huge list of features ranging from application QoE to estimates of user attention, incorporated into an inscrutable “deep learning” model), how does one go about making sure the SLA continues to be enforced? What new challenges and opportunities do the new capabilities of programmable measurement bring for SLAs? Some aspects of SLAs are notoriously difficult to enforce today.

    Not only will complex SLAs become easier to define; the rise of programmable measurement and advances in network telemetry will also make SLAs easier for customers to validate (a minimal sketch of such a validation check appears after this list). There are huge opportunities in the validation of SLAs, and once SLAs become easier to audit, a whole new set of legal and policy questions will arise.

  • Network Neutrality. Although the Federal Communications Commission’s (FCC’s) release of the Restoring Internet Freedom Order earlier this year effectively rescinds many of the “bright line rules” that we have come to associate with net neutrality (i.e., no blocking, no throttling, no paid prioritization), the order actually leaves in place many transparency requirements, obligating ISPs to disclose any practices relevant to blocking, throttling, prioritization, congestion management, application-specific behavior, and security. As with SLA definition and enforcement, network neutrality—and particularly the transparency rule—may become more interesting as SDN makes it possible for network operators to automate network decision-making and for consumers to audit some of the disclosures (or lack thereof) from ISPs. SDX allows networks to make decisions about interconnection, routing, and prioritization based on specific applications, which creates new traffic management capabilities that raise interesting questions in the context of net neutrality: which of these new capabilities would fall under an exception for “reasonable network management practices”, and which might be viewed as discriminatory?

    Additionally, the automation of network management may make it increasingly difficult for operators to figure out what is going on (or why), and some forwarding decisions may be harder to understand or explain if they are driven by a complex feature set and fully automated. Figuring out what “transparency” even means in the context of a fully automated network is a rich area for exploration at the intersection of network technology and telecommunications policy. Even rules seemingly as simple as “no blocking” get complicated when networks begin automating the mitigation of attack traffic, or when content platforms begin automating takedown requests. Does every single traffic flow that is blocked by a network intrusion detection system need to be disclosed? How can ISPs best disclose the decision-making process for each blocking decision, particularly when (1) the algorithm or set of features may be difficult to explain or understand, or (2) doing so might aid those who aim to circumvent these network defenses?
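
As a rough sketch of what customer-side SLA validation could look like (the thresholds and measurements below are hypothetical, not drawn from any real contract), a customer or auditor could compare its own active measurements against the contracted bounds and flag the windows in which the SLA appears to be violated:

```python
# Hypothetical SLA terms and customer-side measurements, for illustration only.
SLA = {"max_latency_ms": 50.0, "max_loss_rate": 0.001}

# Per-minute measurements a customer or auditor might collect with its own probes:
# (window label, median latency in ms, packet loss rate)
measurements = [
    ("12:00", 23.0, 0.0000),
    ("12:01", 61.5, 0.0004),  # latency above the contracted bound
    ("12:02", 34.2, 0.0031),  # loss above the contracted bound
]

def sla_violations(samples, sla):
    """Return the measurement windows that appear to violate the SLA."""
    violations = []
    for window, latency_ms, loss_rate in samples:
        if latency_ms > sla["max_latency_ms"] or loss_rate > sla["max_loss_rate"]:
            violations.append(window)
    return violations

print(sla_violations(measurements, SLA))  # ['12:01', '12:02']
```

The hard policy questions begin exactly where this sketch ends: what counts as a valid measurement, who gets to collect it, and what happens when the operator’s automated control loop and the customer’s measurements disagree.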

Virtualization: A Topic in Its Own Right. The panel moderator also asked a few times about policy and legal issues that arise as a result of virtualization. This is a fantastic question that deserves more attention. It’s worth pointing out the distinction between SDN (which separates network “control plane” software from “data plane” routers and devices) and virtualization (which involves creating virtual server and network topologies on a shared underlying physical network). In short, SDN enables virtualization, but the two are distinct technologies. Nonetheless, virtualization presents many interesting issues at the intersection of technology and policy in its own right. For one, the rise of Infrastructure as a Service (IaaS) providers such as Amazon Web Services, as well as other multi-tenant data centers, introduces questions such as how to enforce SLAs when isolation is imperfect, and how IaaS providers can be stewards of potentially private data that may be subject to takedown requests, subpoenas, and other actions by law enforcement and other third parties. The forthcoming Supreme Court case, Microsoft vs. United States, concerns law enforcement access to data stored abroad: does the data actually live overseas, or is this merely a side effect of global, virtualized data centers? As virtualization is a distinct topic from SDN, the policy issues it presents warrant a separate (future) post.

Summary. In summary, SDN is far from old news, and the policy questions that these new technologies bring to bear are deeply complex and deserve a careful eye from experts at the intersection of policy, law, and technology. We should start these conversations now.

SESTA May Encourage the Adoption of Broken Automated Filtering Technologies

The Senate is currently considering the Stop Enabling Sex Traffickers Act (SESTA, S. 1693), with a hearing scheduled for tomorrow. In brief, the proposed legislation threatens to roll back aspects of Section 230 of the Communications Decency Act (CDA), which relieve content providers, or so-called “intermediaries” (e.g., Google, Facebook, Twitter), of liability for the content that is hosted on their platforms. Section 230 protects these platforms from liability in federal civil and state courts for the activities of their customers.

One corollary of SESTA is that, with increased liability, content providers might feel compelled to rely more heavily on automated classification filters and algorithms to detect and eliminate unwanted content on the Internet. Having spent more than ten years developing these types of classifiers to detect “unwanted traffic” ranging from spam to phishing attacks to botnets, I am deeply familiar with the potential—and the limitations—of automated filtering algorithms for identifying such content. Existing algorithms can be effective for detecting and predicting certain types of unwanted traffic—most notably, attack traffic—but current approaches fall far short of being able to reliably detect unwanted or illegal speech.

Content filters are inaccurate. Notably, the oft-referenced technologies for detecting illegal speech or imagery (e.g., PhotoDNA, EchoPrint) rely on matching content that is posted online against a corpus of known illegal material (e.g., text, audio, imagery). Unfortunately, because these technologies rely on analyzing the content of the posted material, both false positives (i.e., mistakenly identifying innocuous content as illegal) and false negatives (i.e., failing to detect illegal content entirely) are possible. The network security community has been through this scenario before, in the context of spam filtering: years ago, spam filters would analyze the text of messages to determine whether a particular message was legitimate or spam; it wasn’t long before spammers developed tens of thousands of ways to spell “Rolex” and “Viagra” to evade these filters. They also came up with other creative ways to evade them, by stuffing messages with Shakespeare and delivering their payloads in a variety of formats, ranging from compressed audio to images to spreadsheets.

In short, content-based filters have largely failed to keep up with the agility of spammers.  Evaluation of EchoPrint, for example, suggests that its false positive rates are far too high for use in an automated filtering context: depending on the length of the file and the type of encoding, error rates are around 1–2%, where an error can be either a false negative or a false positive. By contrast, when we were working on spam filters, our discussions with online email service providers suggested that any false positive rate exceeding 0.01% would be far too high, because of the free speech questions and concerns it would raise. In other words, some of the existing automated fingerprinting services that providers might rely on as a result of SESTA have false positive rates that are orders of magnitude greater than what might otherwise be considered acceptable. We have written extensively about the limitations of these automated filters in the context of copyright.
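
To put those two error rates side by side, here is a back-of-the-envelope comparison (the daily volume of legitimate posts is a made-up round number, used only to make the scale visible):

```python
# Back-of-the-envelope comparison; the daily post volume is a made-up round number.
legitimate_posts_per_day = 1_000_000

for label, false_positive_rate in [("content fingerprinting (~1%)", 0.01),
                                   ("spam-filter target (0.01%)", 0.0001)]:
    wrongly_flagged = legitimate_posts_per_day * false_positive_rate
    print(f"{label}: about {wrongly_flagged:,.0f} legitimate posts flagged per day")

# content fingerprinting (~1%): about 10,000 legitimate posts flagged per day
# spam-filter target (0.01%): about 100 legitimate posts flagged per day
```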

Content filters cannot identify context. Similarly, users who post content online today have many tools at their disposal to evade relatively brittle content-based filters. Detecting unwanted or illegal content on intermediary platforms is even more challenging. Instead of simply classifying unwanted email traffic such as spam (which is typically readily apparent, with links to advertisers, phishing sites, and so forth), the challenge on intermediary platforms entails detecting copyright infringement, hate speech, terrorist speech, sex trafficking, and so forth. Yet, simply detecting that a post matches content in a database cannot evaluate considerations such as fair use, parody, or coercion. Relying on automated content filters will not only produce mistakes in classifying content; these filters also have no hope of classifying context.

A possible step forward: Classifiers based on network traffic and sending patterns. About ten years ago, we recognized the failure of content filters and began exploring how network traffic patterns might produce a stronger signal for illicit activity. We observed that while it was fairly easy for a spammer to change the content of a message, it was potentially much more costly for a spammer to change sending patterns, such as email volumes and where messages were originating from and going to. We devised classifiers for email traffic that relied on “network-level features” and that now form the basis of many online spam filters. I think there are several grand challenges ahead in determining whether similar approaches could be used to identify unwanted or illegal posts on intermediary content platforms. For example, it might be the case that certain types of illegal speech are characterized by high volumes of re-tweets, short reply times, many instances of repeated messages, or some other feature that is characteristic of the traffic or the accounts that post those messages.
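
As a sketch of what such an approach might look like on an intermediary platform, here is a toy classifier trained on behavioral features rather than content. The features, data, and labels are entirely invented for illustration; whether features like these actually carry signal for any particular category of illegal content is precisely the open research question.

```python
# Hypothetical sketch: classify posting behavior using "network-level" features
# rather than content. The features, data, and labels are invented for illustration.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Each row: [posts_per_hour, fraction_repeated_messages, median_reply_time_sec,
#            account_age_days, distinct_recipients_per_day]
X = [
    [120, 0.90, 2.0, 3, 4000],   # burst of near-identical messages from a new account
    [150, 0.95, 1.5, 1, 6000],
    [2, 0.05, 600.0, 900, 15],   # typical long-lived account
    [5, 0.10, 300.0, 1200, 30],
    [90, 0.80, 3.0, 10, 2500],
    [1, 0.00, 1200.0, 2000, 5],
]
y = [1, 1, 0, 0, 1, 0]           # 1 = suspicious posting behavior, 0 = benign

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,
                                                    random_state=0)
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # accuracy on a tiny, toy held-out set
```

The point of the sketch is only that the inputs are behavioral rather than content-based; building a classifier that is accurate enough to act on, at platform scale, is the hard part.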

Unfortunately, the reality is that we are far from having automated filtering technology that can reliably detect a wide range of illegal content; determining how and whether various types of illegal content could be identified remains an open research problem. To suggest that “Any start-up has access to low cost and virtually unlimited computing power and to advanced analytics, artificial intelligence and filtering software”—a comment made in a recent letter to Congress on the question of SESTA—vastly overstates the current state of the art. A potentially unwanted side effect of SESTA is that, for fear of liability, intermediaries might feel compelled to deploy these imperfect technologies on their platforms—potentially over-blocking legal, legitimate content while failing to effectively deter or prevent the illegal speech that can easily evade today’s content-based filters.

Summary: Automated filters are not “there yet”. Automated filters today often do little more than match posts against known offending content, and such content-based filters are easily evaded. An interesting question is whether other “signals”, such as network traffic and posting patterns, or other characteristics of user accounts (e.g., age of account, number and characteristics of followers), might help identify illegal content of various types. But much research is needed before we can comfortably say that these algorithms are even remotely effective at curbing illegal speech. And even as we work to improve these automated fingerprinting and filtering technologies, they will likely remain, at best, an aid that intermediaries may opt to use; I cannot foresee false positive rates ever reaching zero. By no means should we require intermediaries to use these algorithms in the hope that doing so will eliminate all illegal speech; doing so would undoubtedly also curb legal and legitimate speech.