
New Research on Privacy and Security Risks of Remote Learning Software

This post and the paper are jointly authored by Shaanan Cohney, Ross Teixeira, Anne Kohlbrenner, Arvind Narayanan, Mihir Kshirsagar, Yan Shvartzshnaider, and Madelyn Sanfilippo. It emerged from a case study at CITP’s tech policy clinic.

As universities rely on remote educational technology to facilitate the rapid shift to online learning, they expose themselves to new security risks and privacy violations. Our latest research paper, “Virtual Classrooms and Real Harms,” advances recommendations for universities and policymakers to protect the interests of students and educators.

The paper develops a threat model that describes the actors, incentives, and risks in online education. Our model is informed by our survey of 105 educators and 10 administrators who identified their expectations and concerns. We use the model to conduct a privacy and security analysis of 23 popular platforms using a combination of sociological analyses of privacy policies and 129 state laws (available here), alongside a technical assessment of platform software.

Figure: Our threat model diagrams typical remote learning data flows. An “appropriate” flow is informed by established educational norms; the flow marked “end-to-end encryption” represents data that is not ordinarily accessible to the platform.

In the physical classroom, there are educational norms and rules that prevent surreptitious recording of the classroom and automated extraction of data. But when classroom interactions shift to a digital platform, not only does data collection become much easier, but the social cues that discourage privacy harms also grow weaker, and participants are exposed to new security risks. Popular platforms, like Canvas, Piazza, and Slack, take advantage of this changed environment to act in ways that would be objectionable in the physical classroom—such as selling data about interactions to advertisers or other third parties. As a result, the established informational norms in the educational context are severely tested by remote learning software.

We analyze the privacy policies of 23 major platforms to find where those policies conflict with educational norms. For example, 41% of the policies permitted a platform to share data with advertisers, which conflicts with at least 21 state laws, while 23% allowed a platform to share location data. However, the privacy policies are not the only documents that shape platform practices. Universities use Data Protection Addenda (DPAs), negotiated as part of the institutional licenses they hold with a platform, to supplement or even supplant the default privacy policy. We reviewed 50 DPAs from 45 universities and found that the addenda led platforms to significantly shift their data practices, including adopting stricter limits on data retention and use.

We also discuss the limitations of current federal and state regulation to address the risks we identified. In particular, the current laws lack specific guidance for platforms and educational institutions to protect privacy and security and have limited penalties for noncompliance. More broadly, the existing legal framework is geared toward regulating specific information types and a small subset of actors, rather than specifying transmission principles for appropriate use that would be more durable as the technology evolves.

What can be done to better protect students and educators? We offer the following five recommendations:

  1. Educators should understand that there are significant differences between free (or individually licensed) versions of software and institutional versions. Universities need to work on informing educators about those differences and encouraging them to use institutionally supported software.
  2. Universities should use their ability to negotiate DPAs and institute policies to make platforms modify their default practices that are in tension with institutional values.
  3. Crucially, universities should not spend all their resources on a complex vetting process before licensing software. That path leads to significant usability problems for end users, without addressing the security and privacy concerns. Instead, universities should recognize that significant user issues tend to surface only after educators and students have used the platforms and create processes to collect those issues and have the software developers rapidly fix the problems.
  4. Universities should establish clear principles for how software should respect the norms of the educational context and require developers to offer products that let them customize the software for that setting.
  5. Federal and state regulations can be improved by making platforms more accountable for compliance with legal requirements, and giving institutions a mandate to require baseline security practices, much like financial institutions have to protect consumer information under the Federal Trade Commission’s Safeguards Rule.

The shift to virtual learning already requires many sacrifices from educators and students. As we integrate these new learning platforms into our educational systems, we should ensure they reflect established educational norms and do not require users to sacrifice usability, security, or privacy.

We thank the members of Remote Academia and the university administrators who participated in the study. Remote Academia is a global, Slack-based community that gives faculty and other education professionals a space to share resources and techniques for remote learning. It was created by Anne, Ross, and Shaanan.

How programmers communicate through code, legally

Computer programming, especially in source code, is an expressive form of communication. As such, U.S. law recognizes that communication in the form of source code is protected as freedom of speech by the First Amendment. Recently, Judge G. Murray Snow got this only two-thirds right in a ruling in the U.S. District Court in Arizona. In the case of CDK Global v. Brnovich, his denial of a motion to dismiss reads (in part),

It is well-established that “computer code, and computer programs constructed from code can merit First Amendment protection.” Universal City Studios, Inc. v. Corley, 273 F.3d 429, 449 (2d Cir. 2001); see also United States v. Elcom Ltd., 203 F. Supp. 2d 1111, 1127 (N.D. Cal. 2002) (“[c]omputer software is. . . speech that is protected at some level by the First Amendment”). However, not all code rises to the level of protected speech under the First Amendment. Corley, 273 F.3d at 449. Rather, there are “two ways in which a programmer might be said to communicate through code: to the user of the program (not necessarily protected) and to the computer (never protected).” Id. Further, even where code communicates to the user of a program, it still may not constitute protected speech under the First Amendment if it “commands ‘mechanically’ and ‘without the intercession of the mind or the will of the recipient,’” Id. (describing the holding of Commodity Futures Trading Comm’n v. Vartuli, 228 F.3d 94 (2d Cir. 2000)).

(emphasis added)

But there is a third way that programmers communicate through code, even more important (for First Amendment protection) than those two ways; and if Judge Snow had read two more sentences in the Corley opinion that he cites, he would have found it: the “third manner in which a programmer might communicate through code: to another programmer.” Id. Specifically,

Instructions such as computer code, which are intended to be executable by a computer, will often convey information capable of comprehension and assessment by a human being. A programmer reading a program learns information about instructing a computer, and might use this information to improve personal programming skills and perhaps the craft of programming. Moreover, programmers communicating ideas to one another almost inevitably communicate in code, much as musicians use notes. Limiting First Amendment protection of programmers to descriptions of computer code (but not the code itself) would impede discourse among computer scholars, just as limiting protection for musicians to descriptions of musical scores (but not sequences of notes) would impede their exchange of ideas and expression. Instructions that communicate information comprehensible to a human qualify as speech whether the instructions are designed for execution by a computer or a human (or both).

Corley, Id., at 448.

Indeed, when I teach software engineering to undergraduates, I make this very important point: your computer programs do not only execute on a computer, they must be readable and understandable to humans. A successful program will endure, and must be maintained by people who, perforce, will need to understand it. Elements of Programming Style are just as essential in coding as the Elements of Style are in other writing.
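
Here is a small, made-up illustration of that point (my own example, not anything from the Corley record). Both versions below give the computer the same instructions; only the second one says much to the next programmer who must maintain it:

    # A deliberately terse version: the computer executes it just fine,
    # but it communicates very little to a human reader.
    def f(xs):
        return sum(x for x in xs if x % 2 == 0)

    # The same instructions, written to be read by people as well as machines.
    def sum_of_even_numbers(numbers):
        """Return the sum of the even values in numbers."""
        return sum(n for n in numbers if n % 2 == 0)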

One cannot blame Judge Snow: his ruling is a very brief Order denying a motion to dismiss. And perhaps the Plaintiffs’ brief missed this point as well. Still, the law (the Corley precedent) is clear: There are at least three ways that source code communicates, and one of those ways is that people can read it and learn how it works.

Regulation of Dealer-Management Systems

Regarding the underlying case, CDK Global v. Mark Brnovich et al. and Arizona Automobile Dealers Association, it is not clear whether the First Amendment claims will outweigh the interests of the state in regulating commerce. The Court’s ruling summarizes the case,

Plaintiffs CDK Global LLC … develop, own, and operate proprietary computer systems known as dealer management systems (“DMSs”) that process vast amounts of data sourced from various parties. Automotive dealerships hold licenses to DMSs to help manage their business operations, including handling confidential consumer and proprietary data, processing transactions, and managing data communications between dealers, customers, car manufacturers, credit bureaus, and other third parties. Plaintiffs employ multiple technological measures—such as secure login credentials, CAPTCHA prompts, and comprehensive cybersecurity infrastructure, hardware, and software—to safeguard their DMS systems from unauthorized access or breach. Plaintiffs also contractually prohibit dealers from granting third parties access to their DMSs without Plaintiffs’ authorization.

In March 2019, the Arizona Legislature passed the Dealer Data Security Law (“the Dealer Law”). The Dealer Law regulates the relationship between DMS licensers like Plaintiffs and the dealerships they serve. Under the Dealer Law, DMS providers may no longer “[p]rohibit[] a third party [that has been authorized by the Dealer and] that has satisfied or is compliant with. . . current, applicable security standards published by the standards for technology in automotive retail [(STAR standards)]. . . from integrating into the dealer’s [DMS] or plac[e] an unreasonable restriction on integration. . . .” The Dealer Law also requires that DMS providers “[a]dopt and make available a standardized framework for the exchange, integration and sharing of data from [a DMS]” that is compatible with STAR standards and that they “[p]rovide access to open application programming interfaces to authorized integrators.” Finally, a DMS provider may only use data to the extent permitted in the DMS provider’s agreement with the dealer, must permit dealer termination of such agreement, and “must work to ensure a secure transition of all protected dealer data to a successor dealer data vendor or authorized integrator” upon termination. Ariz. Rev. Stat. Ann. §§ 28-4654(B)(1)-(3).

(internal citations omitted)

Plaintiffs argue that Arizona’s requirement to modify their software to permit interoperability is “compelled speech” (in the form of computer code that they must write), which is a violation of the First Amendment.

I am a bit skeptical of the Plaintiffs’ argument, because this case is not really about communication; it’s about operation. That is, in the CDK Global case, we need not analyze whether prior restraints on distribution of software would violate the First Amendment, because it’s not really about distribution of software. It’s about the execution of software by car dealers. Arizona is really saying, “if a car dealer uses DMS software in the course of selling cars, then the DMS software must interoperate in certain ways.” Operation of a computer program is not necessarily protected by the First Amendment, even if communication of a computer program to another person might be.

Gun plans as 3-d printer files

On the other hand, there are parallels between the Corley case (which was about restrictions on the distribution of software that would defeat copy-protection) and restrictions on the distribution of 3-d-printing files that would produce gun parts. Professor Eugene Volokh has analyzed “Three ways of thinking about [restrictions on distributing software in a First-Amendment context]: 1. Software is like hardware. 2. Software is like instruction manuals. 3. Alexa, read this book and make me a gun.”

Again, in this case, if a state regulates the operation of a 3-d printer (forbidding the production of gun parts) this may not conflict with the First Amendment (although it certainly relates to the Second Amendment); but regulating the communication of 3-d-printer files for gun parts may have First-Amendment implications.

Caution: I am not a lawyer. No warranty is implied on these legal opinions.

Did Sean Hannity misquote me?

Mostly, I was quoted accurately, although the segment confuses a few different Dominion voting systems with each other. And vulnerabilities are not the same as rigged elections, especially when we have paper ballots in almost all the states.

On November 13, 2020, Fox News aired a segment by Sean Hannity, “A deep dive into the voting machines at center of controversy”, in which he pointed out problems with Dominion voting machines in Michigan and Georgia. He quoted from my 2018 Freedom-to-Tinker article Design flaw in Dominion ImageCast Evolution voting machine and from my 2018 testimony before the House Subcommittee on Information Technology.

The quotes are accurate, although slightly out of context. The Dominion systems in Michigan and Georgia are not the ImageCast Evolution that has that design flaw. My Congressional testimony is that all voting machines can be hacked, and that’s true. My testimony about replacing the software in 7 minutes with a screwdriver refers to an older Dominion voting machine, used in New Jersey (though not this year because of the pandemic), but not used in Michigan and Georgia. But it’s still true that, one way or another, the software in any voting machine can be (fraudulently) replaced — in any voting machine used in any of the 50 states.

Regarding Antrim County, Michigan: Dominion’s election-management software is badly designed. When uploading results from a voting machine to the central server, the software keeps track of votes by ballot position, with no check on candidate name. So if there’s a last-minute revision to the ballot design used in the voting machine, but the ballot-design file on the server is not updated, then votes for Trump may be mistakenly uploaded as votes for Biden. Dominion calls that “human error.” I call it bad software design that fails to make consistency checks on its input. Fortunately, Antrim County has hand-marked paper ballots (counted by those Dominion optical-scan voting machines) that can be audited by hand, and other forms of paper trail, so Antrim County was able to correct its error and report accurate vote totals.
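
To make the missing check concrete, here is a rough sketch in Python (a hypothetical illustration only; it is not Dominion’s actual code, file formats, or field names). The first function tallies purely by ballot position, so a stale ballot-design file on the server silently credits votes to the wrong candidate; the second refuses to upload when the machine’s and the server’s designs disagree.

    # Hypothetical illustration only -- not any vendor's actual code or data formats.

    def tally_by_position(machine_counts, server_ballot_design):
        """Tally votes using ballot position alone.

        machine_counts: {ballot_position: vote_count} reported by the voting machine
        server_ballot_design: {ballot_position: candidate_name} on the central server

        If the server's ballot-design file is out of date, votes are silently
        credited to whichever candidate now occupies that position.
        """
        totals = {}
        for position, count in machine_counts.items():
            candidate = server_ballot_design[position]   # no cross-check on names
            totals[candidate] = totals.get(candidate, 0) + count
        return totals

    def tally_with_consistency_check(machine_counts, machine_ballot_design,
                                     server_ballot_design):
        """Same tally, but refuse to proceed if the machine's and the server's
        ballot designs disagree -- the consistency check on input that was missing."""
        if machine_ballot_design != server_ballot_design:
            raise ValueError("ballot design mismatch: refusing to upload results")
        return tally_by_position(machine_counts, server_ballot_design)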

Mr. Hannity proposes a solution: “If we want to have as a country, election results with integrity, that the people of this country will have confidence in, we can easily and absolutely have a system forensically checked–and by the way, I’ll even argue, allowing both Republican and Democratic engineers to do the forensic check together.”

That’s a well-intentioned idea, but it does not really solve the problem. Yes, absolutely the source code and software of voting machines should be made public so that citizens of any party can examine it for design mistakes. But what happens if the voting machine is hacked after that examination?

The U.S. mostly uses paper ballots now, and that’s how we can trust the election results even though there are some computer vulnerabilities.

The best solution is to use paper ballots, marked by hand, counted by computers, and recountable by hand. Those computers might be hacked, but the ballots personally marked by the voters are the same pieces of paper that can be recounted by humans. That’s what Michigan does, along with more than 40 other states. That is the state of the art: the most secure known way of conducting elections.

Georgia, on the other hand, uses touch-screen ballot-marking devices to mark the ballots, which are then counted by optical scanners and recountable by hand. If the optical scanners are hacked, then a recount will detect and correct the problem. But if the touch-screens are hacked, then (on a small fraction of the ballots) they can print the wrong vote on to the ballot. The recount can’t detect and correct that hack, because it can only see what’s printed on the ballots. Still, hacks and glitches in the election-management computers, in the optical scanners, and in other parts of the system have been detected and corrected by audits and examination of those paper ballots.
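
To see why hand-marked paper ballots are the crucial safeguard, here is a toy simulation (my own illustration with invented numbers, not data from any real election). A hacked scanner misreports hand-marked ballots, but a hand recount of the same paper recovers the true result; a hacked ballot-marking device prints some ballots wrong, and the recount can only confirm what was printed.

    import random

    random.seed(0)
    # Toy electorate: 10,000 voters, 52% of whom intend to vote for candidate A.
    intents = ["A" if random.random() < 0.52 else "B" for _ in range(10_000)]

    def flip_some_a_votes(ballots, fraction):
        """Fraudulently change a small fraction of A votes to B."""
        return ["B" if b == "A" and random.random() < fraction else b
                for b in ballots]

    # Case 1: hand-marked paper ballots, hacked optical scanner.
    # The scanner misreports, but the paper still records the voters' intent,
    # so a hand recount of the same paper recovers the true result.
    paper_hand_marked = list(intents)
    scanner_report = flip_some_a_votes(paper_hand_marked, 0.05)
    recount_hand_marked = paper_hand_marked

    # Case 2: hacked ballot-marking device.
    # A small fraction of ballots are printed wrong, so the paper itself is
    # wrong and a hand recount can only confirm what was printed.
    paper_bmd_printed = flip_some_a_votes(intents, 0.05)
    recount_bmd = paper_bmd_printed

    print("Votes intended for A:              ", intents.count("A"))
    print("Hacked scanner report for A:       ", scanner_report.count("A"))
    print("Hand recount, hand-marked ballots: ", recount_hand_marked.count("A"))
    print("Hand recount, BMD-printed ballots: ", recount_bmd.count("A"))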