October 13, 2024

FBI's Spyware Program

Note: I worked for the Department of Justice’s Computer Crime and Intellectual Property Section (CCIPS) from 2001 to 2005. The documents discussed below mention a memo written by somebody at CCIPS during the time I worked there, but absolutely everything I say below reflects only my personal thoughts and impressions about the documents released to the public today.

Two years ago, Kevin Poulsen broke the news that the FBI had successfully deployed spyware to help catch a student sending death threats to his high school. The FBI calls the tool a CIPAV, for “computer and internet protocol address verifier.”

We learned today that Kevin filed a Freedom of Information Act request (along with EFF and CNet News) asking for other information about CIPAVs. The FBI has responded, Kevin made the 152 pages available, and I just spent the past half hour skimming them.

Here are some unorganized impressions:

  • The 152 pages don’t take long to read, because they have been so heavily redacted. The vast majority of the pages have no substantive content at all.
  • Page one may be the most interesting page. Someone at CCIPS, my old unit, cautions that “While the technique is of indisputable value in certain kinds of cases, we are seeing indications that it is being used needlessly by some agencies, unnecessarily raising difficult legal questions (and a risk of suppression) without any countervailing benefit.”
  • On page 152, the FBI’s Cryptographic and Electronic Analysis Unit (CEAU) “advised Pittsburgh that they could assist with a wireless hack to obtain a file tree, but not the hard drive content.” This is fascinating on several levels. First, what wireless hack? The spyware techniques described in Poulsen’s reporting are deployed when a target is unlocatable, and the FBI tricks him or her into clicking a link. How does wireless enter the picture? Don’t you need to be physically proximate to your target to hack them wirelessly? Second, why could CEAU “assist . . . to obtain a file tree, but not the hard drive content”? That smells like a legal constraint, not a technical one. Maybe some lawyer was making distinctions based on probable cause?
  • On page 86, the page summarizing the FBI’s Special Technologies and Applications Office (STAO) response to the FOIA request, STAO responds that they have included an “electronic copy of ‘Magic Quadrant for Information Access Technology’” on CD-ROM. Is that referring to this Gartner publication, and if so, what does this have to do with the FOIA request? I’m hoping one of the uber geeks reading this blog can tie FBI spyware to this phrase.
  • Pages 64-80 contain the affidavit written to justify the use of the CIPAV in the high school threat case. I had seen these back when Kevin first wrote about them, but if you haven’t seen them yet, you should read them.
  • It definitely appears that the FBI is obtaining search warrants before installing CIPAVs. Although this is probably enough to justify grabbing IP addresses and information packed in the Windows registry, it probably is not enough alone to justify tracing IP addresses in real time. The FBI probably needs a pen register/trap and trace order in addition to the warrant to do that under 18 U.S.C. § 3123. Although pen registers are mentioned a few times in these documents–particularly in the affidavit mentioned above–many of the documents simply say “warrant.” This is probably not of great consequence, because if the FBI has probable cause to deploy one of these, they can almost certainly justify a pen register order, but why are they being so sloppy?

Two final notes: First, I twittered my present sense impressions while reading the documents, which was an interesting experiment for me, if not for those following me. If you want to follow me, visit my profile.

Second, if you see anything else in the documents that bears scrutiny, please leave a note in the comments of this post.

On open source vs. disclosed source voting systems

Sometimes, working on voting seems like running on a treadmill. Old disagreements need to be argued again and again. As long as I’ve been speaking in public about voting, I’ve discussed the need for voting systems’ source code to be published, as in a book, to create transparency into how the systems operate. Or, put another way, trade secrecy is anathema to election transparency. We, the people, have an expectation that our election policies and procedures are open to scrutiny, and that critical scrutiny is essential to the exercise of our democracy. (Cue the waving flags.)

On Tuesday, the Election Technology Council (a trade association of four major American voting system manufacturers) put out a white paper on open-source and voting systems. It’s nice to see them finally talking about the issue, but there’s a distinctive cluelessness in this paper about what, exactly, open source is and what it means for a system to be secure. For example, in a sidebar titled “Disclosed vs. Open: Clarifying Misconceptions”, the report states:

… taking a software product that was once proprietary and disclosing its full source code to the general public will result in a complete forfeiture of the software’s security … Although computer scientists chafe at the thought of “security through obscurity,” there remains some underlying truths to the idea that software does maintain a level of security through the lack of available public knowledge of the inner workings of a software program.

Really? No. Disclosing the source code only results in a complete forfeiture of the software’s security if there was never any security there in the first place. If the product is well-engineered, then disclosing the software will cause no additional security problems. If the product is poorly engineered, then the lack of disclosure only serves the purpose of delaying the inevitable.

What we learned from the California Top-to-Bottom Review and the Ohio EVEREST study was that, indeed, these systems are unquestionably and unconscionably insecure. The authors of those reports (including yours truly) read the source code, which certainly made it easier to identify just how bad these systems were, but it’s fallacious to assume that a prospective attacker, lacking the source code and even lacking our reports, is somehow any less able to identify and exploit the flaws. The wide diversity of security flaws exploited on a regular basis in Microsoft Windows completely undercuts the ETC paper’s argument. The bad guys who build these attacks have no access to Windows’s source code, but they don’t need it. With common debugging tools (as well as customized attacking tools), they can tease apart the operation of the compiled, executable binary applications and engineer all sorts of malware.
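The point that compilation is not concealment can be illustrated even in miniature. The sketch below is my own illustration, in Python rather than native machine code: it recovers a hard-coded secret from a function’s compiled bytecode without ever consulting the source, which is (in spirit) exactly what attackers do to Windows binaries with ordinary disassemblers and debuggers.

```python
import dis

# A toy "compiled program" with a secret buried in it. In a real
# attack the target would be a native binary, not Python bytecode,
# but the principle is the same: the compiled artifact still
# reveals its inner workings to anyone who looks.
def check_password(guess):
    return guess == "s3cret"

# Walk the compiled bytecode and pull out every constant it loads.
# Only the compiled object is examined, never the source text.
constants = [
    instr.argval
    for instr in dis.get_instructions(check_password)
    if instr.opname == "LOAD_CONST"
]
print(constants)  # the hard-coded secret is plainly visible
```

The same exercise against a stripped native executable takes more effort and different tools, but no source code.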

Voting systems, in this regard, are just like Microsoft Windows. We have to assume, since voting machines are widely dispersed around the country, that attackers will have the opportunity to tear them apart and extract the machine code. Therefore, it’s fair to argue that source disclosure, or the lack thereof, has no meaningful impact on the operational security of our electronic voting machines. They’re broken. They need to be repaired.

The ETC paper also seems to confuse disclosed source (published, as in a book) with open source (e.g., under a GPL or BSD license). For years, I’ve been suggesting that the former would be a good thing, and I haven’t taken a strong position on the latter. Even further, the ETC paper seems to assume that open source projects are necessarily driven by volunteer labor, rather than by companies. See, for example:

… if proprietary software is ripped open through legislative fiat, whatever security features exist are completely lost until such time that the process improvement model envisioned by the open source community has an opportunity to take place (Hall 2007).

There are plenty of open-source projects that are centrally maintained by commercial companies with standard, commercial development processes. There’s no intrinsic reason that software source code disclosure or open licensing imposes any requirements on the development model. And, just because software is suddenly open, there’s indeed no guarantee that a community of developers will magically appear and start making improvements. Successful open source communities arise when distinct developers or organizations share a common need.

Before I go on, I’ll note that the ETC report has cherry-picked citations to support its cause, and those citations are used neither honestly nor completely. The above citation to Joe Hall’s 2007 EVT paper distorts Hall’s opinions. His actual paper, which surveys 55 contracts between voting system manufacturers and the jurisdictions that buy them, makes very specific recommendations, including that these contracts should allow for source code review in the pre-election, post-election, and litigation stages of the election cycle. Hall is arguing in favor of source code disclosure, yet the citation to his paper would seem to have him arguing against it!

So, how would open source (or disclosed source) work in the commercial voting machine industry? The ETC paper suggests that it might be difficult to get an open-source project off the ground with distributed development by volunteers. This is perfectly believable. Consequently, that’s not how it would ever work. As I said above, I’ve always advocated for disclosure, but let’s think through how a genuine open-source voting machine might succeed. A likely model is that a state, coalition of states, or even the Federal government would need to step in to fund the development of these systems. The development organization would most likely be a non-profit company, along the lines of the many Federally Funded Research and Development Centers (FFRDCs) already in existence. Our new voting FFRDC, presumably sponsored by the EAC, would develop the source code and get it certified. It would also standardize the hardware interface, allowing multiple vendors to build compatible hardware. Because the code would be open, these hardware vendors would be free to make enhancements or to write device drivers, which would then go back to the FFRDC for integration and testing. (The voting FFRDC wouldn’t try to take the code to existing voting systems, so there’s no worry about stealing their IP. It’s easier to just start from scratch.) My hypothetical FFRDC model isn’t too far away from how Linux is managed, or even how Microsoft manages Windows, so this isn’t exactly science fiction.

The ETC paper asks who would develop new features as might be required by a customer and suggests that the “lack of a clear line of accountability for maintaining an open source project” would hinder this process. In my hypothetical FFRDC model, the customer could commission their own programmers to develop the enhancement and contribute this back to the FFRDC for integration. The customer could also directly commission the FFRDC or any other third-party to develop something that suits their needs. They could test it locally in mock elections, but ultimately their changes would need to pass muster with the FFRDC and the still-independent certification and testing authorities. (Or, the FFRDC could bring the testing/certification function in-house or NIST could be funded to do it. That’s a topic for another day.) And, of course, other countries would be free to adopt our hardware and customize our software for their own needs.

Unfortunately, such an FFRDC structure seems unlikely to occur in the immediate future. Who’s going to pay for it? Like it or not, we’re going to be stuck with the present voting system manufacturers and their poorly engineered products for a while. The challenge is to keep clarity on what’s necessary to improve their security engineering. By requiring source code disclosure, we improve election transparency, and we can keep pressure on the vendors to improve their systems. If the security flaws found two years ago in the California and Ohio studies haven’t been properly fixed by one vendor while another is making progress, that progress will be visible and we can recommend that the slow vendor be dropped.

A secondary challenge is to counter the sort of mischaracterizations that are still, sadly, endemic from the voting system industry. Consider this quote:

If policymakers attempt to strip the intellectual property from voting system software, it raises two important areas of concern. The first is the issue of property takings without due process and compensation which is prohibited under the United States Constitution. The second area of concern is one of security. The potential for future gains with software security will be lost in the short-term until such time that an adequate product improvement model is incorporated. Without a process improvement model in place, any security features present in current software would be lost. At the same time, the market incentives for operating and supporting voting products would be eliminated.

For starters, requiring the disclosure of source code does not represent any sort of “taking” of the code. Vendors would still own copyrights to their code. Furthermore, they may still avail themselves of the patent system to protect their intellectual property. Their only loss would be of the trade secret privilege.

And again, we’ve got the bogus security argument combined with some weird process improvement model business. Nobody says that disclosing your source requires you to change your process. Instead, the undisputed fact that these vendors’ systems are poorly engineered requires them to improve their processes (and what have they been doing for the past two years?), which would be necessary regardless of whether the source code is publicly disclosed.

Last but not least, it’s important to refute one more argument:

Public oversight is arguably just as diminished in an open source environment since the layperson is unable to read and understand software source code adequately enough to ensure total access and comprehension. … However, effective oversight does not need to be predicated on the removal of intellectual property protections. Providing global access to current proprietary software would undermine the principles of intellectual property and severely damage the viability of the current marketplace.

Nobody has ever suggested that election transparency requires the layperson to be able to understand the source code. Rather, it requires that the layperson be able to trust their newspaper, or political party, or Consumer Reports, or the League of Women Voters to retain their own experts and reach their own conclusions.

As to the “principles of intellectual property”, the ETC paper conflates and confuses copyright, patent, and trade secrets. Any sober analysis must consider these distinctly. As to the “viability of the current marketplace”, the market demands products that are meaningfully secure, usable, reliable, and affordable. So long as the present vendors fail on one or more of these counts, their markets will suffer.

Update: Gordon Haff chimes in at CNET, observing that the ETC’s misconceptions about how open source development actually works are far from atypical.

Thoughts on juries for intellectual property lawsuits

Here’s a thought that’s been stuck in my head for the past few days. It would never be practical, but it’s an interesting idea to ponder. David Robinson tells me I’m not the first one to have this idea, either, but anyway…

Consider what happens in intellectual property lawsuits, particularly concerning infringement of patents or misappropriation of trade secrets. Ultimately, a jury is being asked to rule on essential questions like whether a product meets all the limitations of a patent’s claims, or whether a given trade secret was already known to the public. How does the jury reach a verdict? They’re presented with evidence and with testimony from experts for the plaintiff and experts for the defendant. The jurors then have to sort out whose arguments they find most persuasive. (Of course, a juror who doesn’t follow the technical details could well favor an expert whom they find more personable, or better able to handle the pressure of a hostile cross-examination.)

One key issue in many patent cases is the interpretation of particular words in the patent. If they’re interpreted narrowly, then the accused product doesn’t infringe, because it doesn’t have the specific required feature. Conversely, if the claims are interpreted broadly enough for the accused product to infringe the patent, then the prior art to the patent might also land within the broader scope of the claims, thus rendering the patent invalid as either anticipated by or rendered obvious by the prior art. Even though the court will construe the claims in its Markman ruling, there’s often still plenty of room for argument. How, then, does the jury sort out the breadth of the terms of a patent? Again, they watch dueling experts, dueling attorneys, and so forth, and then reach their own conclusions.

What’s missing from this game is a person having ordinary skill in the art at the time of the invention (PHOSITA). One of the jobs of an expert is to interpret the claims of a patent from the perspective of a PHOSITA. Our hypothetical PHOSITA’s perspective is also essential to understanding how obvious a patent’s invention is relative to the prior art. The problem I want to discuss today is that in most cases, nobody on the jury is a PHOSITA or anywhere close. What would happen if they were?

A hypothetical jury of PHOSITAs would be better equipped to read the patent themselves and directly answer questions that are presently left for experts to argue. Does this patent actually enable a PHOSITA to build the gadget (i.e., to “practice the invention”)? Would the patent in question be obvious given a description of the prior art at the time? Or, say in a trade secret case, is the accused secret something that’s actually well-known? With a PHOSITA jury, they could reason about these questions from their own perspective. Imagine, in a software-related case, being able to put source code in front of a jury and have them be able to read it independently. This idea effectively rethinks the concept of a jury of one’s peers. What if juries on technical cases were “peers” with the technology that’s on trial? It would completely change the game.

This idea would never fly for a variety of reasons. First and foremost, good luck finding enough people with the right skill sets and lacking any conflict of interest. Even if our court system had enough data on the citizenry to be able to identify suitable jury candidates (oh, the privacy concerns!), some courts’ jurisdictions simply don’t have enough citizens with the necessary skills and lack of conflicts. What would you do? Move the lawsuit to a different jurisdiction? How many parts of the country have a critical mass of engineers/scientists with the necessary skills? Furthermore, a lot of the wrangling in a lawsuit boils down to controlling what information is and is not presented to the jury. If the jury shows up with their own knowledge, they may reach their own conclusions based on that knowledge, and that’s something that many lawyers and courts would find undesirable because they couldn’t control it.

Related discussion shows up in a recent blog post by Julian Sanchez and a followup by Eric Rescorla. Sanchez’s thesis is that it’s much easier to make a scientific argument that sounds plausible, while being completely bogus, than it is to refute such an argument, because the refutation could well require building up an explanation of the relevant scientific background. He’s talking about climate change scientists vs. deniers or about biologists refuting “intelligent design” advocates, but the core of the argument is perfectly applicable here. A PHOSITA jury would have a better chance of seeing through bogus arguments, and consequently they would be more likely to reach a sound verdict.

Fascinating New Blog: ComputationalLegalStudies.com

I was inspired to post the essay I discussed in the prior post by the debut of the best new law blog I have seen in a long time, Computational Legal Studies, featuring the work of Daniel Katz and Michael Bommarito, both graduate students in the University of Michigan’s political science department.

Every single post on their blog has caused me to smack my head once for not having thought of the idea first, and a second time for not having their datasets and skillz. Their visualization of who has gotten TARP funds and how they’re connected to legislators deserves to be printed on posters and hung up in newsrooms across the country (not to mention in offices on Capitol Hill). They’ve also shown good taste by building a bridge to this blog, linking favorably back to the great CITP work led by David Robinson on government openness.

I will have more to say about Dan and Mike’s new blog in the weeks and months to come, but for now it is enough to welcome them to the blogosphere.

Computer Programming and the Law: A New Research Agenda

By my best estimate, at least twenty different law professors on the tenure track at American law schools once held a job as a professional computer programmer. I am proud to say that two of us work at my law school.

Most of these hyphenate lawprof-coders rarely write any code today, and this is a shame. There are many good reasons why the world would be a better place if we began to integrate computer programming into legal scholarship (and more generally, into law and policy).

Two years ago, I wrote a blog post for a lawprof blog exploring this idea. I promised a follow-up post, but never delivered. A year later, I expanded the idea into an essay, which the good people at the Villanova Law Review agreed to publish sometime later this year. With this post, I am releasing a slightly-outdated draft of the essay for the first time to the public. You can download it at SSRN.

In the abstract, I say:

This essay proposes a new interdisciplinary research agenda called Computer Programming and the Law. By harnessing the power of computer programming, legal scholars can develop better tools, data, and insights for advancing their research interests. This essay presents the case for this new research agenda, highlights some examples of those who have begun to blaze the trail, and includes code samples to demonstrate the power and potential of developing software for legal scholarship. The code samples in this essay can be run like a piece of software—thanks to a technique known as literate programming—making this the world’s first law review article that is also a working computer program.

If you have any interest in the intersection of technology and policy (in other words, if you read this blog), please read the essay and let me know what you think. Unlike many law review articles, this one is short. And how bad could it be? It contains 350 lines of perl! (Wait, don’t answer that!)
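For readers wondering what “literate programming” means in practice, here is a minimal sketch of the idea. It is my own illustration, in Python rather than the essay’s perl, and the legal rule it encodes is invented for the example: prose and runnable code share one document, and a tool extracts and executes the embedded examples, so the article literally runs.

```python
import doctest

# A miniature "literate" document: ordinary prose with embedded,
# runnable examples (the three-year deadline rule is hypothetical,
# used purely for illustration).
ARTICLE = """
Suppose a hypothetical statute sets a deadline three years after
the year of filing. We can state the rule as executable code:

>>> def deadline(filing_year):
...     return filing_year + 3

and check it against a worked example right in the text:

>>> deadline(2009)
2012

The same file is thus both an essay and a working program.
"""

# Extract the embedded examples and run them, as a literate-
# programming tool would.
test = doctest.DocTestParser().get_doctest(ARTICLE, {}, "article", None, 0)
runner = doctest.DocTestRunner(verbose=False)
runner.run(test)
print(f"ran {runner.tries} examples, {runner.failures} failures")
```

A full literate-programming system does more (it can also weave the document into typeset prose), but the essential trick is just this: one source, readable as text and executable as code.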