August 8, 2022

The anomaly of cheap complexity

Why are our computer systems so complex and so insecure?  For years I’ve been trying to explain my understanding of this question. Here’s one explanation–which happens to be in the context of voting computers, but it’s a general phenomenon about all our computers:

There are many layers between the application software that implements an electoral function and the transistors inside the computers that ultimately carry out computations. These layers include the election application itself (e.g., for voter registration or vote tabulation); the user interface; the application runtime system; the operating system (e.g., Linux or Windows); the system bootloader (e.g., BIOS or UEFI); the microprocessor firmware (e.g., Intel Management Engine); disk drive firmware; system-on-chip firmware; and the microprocessor’s microcode. For this reason, it is difficult to know for certain whether a system has been compromised by malware. One might inspect the application-layer software and confirm that it is present on the system’s hard drive, but any one of the layers listed above, if hacked, may substitute a fraudulent application layer (e.g., vote-counting software) at the time that the application is supposed to run. As a result, there is no technical mechanism that can ensure that every layer in the system is unaltered and thus no technical mechanism that can ensure that a computer application will produce accurate results. 

[Securing the Vote, pages 89-90]

So, computers are insecure because they have so many complex layers.

But that doesn’t explain why there are so many layers, and why those layers are so complex–even for what “should be a simple thing” like counting up votes.

Recently I came across a really good explanation: a keynote talk by Thomas Dullien entitled “Security, Moore’s law, and the anomaly of cheap complexity” at CyCon 2018, the 10th International Conference on Cyber Conflict, organized by NATO.

Thomas Dullien’s talk video is here, but if you want to just read the slides, they are here.

As Dullien explains,

A modern 2018-vintage CPU contains a thousand times more transistors than a 1989-vintage microprocessor.  Peripherals (GPUs, NICs, etc.) are objectively getting more complicated at a superlinear rate. In his experience as a cybersecurity expert, the only thing that ever yielded real security gains was controlling complexity.  His talk examines the relationship between complexity and failure of security, and discusses the underlying forces that drive both.

Transistors-per-chip is still increasing every year; there are 3 new CPUs per human per year.  Device manufacturers are now developing their software even before the new hardware is released.  Insecurity in computing is growing faster than security is improving.

The anomaly of cheap complexity.  For most of human history, a more complex device was more expensive to build than a simpler device.  This is not the case in modern computing. It is often more cost-effective to take a very complicated device, and make it simulate simplicity, than to make a simpler device.  This is because of economies of scale: complex general-purpose CPUs are cheap.  On the other hand, custom-designed, simpler, application-specific devices, which could in principle be much more secure, are very expensive.  

This is driven by two fundamental principles in computing: universal computation, meaning that any computer can simulate any other; and Moore’s law, predicting that the number of transistors on a chip will grow exponentially over time.  ARM Cortex-M0 CPUs cost pennies, though they are more powerful than some supercomputers of the 20th century.

The same is true in the software layers.  A (huge and complex) general-purpose operating system is free, but a simpler, custom-designed, perhaps more secure OS would be very expensive to build.  Or as Dullien asks, “How did this research code someone wrote in two weeks 20 years ago end up in a billion devices?”

Then he discusses hardware supply-chain issues: “Do I have to trust my CPU vendor?”  He discusses remote-management infrastructures (such as the “Intel Management Engine” referred to above):  “In the real world, ‘possession’ usually implies ‘control’. In IT, ‘possession’ and ‘control’ are decoupled. Can I establish with certainty who is in control of a given device?”

He says, “Single bitflips can make a machine spin out of control, and the attacker can carefully control the escalating error to his advantage.”  (Indeed, I’ve studied that issue myself!)

Dullien quotes the science-fiction author Robert A. Heinlein:

“How does one design an electric motor? Would you attach a bathtub to it, simply because one was available? Would a bouquet of flowers help? A heap of rocks? No, you would use just those elements necessary to its purpose and make it no larger than needed — and you would incorporate safety factors. Function controls design.” 

 Heinlein, The Moon Is A Harsh Mistress

and adds, “Software makes adding bathtubs, bouquets of flowers, and rocks, almost free. So that’s what we get.”

Dullien concludes his talk by saying, “When I showed the first [draft of this talk] to some coworkers they said, ‘you really need to end on a more optimistic note.’”  So Dullien gives optimism a try, discussing possible advances in cybersecurity research; but still he gives us only a 10% chance that society can get this right.


Postscript:  Voting machines are computers of this kind.  Does their inherent insecurity mean that we cannot use them for counting votes?  No. The consensus of election-security experts, as presented in the National Academies study, is: we should use optical-scan voting machines to count paper ballots, because those computers, when they are not hacked, are much more accurate than humans.  But we must protect against bugs, against misconfigurations, against hacking, by always performing risk-limiting audits, by hand, of an appropriate sample of the paper ballots that the voters marked themselves.

Magical thinking about Ballot-Marking-Device contingency plans

The Center for Democracy and Technology recently published a report, “No Simple Answers: A Primer on Ballot Marking Device Security”, by William T. Adler.   Overall, it’s well-informed, clearly presents the problems as of 2022, and it’s definitely worth reading.  After explaining the issues and controversies, the report presents recommendations, most of which make a lot of sense, and indeed the states should act upon them.  But there’s one key recommendation in which Dr. Adler tries to provide a simple answer, and unfortunately his answer invokes a bit of magical thinking.  This seriously compromises the conclusions of his report.  By asking but not answering the question of “what should an election official do if there are reports of BMDs printing wrong votes?”, Dr. Adler avoids having to make the inevitable conclusion that BMDs-for-all-voters is a hopelessly flawed, insecurable method of voting.  Because the answer to that question is, unfortunately, there’s nothing that election officials could usefully do in that case.

BMDs (ballot marking devices) are used now in several states and there is a serious problem with them (as the report explains): “a hacked BMD could corrupt voter selections systematically, such that a candidate favored by the hacker is more likely to win.”  That is, if a state’s BMDs are hacked by someone who wants to change the result of an election, the BMDs can print ballots with votes on them different from what the voters indicated on the touchscreen.  Because most voters won’t inspect the ballot paper carefully enough before casting their ballot, most voters won’t notice that their vote has been changed.  The voters who do notice are (generally) allowed to “spoil” their ballot and cast a new one; but the substantial majority of voters, those who don’t check their ballot paper carefully, are vulnerable to having their votes stolen.

One simple answer is not to use BMDs at all: let voters mark their optical-scan paper ballots with a pen (that is, HMPB: hand-marked paper ballots).  A problem with this simple answer (as the report explains) is that some voters with disabilities cannot mark a paper ballot with a pen.  And (as the report explains) if BMDs are reserved just for the use of voters with disabilities, then those BMDs become “second class”: pollworkers are unfamiliar with how to set them up, rarely used machines may not work in the polling place when turned on, paper ballots cast by the disabled are distinguishable from those filled in with a pen, and so on.

So Dr. Adler seems to accept that BMDs, with their serious vulnerabilities, are inevitably going to be adopted—and so he makes recommendations to mitigate their insecurities.  And most of his recommendations are spot-on:  incorporate the cybersecurity measures required by the VVSG 2.0, avoid the use of bar codes and QR codes, adopt risk-limiting audits (RLAs).  Definitely worth doing those things, if election officials insist on adopting this seriously flawed technology in the first place.

But then he makes a recommendation intended to address the problem that if the BMD is cheating then it can print fraudulent votes that will survive any recount or audit.  The report recommends,

Another way is to depend on voter reports. In an election with compromised BMDs modifying votes in a way visible to voters who actively verify and observe those modifications, it is likely that election officials would receive an elevated number of reported errors. In order to notice a widespread issue, election officials must be monitoring election errors in real-time across a county or state. If serious problems are revealed with the BMDs that cast doubt on whether votes were recorded properly, either via parallel testing or from voter reports, election officials must respond. Accordingly, election officials should have a contingency plan in the event that BMDs appear to be having widespread issues. Such a plan would include, for instance, having the ability to substitute paper ballots for BMDs, decommissioning suspicious BMDs, and investigating whether other machines are also misbehaving. Stark (2019) has warned, however, that because it is likely not possible to know how many or which ballots were affected, the only remedy to this situation may be to hold a new election.

This is the magical thinking:  “election officials should have a contingency plan.”  The problem is, when you try to write down such a plan, there’s nothing that actually works!  Suppose the election officials rely on voter reports (or on the rate of spoiled ballots); suppose the “contingency plan” says (for example) “if x percent of the voters report malfunctioning BMDs, or y percent of voters spoil their ballots, then we will . . .”   Then we will what?  Remove those BMDs from service in the middle of the day?  But then all the votes already cast on those BMDs will have been affected by the hack; that could be thousands of votes.  Or what else?  Discard all the paper ballots that were cast on those BMDs?  Clearly you can’t do that without holding an entirely new election.  And what if those x% or y% of voters were fraudulently reporting BMD malfunction or fraudulently spoiling their ballots to trigger the contingency plan?  There’s no plan that actually works.
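The difficulty can be made concrete with a small sketch. Everything in it is hypothetical: the function name, the alteration rate, the fraction of affected voters who notice, and the 1% trigger are illustrative assumptions, not figures from the report or from any study. It only shows the shape of the problem: a threshold on voter reports says nothing about the altered ballots that have already been cast.

```python
def midday_snapshot(ballots_cast, alter_rate, notice_rate, report_threshold):
    """Hypothetical model of one polling place partway through election day.

    ballots_cast:     ballots printed by the BMDs so far
    alter_rate:       assumed fraction of ballots a hacked BMD alters
    notice_rate:      assumed fraction of affected voters who notice and report
    report_threshold: the plan's trigger "x", as a fraction of all voters
    """
    altered = ballots_cast * alter_rate
    reported = altered * notice_rate      # these voters can spoil and re-vote
    uncorrected = altered - reported      # altered ballots already in the box
    triggered = reported / ballots_cast >= report_threshold
    return reported, uncorrected, triggered


# Illustrative numbers only: 2000 ballots, 5% altered, 10% of those noticed,
# and a plan that triggers when 1% of voters report a problem.
reported, uncorrected, triggered = midday_snapshot(2000, 0.05, 0.10, 0.01)
print(f"reports: {reported:.0f}")
print(f"altered ballots already cast: {uncorrected:.0f}")
print(f"plan triggered: {triggered}")
```

Under these assumed numbers the plan never triggers. Raising the assumed notice rate makes it trigger, but the ballots counted in `uncorrected` are still in the ballot box and cannot be told apart from honest ones.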

Everything I’ve explained here was already written down in “Ballot-marking devices cannot ensure the will of the voters” (2020 [non-paywall version]) and in “There is no reliable way to detect hacked ballot-marking devices” (2019), both of which Dr. Adler cites.  But an important purpose of magical thinking is to avoid facing difficult facts.

It’s like saying, “to prevent climate change we should just use machines to pull 40 billion tons of CO2 out of the atmosphere each year.”  But there is no known technology that can do this.  All the direct-air-capture facilities deployed to date can capture just 0.00001 billion tons.  Just because we really, really want something to work is not enough.

There is an inherent problem with BMDs: they can change votes in a way that will survive any audit or recount.  Not only is there “no simple solution” to this problem, there’s no solution, period.  Perhaps someday a solution will be identified.  Until then, BMDs-for-all-voters is dangerous, even with all known mitigations.

New Study Analyzing Political Advertising on Facebook, Google, and TikTok

By Orestis Papakyriakopoulos, Christelle Tessono, Arvind Narayanan, Mihir Kshirsagar

With the 2022 midterm elections in the United States fast approaching, political campaigns are poised to spend heavily to influence prospective voters through digital advertising. Online platforms such as Facebook, Google, and TikTok will play an important role in distributing that content. But our new study, “How Algorithms Shape the Distribution of Political Advertising: Case Studies of Facebook, Google, and TikTok,” which will appear at the Artificial Intelligence, Ethics, and Society conference in August, shows that the platforms’ tools for voluntary disclosures about political ads do not provide the transparency the public needs. More details can also be found on our website: campaigndisclosures.princeton.edu.

Our paper conducts the first large-scale analysis of public data from the 2020 presidential election cycle to critically evaluate how online platforms affect the distribution of political advertisements. We analyzed a dataset containing over 800,000 ads about the 2020 U.S. presidential election that ran in the 2 months prior to the election, which we obtained from the ad libraries of Facebook and Google that were created by the companies to offer more transparency about political ads. We also collected and analyzed 2.5 million TikTok videos from the same time period. These ad libraries were created by the platforms in an attempt to stave off potential regulation such as the Honest Ads Act, which sought to impose greater transparency requirements for platforms carrying political ads. But our study shows that these ad libraries fall woefully short of their own objectives to be more transparent about who pays for the ads and who sees the ads, as well as the objectives of bringing greater transparency about the role of online platforms in shaping the distribution of political advertising.

We developed a three-part evaluative framework to assess the platform disclosures: 

1. Do the disclosures meet the platforms’ self-described objective of making political advertisers accountable?

2. How do the platforms’ disclosures compare against what the law requires for radio and television broadcasters?

3. Do the platforms disclose all that they know about the ad targeting criteria, the audience for the ads, and how their algorithms distribute or moderate content?

Our analysis shows that the ad libraries do not meet any of the objectives. First, the ad libraries only have partial disclosures of audience characteristics and targeting parameters of placed political ads. But these disclosures do not allow us to understand how political advertisers reached prospective voters. For example, we compared ads in the ad libraries that were shown to different audiences with dummy ads that we created on the platforms (Figure 1). In many cases, we measured a significant difference in the calculated cost-per-impression between the two types of ads, which we could not explain with the available data.

  • Figure 1. We plot the generated cost-per-impression of ads in the ad libraries that were (1) targeted to all genders & ages on Google, (2) targeted to females between 25-34 on YouTube, (3) seen by all genders & ages in the US on Facebook, and (4) seen only by females of all ages located in California on Facebook.  For Facebook, lower & upper bounds are provided for the impressions. For Google, lower & upper bounds are provided for cost & impressions, given the extensive “bucketing” of the parameters performed by the ad libraries when reporting them, which are denoted in the figures with boxes. Points represent the median value of the boxes. We compare the generated cost-per-impression of ads with the cost-per-impression of a set of dummy ads we placed on the platforms with the exact same targeting parameters & audience characteristics. Black lines represent the upper and lower boundaries of an ad’s cost-per-impression as we extracted them from the dummy ads. We label an ad placement as “plausible targeting” when the ad cost-per-impression overlaps with the one we calculated, denoting that we can assume that the ad library provides all relevant targeting parameters/audience characteristics about an ad.  Similarly, a placement labeled as “unexplainable targeting” represents an ad whose cost-per-impression is outside the upper and lower reach values that we calculated, meaning that the platforms potentially do not disclose full information about the distribution of the ad.
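The labeling rule in the Figure 1 caption amounts to an interval-overlap test. The following is a minimal sketch of that idea, not the code used in the study: the function names, the way bucket bounds are combined into cost-per-impression bounds, and the example numbers are all assumptions made for illustration.

```python
def cpi_bounds(cost_low, cost_high, impressions_low, impressions_high):
    """Smallest and largest cost-per-impression consistent with the buckets.

    Assumption: the cheapest case pairs the lowest cost with the most
    impressions, and the most expensive case pairs the highest cost with
    the fewest impressions.
    """
    return cost_low / impressions_high, cost_high / impressions_low


def label_placement(ad_bounds, dummy_bounds):
    """Label an ad by whether its CPI range overlaps the dummy-ad CPI range."""
    ad_low, ad_high = ad_bounds
    dummy_low, dummy_high = dummy_bounds
    overlaps = ad_low <= dummy_high and dummy_low <= ad_high
    return "plausible targeting" if overlaps else "unexplainable targeting"


# Hypothetical buckets: $100-$200 spent, 1,000-5,000 impressions.
ad = cpi_bounds(100, 200, 1000, 5000)
# Hypothetical CPI range measured from dummy ads with the same targeting.
dummy = (0.05, 0.10)
print(label_placement(ad, dummy))
```

Because the ad libraries report ranges rather than exact values, an overlap only shows that the disclosed parameters could account for the price; it does not show that they do.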

Second, broadcasters are required to offer advertising space at the same price to political advertisers as they do to commercial advertisers. But we find that the platforms charged campaigns different prices for distributing ads. For example, on average, the Trump campaign on Facebook paid more per impression (~18 impressions/dollar) compared to the Biden campaign (~27 impressions/dollar). On Google, the Biden campaign paid more per impression compared to the Trump campaign. Unfortunately, while we attempted to control for factors that might account for different prices for different audiences, the data does not allow us to probe the precise reason for the differential pricing. 
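To put the two Facebook figures on a more familiar scale, impressions per dollar can be converted to cost per thousand impressions. This is a minimal sketch using only the two approximate averages quoted above; the function name is an assumption made for illustration.

```python
def cost_per_thousand(impressions_per_dollar):
    """Convert impressions per dollar into dollars per 1,000 impressions."""
    return 1000 / impressions_per_dollar


trump = cost_per_thousand(18)  # ~18 impressions/dollar on Facebook
biden = cost_per_thousand(27)  # ~27 impressions/dollar on Facebook

print(round(trump, 2))          # 55.56
print(round(biden, 2))          # 37.04
print(round(trump / biden, 2))  # 1.5
```

On these rounded averages, the Trump campaign paid roughly 50% more per impression on Facebook than the Biden campaign did.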

Third, the platforms do not disclose the detailed information about the audience characteristics that they make available to advertisers. They also do not explain how the algorithms distribute or moderate the ads. For example, we see that campaigns placed ads on Facebook that were not ostensibly targeted by age, but those ads were not distributed uniformly.  We also find that platforms applied their ad moderation policies inconsistently, with some instances of moderated ads being removed and some others not, and without any explanation for the decision to remove an ad (Figure 2).

  • Figure 2. Comparison of different instances of moderated ads across platforms. The light blue bars show how many instances of a single ad were moderated, and maroon bars show how many instances of the same ad were not. Results suggest an inconsistent moderation of content across platforms, with some instances of the same ad being removed and some others not.

Finally, we observed new forms of political advertising that are not captured in the ad libraries. Specifically, campaigns appear to have used influencers to promote their messages without adequate disclosure. For example, on TikTok, we document how political influencers, who were often linked with PACs, generated billions of impressions from their political content. This new type of campaigning still remains unregulated and little is known about the practices and relations between influencers and political campaigns.  

In short, the online platform self-regulatory disclosures are inadequate and we need more comprehensive disclosures from platforms to understand their role in the political process. Our key recommendations include:

– Requiring that each political entity registered with the FEC use a single, universal identifier for campaign spending across platforms to allow the public to track their activity.

– Developing a cross-platform data repository, hosted and maintained by a government or independent entity, that collects political ads, their targeting criteria, and the audience characteristics that received them. 

– Requiring platforms to disclose information that will allow the public to understand how the algorithms distribute content and how platforms price the distribution of political ads. 

– Developing a comprehensive definition of political advertising that includes influencers and other forms of paid promotional activity.