September 26, 2022

New Study Analyzing Political Advertising on Facebook, Google, and TikTok

By Orestis Papakyriakopoulos, Christelle Tessono, Arvind Narayanan, Mihir Kshirsagar

With the 2022 midterm elections in the United States fast approaching, political campaigns are poised to spend heavily to influence prospective voters through digital advertising. Online platforms such as Facebook, Google, and TikTok will play an important role in distributing that content. But our new study – How Algorithms Shape the Distribution of Political Advertising: Case Studies of Facebook, Google, and TikTok — that will appear in the Artificial Intelligence, Ethics, and Society conference in August, shows that the platforms’ tools for voluntary disclosures about political ads do not provide the necessary transparency the public needs. More details can also be found on our website: campaigndisclosures.princeton.edu.

Our paper conducts the first large-scale analysis of public data from the 2020 presidential election cycle to critically evaluate how online platforms affect the distribution of political advertisements. We analyzed a dataset containing over 800,000 ads about the 2020 U.S. presidential election that ran in the 2 months prior to the election, which we obtained from the ad libraries of Facebook and Google that were created by the companies to offer more transparency about political ads. We also collected and analyzed 2.5 million TikTok videos from the same time period. These ad libraries were created by the platforms in an attempt to stave off potential regulation such as the Honest Ads Act, which sought to impose greater transparency requirements for platforms carrying political ads. But our study shows that these ad libraries fall woefully short of their own objectives to be more transparent about who pays for the ads and who sees the ads, as well the objectives of bringing greater transparency about the role of online platforms in shaping the distribution of political advertising. 

We developed a three-part evaluative framework to assess the platform disclosures: 

1. Do the disclosures meet the platforms’ self-described objective of making political advertisers accountable?

2. How do the platforms’ disclosures compare against what the law requires for radio and television broadcasters?

3. Do the platforms disclose all that they know about the ad targeting criteria, the audience for the ads, and how their algorithms distribute or moderate content?

Our analysis shows that the ad libraries do not meet any of the objectives. First, the ad libraries only have partial disclosures of audience characteristics and targeting parameters of placed political ads. But these disclosures do not allow us to understand how political advertisers reached prospective voters. For example, we compared ads in the ad libraries that were shown to different audiences with dummy ads that we created on the platforms (Figure 1). In many cases, we measured a significant difference between the calculated cost-per-impression between the two types of ads, which we could not explain with the available data.

  • Figure 1. We plot the generated cost per impression of ads in the ad-libraries that were (1) targeted to all genders & ages on Google, (2) to Females, between 25-34 on YouTube, (3) were seen by all genders & ages in the US on Facebook, and (4) only by females of all ages located in California on Facebook.  For Facebook, lower & upper bounds are provided for the impressions. For Google, lower & upper bounds are provided for cost & impressions, given the extensive “bucketing” of the parameters performed by the ad libraries when reporting them, which are denoted in the figures with boxes. Points represent the median value of the boxes. We compare the generated cost-per impression of ads with the cost-per impression of a set of dummy ads we placed on the platforms with the exact same targeting parameters & audience characteristics. Black lines represent the upper and lower boundaries of an ad’s cost-per-impression as we extracted them from the dummy ads. We label an ad placement as “plausible targeting”, when the ad cost-per-impression overlaps with the one we calculated, denoting that we can assume that the ad library provides all relevant targeting parameters/audience characteristics about an ad.  Similarly, an placement labeled as `”unexplainable targeting’”  represents an ad whose cost-per-impression is outside the upper and lower reach values that we calculated, meaning that potentially platforms do not disclose full information about the distribution of the ad.

Second, broadcasters are required to offer advertising space at the same price to political advertisers as they do to commercial advertisers. But we find that the platforms charged campaigns different prices for distributing ads. For example, on average, the Trump campaign on Facebook paid more per impression (~18 impressions/dollar) compared to the Biden campaign (~27 impressions/dollar). On Google, the Biden campaign paid more per impression compared to the Trump campaign. Unfortunately, while we attempted to control for factors that might account for different prices for different audiences, the data does not allow us to probe the precise reason for the differential pricing. 

Third, the platforms do not disclose the detailed information about the audience characteristics that they make available to advertisers. They also do not explain how the algorithms distribute or moderate the ads. For example, we see that campaigns placed ads on Facebook that were not ostensibly targeted by age, but the ad was not distributed uniformly.  We also find that platforms applied their ad moderation policies inconsistently, with some instances of moderated ads being removed and some others not, and without any explanation for the decision to remove an ad. (Figure 2) 

  • Figure 2. Comparison of different instances of moderated ads across platforms. The light blue bars show how many instances of a single ad were moderated, and maroon bars show how many instances of the same ad were not. Results suggests an inconsistent moderation of content across platforms, with some instances of the same ad being removed and some others not.

Finally, we observed new forms of political advertising that are not captured in the ad libraries. Specifically, campaigns appear to have used influencers to promote their messages without adequate disclosure. For example, on TikTok, we document how political influencers, who were often linked with PACs, generated billions of impressions from their political content. This new type of campaigning still remains unregulated and little is known about the practices and relations between influencers and political campaigns.  

In short, the online platform self-regulatory disclosures are inadequate and we need more comprehensive disclosures from platforms to understand their role in the political process. Our key recommendations include:

– Requiring that each political entity registered with the FEC use a single, universal identifier for campaign spending across platforms to allow the public to track their activity.

– Developing a cross-platform data repository, hosted and maintained by a government or independent entity, that collects political ads, their targeting criteria, and the audience characteristics that received them. 

– Requiring platforms to disclose information that will allow the public to understand how the algorithms distribute content and how platforms price the distribution of political ads. 

– Developing a comprehensive definition of political advertising that includes influencers and other forms of paid promotional activity.

Internet Voting Security: Wishful Thinking Doesn’t Make It True

[The following is a post written at my invitation by Professor Duncan Buell from the University of South Carolina. Curiously, the poll Professor Buell mentions below is no longer listed in the list of past & present polls on the Courier-Journal site, but is available if you kept the link.]

On Thursday, March 21, in the midst of Kentucky’s deliberation over allowing votes to be cast over the Internet, the daily poll of the Louisville Courier-Journal asked the readers, “Should overseas military personnel be allowed to vote via the Internet?” This happened the day before their editorial rightly argued against Internet voting at this time.

One of the multiple choice answers was “Yes, it can be made just as secure as any balloting system.” This brings up the old adage, “we are all entitled to our own opinions, but we are not entitled to our own facts.” The simple fact is that Internet voting is possible – but it is definitely NOT as secure as some other balloting systems. This is not a matter of opinion, but a matter of fact. Votes cast over the Internet are easily subject to corruption in a number of different ways.

To illustrate this point, two colleagues, both former students, wrote simple software scripts that allowed us to vote multiple times in the paper’s opinion poll. We could have done this with repeated mouse clicks on the website, but the scripts allowed us to do it automatically, and by night’s end we had voted 60,000 times. The poll vendor’s website claims that it blocks repeated voting, but that claim is clearly not entirely true. We did not break in to change the totals. We did not breach the security of the Courier-Journal’s computers. We simply used programs instead of mouse clicks to vote on the poll website itself.
[Read more…]

How much does a botnet cost, and the impact on internet voting

A brief article on how much botnets cost to rent (more detail here) shows differing prices depending on whether you want US machines, European machines, etc. Interestingly, the highest prices go to botnets composed of US machines, presumably because the owners of those machines have more purchasing power and hence stealing credentials from those machines is more valuable. Even so, the value of each machine is quite low – $1000 for 10,000 infected US machines vs. $200 for 10,000 random machines around the world. [Reminds me of my youth where stamp collectors could get packets of random canceled stamps at different prices for “world” vs. specific countries – and most of the stuff in the world packets was trash.]

So what does this have to do with voting? Well, at $1000 for 10,000 infected American machines, the cost is $0.10/machine, and less as the quantity goes up. If I can “buy” (i.e., steal) votes in an internet voting scheme for $0.10 each, that’s far cheaper than any form of advertising. In a hard-fought election I’ll get a dozen fliers for each candidate on the ballot, each of which probably costs close to $1 when considering printing, postage, etc. So stealing votes is arguably 100 times cheaper (assuming that a large fraction of the populace were to vote by internet), even when considering the cost of developing the software that runs in the botnet.

Granted, not every machine in a botnet would be used for voting, even under the assumption that everyone voted by internet. But even if only 10% of them are, the cost per vote is still very “reasonable” under this scenario.

And as John Sebes responded in an earlier draft of this posting:

“You compared digital vote stealing costs to the costs of mere persuasion. What about the costs of analog vote stealing? It’s all anecdotal of course but I do hear that the going rate is about $35 from an absentee vote fraudster to a voter willing to sell a pre-signed absentee ballot kit. Even if the bad guys have to spend 100 of those dimes to get a 1-in-a-hundred machine that’s used for i-voting, that $10 is pretty good because $10 is cheaper than $35 and it and saves the trouble of paying the gatherers who are at risk for a felony.”