September 22, 2021

Studying the societal impact of recommender systems using simulation

By Eli Lucherini, Matthew Sun, Amy Winecoff, and Arvind Narayanan.

For those interested in the impact of recommender systems on society, we are happy to share several new pieces:

  • a software tool for studying this interface via simulation
  • the accompanying paper
  • a short piece on methodological concerns in simulation research
  • a talk offering a critical take on research on filter bubbles.

We elaborate below.

Simulation is a valuable way to study the societal impact of recommender systems.

Recommender systems in social media platforms such as Facebook and Twitter have been criticized due to the risks they might pose to society, such as amplifying misinformation or creating filter bubbles. But there isn’t yet consensus on the scope of these concerns, the underlying factors, or ways to remedy them. Because these phenomena arise through repeated system interactions over time, methods that assess the system at a single time point provide minimal insight into the mechanisms behind them. In contrast, simulations can model how users, items, and algorithms interact over arbitrarily long timescales. As a result, simulation has proved to be a valuable tool in assessing the impact of recommendation systems on the content users consume and on society.

This is a burgeoning area of research. We identified over a dozen studies that use simulation to study questions such as filter bubbles and misinformation. As an example of a study we admire, Chaney et al. illustrate the detrimental effects of algorithmic confounding, which occurs when a recommendation algorithm is trained on user interaction data that is itself influenced by the prior recommendations of the algorithm. Like all simulation research, this is a statement about a model and not a real platform. But the benefit is that it helps isolate the variables of interest so that relationships between them can be probed deeply in a way that improves our scientific understanding of these systems.

T-RECS: A new tool for simulating recommender systems

So far, most simulation studies of algorithmic systems have relied upon ad-hoc code implemented from scratch, which is time consuming, raises the likelihood of bugs, and limits reproducibility. We present T-RECS (Tools for RECommender system Simulation), an open-source simulation tool designed to enable investigations of emerging complex phenomena caused by millions of individual actions and interactions in algorithmic systems including filter bubbles, political polarization, and (mis)information diffusion. In the accompanying paper, we describe its design in detail and present two case studies.

T-RECS is flexible and can simulate just about any system in which “users” interact with “items” mediated by an algorithm. This is broader than just recommender systems: for example, we used T-RECS to reproduce a study on the virality of online content. T-RECS also supports two-sided platforms, i.e., those that include both users and content creators. The system is not limited to social media either: it can also be used to study music recommender systems or e-commerce platforms. With T-RECS, researchers with expertise in social science but limited engineering expertise can still leverage simulation to answer important questions about the societal effects of algorithmic systems.

What’s wrong with current recsys simulation research?

In a companion paper to T-RECS, we offer a methodological critique of current recommender systems simulation research. First, we observe that each paper tends to operationalize constructs such as polarization in subtly different ways. Despite seemingly minor differences, the effects may be vastly different, making comparisons between papers infeasible. We acknowledge that this is natural in the early stages of a discipline and is not necessarily a crisis by itself. Unfortunately, we also observe low transparency: papers do not specify their constructs in enough detail to allow others to reproduce and build on them, and practices such as sharing code and data are not yet the norm in this community.

We advocate for the adoption of software tools such as T-RECS that would help address both issues. Researchers would be able to draw upon a standard library of models and constructs. Further, they would be easily able to share reproduction materials as notebooks, containing code, data, results, and documentation packaged together.

Why do we need simulation, again?

Given that it is tricky to do simulation correctly and even harder to do it in a way that allows us to draw meaningful conclusions that apply to the real world, one may wonder why we need simulation for understanding the societal impacts of recommender systems at all. Why not stick with auditing or observational studies of real platforms? A notable example of such a study is “Exposure to ideologically diverse news and opinion on Facebook” by Bakshy et al. The study found that while Facebook’s users primarily consume ideologically-aligned content, the role of Facebook’s news feed algorithm is minimal compared to users’ own choices.

In a recent talk, one of us (Narayanan) discussed the limitations of quantitative studies of real platforms, focusing on the question of filter bubbles. The argument is this: the question of interest is causal in nature, but we can’t answer causal questions because the entire system evolves as one unit over a long period of time. Faced with this inherent limitation, studies such as the Facebook study above inevitably study very narrow versions of the question, focusing on a snapshot in time and ignoring feedback loops and other complications. Thus, while there is nothing wrong with these studies, they tell us little about the questions we really care about, and yet are widely misinterpreted to mean more than they do.

In conclusion, every available method for studying the societal impact of recommender systems has severe limitations. Yet this is an urgent question with enormous consequences; the study of these questions has been called a crisis discipline. We need every tool in the toolbox, even if none is perfect for the job. We need auditing and observational studies; we need qualitative studies; and we need simulation. Through T-RECS and its accompanying papers, we hope to both systematize research in this area and provide foundational infrastructure.