Several recent news stories have highlighted the ways that online social platforms can subtly shape our lives. First came the news that Facebook had “manipulated” users’ emotions by tweaking the balance of happy and sad posts that it showed to some users. Then, this week, the popular online dating service OKCupid announced that it had deliberately sent its users on dates that it predicted would not go well. OKCupid asks users questions (for example, “do you like horror movies?”) and matches them up based on their answers, using those answers to compute a “match percentage” showing how likely two people are to get along.
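For readers curious how a score like that might be put together, here is a rough, hypothetical sketch loosely following OKCupid’s publicly described approach: weight each question by how important the asker says it is, measure how satisfied each person would be with the other’s answers, and combine the two directions with a geometric mean. The point values, data structures, and toy profiles below are invented for illustration, not the service’s actual code.

```python
import math

# Illustrative importance weights (assumed values, not OKCupid's actual scale).
IMPORTANCE_POINTS = {"irrelevant": 0, "a little": 1, "somewhat": 10, "very": 50}

def satisfaction(asker_answers, other_answers):
    """Importance-weighted fraction of the asker's questions on which the
    other person gave an answer the asker said they would accept."""
    earned, possible = 0, 0
    for question, pref in asker_answers.items():
        weight = IMPORTANCE_POINTS[pref["importance"]]
        possible += weight
        if question in other_answers and other_answers[question]["answer"] in pref["acceptable"]:
            earned += weight
    return earned / possible if possible else 0.0

def match_percentage(a_answers, b_answers):
    """Combine the two one-way satisfaction scores with a geometric mean."""
    s_ab = satisfaction(a_answers, b_answers)
    s_ba = satisfaction(b_answers, a_answers)
    return round(100 * math.sqrt(s_ab * s_ba))

# Toy example: one shared question about horror movies.
alice = {"horror": {"answer": "yes", "acceptable": {"yes"}, "importance": "very"}}
bob = {"horror": {"answer": "yes", "acceptable": {"yes", "no"}, "importance": "a little"}}
print(match_percentage(alice, bob))  # 100
```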
In the test that it disclosed this week, OKCupid explained, it displayed false match percentages for a few users — telling them that they were much better matched, or much worse matched, than the numbers actually predicted — in order to test how well the matching system predicted actual compatibility. Once the test was over, OKCupid informed users that their match percentages had been temporarily distorted.
Many observers have objected to OKCupid’s “lying” to its users and to Facebook’s “emotional manipulation,” arguing that such tests should not happen invisibly or that people should be afraid of the way sites like these can shape their lives.
These opaque, data-driven predictions of what news you’ll want from your network of friends, or who you might like to date, are scary in part because they have an element of self-fulfilling prophecy, a quality they share with more consequential “big data” predictions. In a civil rights context, automatic predictions can be particularly concerning because they may tend to reinforce existing disparities. For example, if historic arrest statistics are used to target law enforcement efforts, the history of over-enforcement in communities of color (which is reflected in these numbers) may lead to a system predicting more crime in those communities, bringing them under greater law enforcement scrutiny. Over time, minor crimes that occur in these communities may be prosecuted while the same crimes occurring elsewhere go unrecorded — leading to an exaggerated “objective” record of the targeted neighborhood’s higher crime rate.
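A toy simulation makes the feedback loop concrete. All of the numbers below are made up: two neighborhoods with identical underlying offense rates, a historical arrest record skewed toward one of them, patrols allocated in proportion to last year’s recorded arrests, and arrests recorded only where patrols are looking.

```python
# Toy simulation of the enforcement feedback loop described above.
true_offenses = {"A": 100, "B": 100}   # same underlying behavior in both places
recorded      = {"A": 60,  "B": 40}    # historical record skewed toward A

for year in range(1, 6):
    total = sum(recorded.values())
    patrol_share = {k: v / total for k, v in recorded.items()}
    # Scrutiny, not behavior, drives what gets recorded.
    recorded = {k: true_offenses[k] * patrol_share[k] for k in recorded}
    print(f"year {year}: recorded arrests {recorded}")

# The 60/40 split reproduces itself indefinitely: the data "confirm" that A has
# 1.5x the crime of B, even though the true offense rates are identical.
```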
But just because predictions are opaque — and just because they are self-fulfilling prophecies — does not necessarily mean that they’ll turn out to be bad for people, or harmful to civil rights. Computerized, self-fulfilling prophecies of positive social justice outcomes could be a key tool for civil rights.
One example of the opportunity for positive self-fulfilling prophecies may lie in education. There is strong evidence of self-fulfilling prophecies when students are told that they are likely or unlikely to do well in school. Those told they are likely to do well are, other things equal and just by dint of the prediction, likely to do better (a finding known as the “Pygmalion effect”). Meanwhile, when students are told that others like them have generally fared poorly, they become more likely to do poorly themselves (an effect known as “stereotype threat”).
Educational tools that harness big data to provide feedback to students and teachers could, potentially, harness these effects. For example, a new system called Course Signals uses data mining to predict the likely future performance of college students in a particular course, rating them as green (on track to do well), yellow, or red (likely to do poorly). The designers of a system like Course Signals might choose to use the self-fulfilling nature of its predictions as a force for good, perhaps by predicting slightly better performance for striving but low-performing students than the first-order data about those students would initially appear to justify. In a case like that, the opaque nature of the predictions, and the automatic deference that human beings often accord to predictive analytics, might be pressed into service to advance educational equity.
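To make the scenario concrete, here is a hypothetical sketch of a traffic-light rating of that kind, including the sort of deliberate upward nudge described above. The features, weights, and thresholds are invented for illustration; they are not the actual Course Signals model.

```python
def predicted_score(grades_so_far, lms_logins_per_week, assignments_submitted_ratio):
    """Crude, made-up linear predictor of end-of-course performance (0-100)."""
    return (0.6 * grades_so_far
            + 2.0 * lms_logins_per_week
            + 20.0 * assignments_submitted_ratio)

def signal(score, striving=False, nudge=5.0):
    """Map a predicted score to green/yellow/red.

    If `striving` is True (high effort despite a low predicted score), shift
    the prediction slightly upward before labeling -- the self-fulfilling
    "force for good" the text imagines.
    """
    if striving:
        score += nudge
    if score >= 75:
        return "green"
    if score >= 60:
        return "yellow"
    return "red"

# A hard-working but low-scoring student lands in yellow rather than red.
s = predicted_score(grades_so_far=45, lms_logins_per_week=6, assignments_submitted_ratio=0.95)
print(round(s, 1), signal(s), signal(s, striving=True))  # 58.0 red yellow
```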
Increasingly pervasive, increasingly robust “big data” predictions may tend to entrench the status quo more thoroughly and more effectively than some of the pre-digital decision-making processes they are replacing. And that’s a reason for social justice advocates — whose most basic point is that some patterns still need to change — to be skeptical of such systems. But in the end, there are lots of goals that could be pursued through the manipulation of big data, and some of those goals may advance the social good.
In offering these thoughts, I mean to invite further scrutiny of these systems and the ways that we use them. Many uses of such systems may be socially harmful, and it might turn out in the end that we would be best off sharply limiting the role of opaque predictions in our lives. But we should open ourselves to a wider-ranging conversation, one that acknowledges the possibility of socially constructive as well as discriminatory uses of the hidden power of algorithms.
“But in the end, there are lots of goals that could be pursued through the manipulation of big data, and some of those goals may advance the social good.”
There is a serious ethical problem here. Who gets to decide what is a social good and what is not? How is the decision made? And when, and on whom, is the algorithm to be applied? Transparency is vital to the proper functioning of a free and democratic society, and regardless of whether these algorithms are used “for the social good” or not, none of this is transparent, and it is thus incompatible with that ideal.
“Many uses of such systems may be socially harmful … but we should open ourselves to a wider-ranging conversation”
Sounds like every other tech discussion: it’s about _whose_ values the tech is being deployed to optimize, not really the tech _itself_. If they’re on your side, it’s likely to be good. If they’re an adversary, it’s likely to be bad. Up to now we’re mainly hearing about this being done by Facebook, and they’re nobody’s friend, so…
Ask a 1945 London firefighter and a 1969 Cape Canaveral worker, whether rocketry tech is a good idea, and you get different answers. Imagine that! 😉
Did you ever come across “The Shockwave Rider” by John Brunner? One element is the “delphi oracle,” iirc, which replaces the big data with many people’s opinions to arrive at conclusions. Might be interesting.
I get it that this is an accepted part of our modern lives: Social algorithms. I think incorporating data and digital statistics into our policy-making process adds to the legitimacy and transparency we are all expecting in a truly fair and equal society, and that it is ultimately up to the Users, the Human Beings interpreting, utilizing, and implementing that data in socially moral HUMAN ways.
Maybe we could work up a proposal that would fly in front of an IRB? Especially because we know that some opaque predictions are self-fulfilling, while others (say, about safety) are self-contradicting.