The research community is buzzing about the ethics of Facebook’s now-famous experiment in which it manipulated the emotional content of users’ news feeds to see how that would affect users’ activity on the site. (The paper, by Adam Kramer of Facebook, Jamie Guillory of UCSF, and Jeffrey Hancock of Cornell, appeared in Proceedings of the National Academy of Sciences.)
The main dispute seems to be between people such as James Grimmelmann and Zeynep Tufekci, who see this as a clear violation of research ethics, and people such as Tal Yarkoni, who see it as consistent with ordinary practices for a big online company like Facebook.
One explanation for the controversy is the large gap between the ethical standards of industry practice and the research community’s ethical standards for human subjects studies.
Industry practice allows pretty much any use of data within a company, and infers consent from a brief mention of “research” in a huge terms of use document whose existence is known to the user. (UPDATE (8:30pm EDT, June 30 2014): Kashmir Hill noticed that Facebook’s terms of use did not say anything about “research” at the time the study was done, although they do today.) Users voluntarily give their data to Facebook and Facebook is free to design and operate its service any way it likes, unless it violates its privacy policy or terms of use.
The research community’s ethics rules are much more stringent, and got that way because of terrible abuses in the past. They put the human subject at the center of the ethical equation, requiring specific, fully informed consent from the subject, and giving the subject the right to opt out of the study at any point without consequence. If there is any risk of harm to the subject, it is the subject and not the researchers who gets to decide whether the risks are justified by the potential benefits to human knowledge. If there is a close call as to whether a risk is real or worth worrying about, that call is for the subject to make.
Facebook’s actions were justified according to the industry ethical standards, but they were clearly inconsistent with the research community’s ethical standards. For example, there was no specific consent for participation in the study, no specific opt-out, and subjects were not informed of the potential harms they might suffer. The study set out to see if certain actions would make subjects unhappy, thereby creating a risk of making subjects unhappy, which is a risk of harm—a risk that is real enough to justify informing subjects about it.
The lead author of the study, Adam Kramer, who works for Facebook, wrote a statement (on Facebook, naturally, but quoted in Kashmir Hill’s article) explaining his justification of the study.
[…]
The reason we did this research is because we care about the emotional impact of Facebook and the people that use our product. We felt that it was important to investigate the common worry that seeing friends post positive content leads to people feeling negative or left out. At the same time, we were concerned that exposure to friends’ negativity might lead people to avoid visiting Facebook. We didn’t clearly state our motivations in the paper.
[…]
The goal of all of our research at Facebook is to learn how to provide a better service. Having written and designed this experiment myself, I can tell you that our goal was never to upset anyone. I can understand why some people have concerns about it, and my coauthors and I are very sorry for the way the paper described the research and any anxiety it caused. In hindsight, the research benefits of the paper may not have justified all of this anxiety.
This misses the point of the objections. He justifies the research by saying that the authors’ intention was to help users and improve Facebook products, and he expresses regret for not explaining that clearly enough in the paper. But the core of the objection to the research is that the researchers should not have been the ones deciding whether those benefits justified exposing subjects to the experiment’s possible side-effects.
The gap between industry and research ethics frameworks won’t disappear, and it will continue to cause trouble. There are at least two problems it might cause. First, it could drive a wedge between company researchers and the outside research community, where company researchers have trouble finding collaborators or publishing their work because they fail to meet research-community ethics standards. Second, it could lead to “IRB laundering”, where academic researchers evade formal ethics-review processes by collaborating with corporate researchers who do experiments and collect data within a company where ethics review processes are looser.
Will this lead to a useful conversation about how to draw ethical lines that make sense across the full spectrum from research to practice? Maybe. It might also breathe some life into the discussion about the implications of this kind of manipulation outside of the research setting. Both are conversations worth having.
UPDATE (3pm EDT, June 30, 2014): Cornell issued a carefully worded statement (compare to earlier press release) that makes this look like a case of “IRB laundering”: Cornell researchers had “initial discussions” with Facebook, Facebook manipulated feeds and gathered the data, then Cornell helped analyze the data. No need for Cornell to worry about the ethics of the manipulation/collection phase, because “the research was done independently by Facebook”.
Was it in the user agreement that they can do something like that?
Why test something that is already known? When designing the test, why not compare all-positive content against positive pairs along another dimension? Besides… opposites are too easy.
I think the FB-is-allowed-to-modify-its-algorithms-anytime-so-where’s-the-ethical-problem-with-this argument is sophistry.
Users assume (apparently naively) that FB modifies its news feed algorithm in ways intended to improve the experience (a win for the user). They are willing to put up with ongoing changes and a sub-optimal actual experience on the assumption that FB is acting in good faith. They do not expect the algorithm to systematically choose to harm them (which is what they get if they happen to be on the downer side of the experiment).
Suppose I stand out on my street corner and hand out candy to anyone walking by. Today it’s lemon, tomorrow it’s cherry, another day grape, but they’re all colored blue, so you don’t know what you get until you taste it. You may like one better than the other, but they’re pretty much all more or less OK. My friend in the house across the street takes pictures of everyone, and analyzes whether lemon or cherry is preferred. We do this for months at a time. Then one day I decide to run an experiment. Some folks I give cherry, and some folks I give cherry laced with quinine (bitter but fairly harmless for most people). Then we watch what happens. I think most people would agree that that would be a pretty nasty trick.
I don’t think it’s necessarily true that either experimental algorithm was designed with the intent of causing even minimal harm. The whole point of the research, so far as I’ve been able to gather, was that we didn’t know – some people thought emphasizing the positive would result in an improved experience (“emotional contagion”), some thought emphasizing the negative would be superior (“makes my life seem better by comparison”), and nobody had much evidence either way.
(I think your earlier point was spot-on, by the way.)
Well, I think it was pretty foreseeable that for some folks who were already on edge, feeding them more bad news might have a serious impact.
There’s no point in doing the experiment if you are certain of the outcome. If you are certain nobody’s going to get more negative, you don’t do the study. If you do the study, it is because you believe that some people may become more negative (or more positive), and it is foreseeable that more negativity may be harmful.
I get the you’d-mess-up-the-results-by-providing-full-informed-consent thing, but then, at the VERY LEAST, after the study is over, the experimenter should let the subjects know…oh…by the way…if you happen to feel lousy…well…maybe it’s not you…I was experimenting on you…here’s what I did…hope that helps ya get the treatment ya need.
Also, I believe that experimenters are generally on the hook for providing some kind of treatment for damage inflicted by the experiment. I haven’t seen any mention of any mechanism for following up on these subjects and providing necessary treatment. In this particular study, probably it’s not a big deal, but I don’t see FB as shying away from more dangerous studies along similar lines, where it will be more unconscionable to leave subjects to pick up the pieces on their own, with no information about what just hit them.
Yes, deception-based IRB approvals are certainly possible, but tricky.
Has anyone addressed the question of the alternative possibility of retrospective analysis? Given the size of their dataset, Facebook could have mined for sequences involving predominantly positive or negative affect prior to posting. This would have given them the data they needed without manipulation, thus avoiding some of the issues.
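For concreteness, here is a minimal sketch in Python of the kind of purely observational analysis this comment describes: correlate the affect a user happened to be exposed to with the affect of the user’s own subsequent posts, with no manipulation of anyone’s feed. The field names and numbers below are invented placeholders for illustration, not Facebook data or code.

```python
from statistics import mean

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-user records: fraction of positive posts the user happened
# to see in week 1, and fraction of positive posts the user then wrote in
# week 2. Illustrative placeholders only.
observations = [
    {"seen_positive_frac": 0.72, "wrote_positive_frac": 0.61},
    {"seen_positive_frac": 0.40, "wrote_positive_frac": 0.35},
    {"seen_positive_frac": 0.55, "wrote_positive_frac": 0.50},
    {"seen_positive_frac": 0.20, "wrote_positive_frac": 0.28},
]

exposure = [o["seen_positive_frac"] for o in observations]
response = [o["wrote_positive_frac"] for o in observations]
print("exposure vs. later posting:", round(pearson(exposure, response), 3))
```

Of course, an observational correlation like this cannot separate contagion from confounds as cleanly as a randomized intervention can, which is presumably part of why the experiment was attractive; the point is only that some of the question could have been studied without manipulating feeds.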
Ed:
I’m generally of the belief that this behavior is to be expected of companies like Facebook, and that the differences between research institutions and private companies are the key concern here.
That said, I’m wondering about what would be involved in meaningful informed consent here. Although I haven’t heard this argument made, FB might have argued (even to a University IRB) that any informed consent discussing the planned intervention might have jeopardized the ecological validity, in that participants would be biased by the terms of the consent. This, of course, might lead to a subsequent discussion with the IRB about the appropriateness of such a study.
I do fear a backlash from this discussion – my guess is that this sort of research will continue unabated at Facebook and other organizations, but they will keep it all internal, forgoing the stickiness of IRBs and public discussion. This would avoid further controversy, but it might not be a great resolution.
Researchers sometimes have a convincing argument that it is necessary to withhold some information during the consent process, or even to deceive subjects. Those claims of necessity need to be justified, and standard protocols require telling the subjects afterward what was really going on. It’s possible that these researchers could have gotten clearance to withhold information, but it seems unlikely that they could have gotten clearance to proceed without asking for some kind of specific consent (even if based on limited disclosure), or without informing subjects about the experiment afterward.
Yes, the problem was that they revealed themselves and their motives and wrongdoings. That they ruthlessly manipulated people who don’t happen to know the garbage, using, and abuse common in the business world is perfectly fine. It reminds me of some time ago when the Vatican basically treated the ongoing molestation revelations as a PR issue. Clearly we would all be better off if no one ever found out and if no one knew that being molested was a gross violation and abuse, and something that is sadly very common. It’s great to live in a society where people question if this is really a problem. Welcome to the society of sociopaths and the autistic, where the fact that you should expect to be abused makes abuse OK and it’s your own fault for not knowing everyone is an abusive con man…
Yeah, irony, it’s too disturbing for any other approach.
That presumes that changing an algorithm amounts to “ruthless manipulation” and/or “abuse”. I just don’t see it.
We’ve got three algorithms; the one they normally use (A), and two experimental ones (B and C). You’re defining A as “perfectly fine” and B and C as “ruthless manipulation”.
It could just as easily have happened that B was the algorithm in normal use; the only differences, after all, amount to choices made at a time when the significance of those choices was presumably not at all well understood. In some alternate universe, FB made different choices, and B became the standard.
In which case this experiment might have tried A and C, and presumably you’d then be complaining that B was “perfectly fine” and that A and C were “ruthless manipulation”. How can the only distinction between which algorithm is acceptable and which is manipulative amount to a historical accident?
Another (slightly different) take on this: http://xkcd.com/1390/
I’m not entirely convinced that the objections (except perhaps the “laundering” allegations against Cornell) make sense.
Clearly FB are entitled to change their algorithms if they think they may have come up with a better approach; it is perfectly normal to roll out such changes piecemeal to reduce the risks of major catastrophe; and it only makes sense to monitor the results to decide whether or not the change was in fact an improvement.
So it pretty much seems that the only problem is that they published their results?
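As a side note on the “roll out such changes piecemeal” practice mentioned above, here is a minimal sketch of how staged rollouts are commonly done: deterministically hash each user id into a bucket so that only a small, stable fraction of users sees the changed algorithm while its effects are monitored. The function and names are hypothetical, not a description of Facebook’s actual infrastructure.

```python
import hashlib

def rollout_bucket(user_id: str, rollout_percent: int) -> str:
    """Deterministically assign a user to the new or the current feed algorithm.

    Hashing the id (rather than flipping a coin on every visit) keeps each
    user's experience stable while the change reaches only a small fraction
    of users, whose metrics can then be monitored.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99
    return "new_algorithm" if bucket < rollout_percent else "current_algorithm"

# Example: roll the change out to roughly 5% of users.
print(rollout_bucket("user-12345", rollout_percent=5))
```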
But that’s not what happened. It is not the case here that Facebook thought they had come up with a better approach and wanted to roll out those changes piecemeal to avoid the risks of a catastrophe. What they did here is to subject two groups of users to opposite stimuli, to see what happened.
The distinction seems minimal. Given three possible approaches (the status quo and the two opposite variants) and some doubt about which is best, why not try all three and see how they do?
On the other hand, if it is clear that FB never had any intention of changing to either of the alternate approaches no matter what the results were, i.e., the *only* reason to do it was “to see what happened” as you put it, then yes, that’s academic research and it makes sense to apply the relevant ethical standards.
I do also have some vague ethical concerns (speaking with my IT hat on) which I think hover around the need for reproducibility, but they’re too inchoate to say much about at the moment.
It occurs to me…I don’t know why we should all be so bent out of shape about Facebook’s little emotions experiment…I mean…this is core FB functionality. Its whole business model is to shape our opinions in favor of buying various products or ideas, so why should we be surprised? We’re just offended because clearly our shopping decisions are rational, and emotions are…well…special.
(Irony alert)
It is worth looking into anthropology’s code of ethics. If the experiment qualified as anthropological research, it would unequivocally violate that code.
I agree with Zeynep’s conclusion that the implications of these practices are much larger than the research. If Facebook discovers that reading about bad things makes many people (or even person X in particular) less likely to engage with the platform, it seems likely that they would not show (as many of) such posts. I’m not sure censorship is the right word for this activity, but I wonder how manipulating the media people can see fits into the context of what media companies have done historically – is there any writing on this? Also, how far does that manipulation strategy logically extend? Do those who use Facebook as an outlet for negative emotions or to talk about horrible things in the world get isolated to some corner of the internet while the rest of the soma-consuming people look at GIFs of kittens? Or does the ability of Facebook to personalize this manipulation not really add anything to what print, radio, and video media producers have always done with information?
Interested to hear people’s thoughts!
Just reading up on all this on several sites… Finally a comment that mirrors my own thoughts.
The explanation from the study’s lead author, Adam Kramer, for why the study was conducted is the most disturbing thing of all in the revelations that such an experiment was conducted in the name of research. Just what exactly will Facebook and other social media sharing sites do with such information? Will they create algorithms intended to manipulate our social interactions with friends and family for the sake of their profit? Most certainly that is the goal as stated by Adam Kramer when he said:
“We felt that it was important to investigate the common worry that seeing friends post positive content leads to people feeling negative or left out. At the same time, we were concerned that exposure to friends’ negativity might lead people to avoid visiting Facebook. We didn’t clearly state our motivations in the paper.”
So, if Facebook feels my friends’ positive content may cause me to feel negative or left out, and thus to avoid Facebook (reducing their advertising potential), the algorithm will filter out positive news that my friends attempt to share with me; and if my friends’ negative content may cause me to avoid Facebook, again Facebook’s algorithm censors what it allows my friends to share with me so that I won’t avoid seeing Facebook advertising.
Later Adam states: “The goal of all of our research at Facebook is to learn how to provide a better service.” Were Adam sincere about wanting to clearly state their motivations he would have stated: “The goal of all of our research at Facebook is to learn how to provide a better service for our advertisers.”
Also, an interesting quote from a linguistics expert:
” I’m sceptical of it being anything meaningful, given that they used LIWC which is completely dependent on people using words in the manually-defined LIWC dictionary, and spelling them in the way the LIWC authors expected. And we all know how standard people’s vocabulary & spelling is on social media.
(We recently tested LIWC on our xxxxxxx & found it 30-40% accurate. Or rather, 60-70% bxxxxxxs)
And never mind the ethics committee, how did the reasoning “use affective text -> experience the same emotions” get past any reviewers?”
I guess PNAS clearly wanted to beat Nature and Science in a PR move.
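To make the quoted criticism of LIWC concrete, here is a toy sketch of dictionary-based affect counting, the general technique LIWC uses; the word lists below are invented for the illustration and are nothing like LIWC’s actual dictionaries. A token only scores if it matches a dictionary entry, so nonstandard social-media spellings simply vanish from the tally.

```python
# Toy sketch of dictionary-based affect counting (the general LIWC-style idea).
# These word lists are invented for the illustration.
POSITIVE = {"happy", "great", "love", "awesome"}
NEGATIVE = {"sad", "awful", "hate", "terrible"}

def affect_counts(text):
    """Count tokens that exactly match the positive/negative word lists."""
    tokens = text.lower().split()
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return pos, neg

print(affect_counts("so happy today love it"))       # (2, 0): exact matches count
print(affect_counts("soooo happpy todayyy luv it"))  # (0, 0): nonstandard spellings miss entirely
```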
Research ethics aside, this story also shines a light on two huge gaps: one between what most users understand about what Facebook does with their posts and their News Feeds, and the reality of Facebook’s practices; and the other between consumers’ understanding and data scientists’ perception of that understanding. Like many researchers who work in the tech field, Tal Yarkoni made mistaken assumptions about the latter gap, and he acknowledged that in a later tweet: “joking aside, one thing I had very wrong in my post: a surprising number of people clearly *didn’t* realize this stuff was going on.” Do data scientists have an ethical duty to explore, first, what their subjects understand and can consent to?
And note that this is currently the frontrunning explanation for what happened in this particular case. Kashmir Hill’s latest article has a claim from the PNAS reviewer that the academic authors received IRB approval only for “analysis of a pre-existing dataset”, evading IRB ethics review of the intervention itself.
James Grimmelmann points out on Twitter that this “IRB laundering” logic would appear to ethically justify a “brick manipulation study”, where the study was “Hit people with bricks to see if they bleed”, if being hit with bricks were allowed by the Facebook Data Use Policy.
Yes, I picked up the term “IRB laundering” from James. I was hoping to say more about that topic in the main post but ran out of space.
See also Michelle Meyer’s analysis, which goes more into the ins and outs of IRB procedures and rules.
My main issue wasn’t with the research ethics per se! I’m in the camp that sees it as “ordinary practices for a big online company like Facebook.”
My main concern is what these ordinary practices mean in general, socially and politically. So, I want to do what you say in the penultimate sentence: “breathe some life into the discussion about the implications of this kind of manipulation outside of the research setting.”