April 23, 2014

On Facebook Apps Leaking User Identities

The Wall Street Journal today reports that many Facebook applications are handing over user information—specifically, Facebook IDs—to online advertisers. Since a Facebook ID can easily be linked to a user’s real name, third party advertisers and their downstream partners can learn the names of people who load their advertisement from those leaky apps. This reportedly happens on all ten of Facebook’s most popular apps and many others.

The Journal article provides few technical details behind what they found, so here’s a bit more about what I think they’re reporting.

The content of a Facebook application, for example FarmVille, is loaded within an iframe on the Facebook page. An iframe essentially embeds one webpage (FarmVille) inside another (Facebook). This means that as you play FarmVille, your browser location bar will show http://apps.facebook.com/onthefarm, but the iframe content is actually controlled by the application developer, in this case by farmville.com.

The content loaded by farmville.com in the iframe contains the game alongside third party advertisements. When your browser fetches an advertisement, it automatically sends the third party advertiser “referer” information, that is, the URL of the page that is loading the ad. For FarmVille, the referer URL that is sent will look something like:

http://fb-tc-2.farmville.com/flash.php?…fb_sig_user=[User’s Facebook ID]…

And there’s the issue. Because of the way Zynga (the makers of FarmVille) crafts some of its URLs to include the user’s Facebook ID, the browser will forward this identifying information on to third parties. I confirmed yesterday evening that using FarmVille does indeed transmit my Facebook ID to a few third parties, including Doubleclick, Interclick and socialvi.be.
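
To make the leak concrete, here is a minimal Python sketch of what any third party receiving such a request could do with the referer. The URL is a made-up example: the ID 1234567890 and the `foo=bar` parameter are placeholders rather than values from a real FarmVille request, and the function name `extract_facebook_id` is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

def extract_facebook_id(referer):
    """Return the fb_sig_user value from a referer URL, or None if absent."""
    query = urlsplit(referer).query
    values = parse_qs(query).get("fb_sig_user")
    return values[0] if values else None

# Hypothetical referer value; the ID and the extra parameter are invented.
referer = "http://fb-tc-2.farmville.com/flash.php?foo=bar&fb_sig_user=1234567890"
print(extract_facebook_id(referer))
```

Nothing here requires cooperation from the application: the ad server only has to read a header the browser sends on its own.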

Facebook policy prohibits application developers from passing this information to advertising networks and other third parties. In addition, Zynga’s privacy policy promises that “Zynga does not provide any Personally Identifiable Information to third-party advertising companies.”

But the evidence clearly indicates otherwise.

What can be done about this? First, application developers like Zynga can simply stop including the user’s Facebook ID in the HTTP GET arguments, or they can place a “#” mark before the sensitive information in the URL. Browsers leave out everything after the “#” (the URL fragment) when they send the referer, so this information would no longer be transmitted automatically to third parties.
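
The effect of the “#” approach can be sketched in a few lines of Python. This only approximates browser behavior by removing the fragment from the page URL; the host `game.example`, the ID, and the function name `referer_sent_for` are all hypothetical.

```python
from urllib.parse import urldefrag

def referer_sent_for(page_url):
    """Approximate the referer a browser would send for this page.

    Browsers drop the fragment (the "#..." part) before sending the header.
    """
    return urldefrag(page_url).url

# Hypothetical URLs; the host and the ID are placeholders.
leaky_url = "http://game.example/flash.php?fb_sig_user=1234567890"
safer_url = "http://game.example/flash.php#fb_sig_user=1234567890"

print(referer_sent_for(leaky_url))  # ID is still in the query string
print(referer_sent_for(safer_url))  # fragment removed, so no ID
```

The trade-off is that the fragment is not sent to the application’s own server either, so script in the page would have to read the ID and pass it along some other way.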

Second, Facebook can implement a proxy scheme, as proposed by Adrienne Felt more than two years ago, where applications would not receive real Facebook IDs but rather random placeholder IDs that are unique for each application. Then, application developers would be free to do whatever they want with the placeholder IDs, since they could no longer be linked back to real user names.
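
As a rough illustration of the placeholder idea (not Felt’s actual design, and not anything Facebook is known to run), the platform could keep a private table that hands each application its own random ID for a user. The class name, method names, and application names below are invented for this sketch.

```python
import secrets

class PlaceholderDirectory:
    """Hypothetical platform-side table of per-application placeholder IDs."""

    def __init__(self):
        self._to_placeholder = {}  # (app_id, real_id) -> placeholder
        self._to_real = {}         # (app_id, placeholder) -> real_id

    def placeholder_for(self, app_id, real_id):
        """Return the stable random ID that this app sees for this user."""
        key = (app_id, real_id)
        if key not in self._to_placeholder:
            placeholder = secrets.token_hex(16)
            self._to_placeholder[key] = placeholder
            self._to_real[(app_id, placeholder)] = real_id
        return self._to_placeholder[key]

    def resolve(self, app_id, placeholder):
        """Map a placeholder back to the real ID; only the platform can do this."""
        return self._to_real.get((app_id, placeholder))

directory = PlaceholderDirectory()
farm_id = directory.placeholder_for("farm-app", "1234567890")
quiz_id = directory.placeholder_for("quiz-app", "1234567890")
```

Because the two applications see different random values for the same person, a leaked placeholder cannot be looked up on Facebook or matched against another application’s data without access to the platform’s table.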

Third, browser vendors can give users easier and better control over when HTTP referer information is sent. As Chris Soghoian recently pointed out, browser vendors currently don’t make these controls very accessible to users, if at all. This isn’t a direct solution to the problem but it could help. You could imagine a privacy-enhancing opt-in browser feature that turns off the referer header in all cross-domain situations.
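
The rule such a browser feature would apply might look like the following Python sketch. It treats “cross-domain” as “the two hostnames differ,” which is a simplifying assumption; a real browser would need a more careful definition. The function name and the `.example` hosts are hypothetical.

```python
from urllib.parse import urlsplit

def referer_to_send(page_url, request_url):
    """Return the referer for a request, or None when it crosses hosts."""
    page_host = urlsplit(page_url).hostname
    request_host = urlsplit(request_url).hostname
    if page_host == request_host:
        return page_url
    return None  # cross-domain request: send no referer at all

# Hypothetical page and resource URLs.
page = "http://game.example/flash.php?fb_sig_user=1234567890"
print(referer_to_send(page, "http://game.example/sprite.png"))
print(referer_to_send(page, "http://ads.example/banner"))
```

Under this rule the game’s own server still sees where its requests come from, while the ad server receives no referer and therefore no Facebook ID.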

Some may argue that this leak, whether inadvertent or not, is relatively innocuous. But allowing advertisers and other third parties to easily and definitively correlate a real name with an otherwise “anonymous” IP address, cookie, or profile is a dangerous path forward for privacy. At the very least, Facebook and app developers need to be clear with users about their privacy rights and comply with their own stated policies.

Comments

  1. rp says:

    “At the very least, Facebook and app developers need to be clear with users about their privacy rights and comply with their own stated policies.”

    No, they don’t. That would impair the value of the information that they’re passing to advertisers.

    If there were some penalty for failure to carry out their own stated policy — say, a requirement to disgorge all revenues obtained by the sale of personally-identifying information provided by users relying on their fraudulent claims of non-disclosure, or prosecution for violating data-protection laws — there might be some reason for a change. But as things stand now, the status quo is most profitable.

  2. Anonymous says:

    There are other serious privacy violations by Facebook. For example, section 3.3 in this paper – http://saikat.guha.cc/pub/imc10-ads.pdf – if true, has serious implications. The researchers created different profiles on Facebook where personal information such as gender and sexual orientation differs between the profiles but is kept confidential, and they found that the ads shown on the profiles differ significantly.

  3. OMD says:

    I work in online marketing and two things are clear about Facebook. First, they really just don’t care very much about privacy. I don’t mean flagrantly don’t care, I just mean that it’s not a serious concern. The PR people shed some crocodile tears when it’s required and have Zuck sit around on stage giving awkward interviews in his hoodie, maybe make a donation, but that’s not indicative of anything, it’s just PR. Second, the company has a long history, well known in the industry, of pushing the envelope really, really far for what is supposed to be a legitimate company and of repeatedly exposing information, overriding user settings and generally being fairly careless with user data.

    The amazing thing is the number of people who are either dazzled by “high tech” or are just digerati bobble heads who seem to have an emotional stake in carrying water for the company and Zuck. It’s not an evil or good corporation, it’s a discombobulated corporation that’s growing faster than it can keep up with at times and that has a completely amoral business ethos.

    They’ll continue to have privacy gaffes because there’s virtually no repercussion for having them and because, on occasion, it makes business sense for them to open up more information to the outside world either for internal marketing reasons or for channel marketing reasons having to do with advertisers’ desires.

    The one thing that plays absolutely no part is the literary nonsense spouted by a number of hoodwinked reporters about some kind of revolution in online privacy or whatever other ephemera you hear on Charlie Rose and from the outside-the-industry tech journalists who rewrite press releases for the hype mags or pen heady fables for the old line print outfits. The former just want traffic and the latter just want there to be a narrative they can hitch their star to, but there is no simplistic, fairy-tale narrative. It’s a combination of chaos and amoral, cold business calculation, nothing more. And it will happen again and again until they actually face some real consequence besides a few thousand people canceling accounts (surely to mostly return) – something they care about only enough to pay lip service, before getting caught in yet another privacy blowup. It’s really a good thing for them that they’re a US corporation; Europe’s consumer protection outfits would have a field day with them.

  4. idlemind says:

    There are very good reasons why Facebook doesn’t want info to leak to advertisers, and it has to do with how they sell advertising. The second commenter has illustrated why: ads are targeted to specific “demographics” including not just age and gender, but various interests as exposed by online behavior. But this data isn’t sold; the commenter is right that such a privacy violation would widely be considered unethical and, in some places, illegal. What Facebook, Google, Yahoo, and so on do is charge advertisers money to show an ad to specific groups, but the ads are delivered in such a way that the advertiser doesn’t know the identity of those viewers the ad is shown to. They do this by controlling the server the ad is delivered from, or by setting up an anonymous handoff to the advertiser’s servers and placing the result in a designated spot (an “iframe”) on the screen.

    A lot of people think that these companies sell identities and demographic info, but in fact what they are doing is letting advertisers rent access to a given anonymous demographic. Unlike selling the info, this model generates a continuous revenue stream. If advertisers could target you directly, they would be unwilling to pay the rates Facebook, et al, are charging them for targeting. The price of displaying a highly targeted ad can literally be a hundred times that of an untargeted ad. Advertisers have their own ID for you attached to your browser (check out your “cookie file” if you don’t believe me), and if they could link that with the demographic info Facebook has, they could do all the targeting themselves and not pay Facebook for anything but untargeted ad space.

    So in this case I’d think Facebook would be highly motivated to prevent apps from leaking identities to advertisers. Although it’s not like a direct transfer of the targeting info, bits and pieces would be added to advertisers’ own databases over time — which cuts into the lucrative revenue stream Facebook gets from targeted ad buys. But rejiggering the app platform to make it impossible for apps to leak IDs would be pretty difficult and would encounter a lot of grief from app developers. It will be interesting to see what Facebook comes up with.

  5. Anonymous says:

    There’s a simple solution for users:
    Don’t use these apps!

    It’s pretty obvious from the get-go that the whole purpose of the apps is to gather information for advertisers. For instance, all those questions about which movie star you most resemble or which city most personifies you – the answers to the questions help develop a psychological profile of you that is more valuable than demographic information, because it speaks volumes for the kind of consumer you are.

    I brought this up with several of my FB friends who kept trying to involve me in their games & such, and they all said they have nothing to hide.

    Dumb. Really really dumb.

  6. Udi Barone says:

    Here is an idea for how to solve it all.

    The problem is that today, to run well-targeted ads, you must have user data and also the smart algorithms to match the most relevant ad based on this user data. Facebook has both capabilities, and the ad networks need to compete with that. Moreover, at the end of the day the ad networks have to pay the app developer (the publisher) while still making the highest profit for themselves.

    Problem – loose ends

    Today on one hand the app developer has access to the user data BUT he is forbidden to pass it to anyone! On the other hand the ad-networks have the smart algorithms but have no access to the user-data.

    Solution – empower the app dev and decouple the user data and the 3rd parties

    Give the app developers targeting technology capable of analyzing the Facebook user data in real time and matching it with the most relevant ad. This way, if we have, for example, a user interested in sports, the app can reach this conclusion on its own (using its algorithms) and request the ad from the 3rd party ad network by passing it just a tag indicating “sports”.

    The benefit of such a solution is that the user data is accessed by the app only, the data doesn’t go to any 3rd party, and the user can opt out by blocking the app and instantly have no privacy issues.

    If you are interested in hearing more about such technologies, please feel free to contact me.

  7. AnonLurker says:

    Can I request the folks who discovered this (or anyone else who has time on their hands to verify it and notify Facebook) to file a complaint with TRUSTe? Facebook is certified by TRUSTe, and as a condition of participation in TRUSTe’s program, is required to comply with their own privacy policy. TRUSTe investigates complaints and accusations that companies they certify are violating their own privacy policy.

    To file a complaint, go to truste.com and click on “File a Watchdog Complaint”.

  8. anononymous says:

    http://techcrunch.com/2010/10/20/google-facebook-disconnec/

    It’s called Facebook Disconnect, and it’s supposed to stop this sort of thing.

    It appears to work, but if anyone could do some more verification that’d be pretty sweet!