August 15, 2018

Ancestry.com can use your DNA to target ads

With the reduction in costs of genotyping technology, genetic genealogy has become accessible to more people. Various websites such as Ancestry.com offer genetic genealogy services. Users of these services are mailed an envelope with a DNA collection kit, in which users deposit their saliva. The users then mail their kits back to the service and their samples are processed. The genealogy company will try to match the user’s DNA against other users in its genealogy and genetic database. As these services become more popular, we need more public discourse about the implications of releasing our genetic information to commercial enterprises.

Given that genetic information can be very sensitive, I found that the privacy policy of Ancestry’s DNA services has some surprising disclosures about how they could use your genetic information.

Here are some excerpts with the worrying parts in bold:

Subject to the restrictions described in this Privacy Statement and applicable law, we may use personal information for any reasonable purpose related to the business, including to communicate with you, to provide you information about Ancestry’s and AncestryDNA’s products and services, to respond to your requests, to update our product offerings, to improve the content and User experience on the AncestryDNA Website, to help you and others discover more about your family, to let you know about offers of interest from AncestryDNA or Ancestry, and to prepare and perform demographic, benchmarking, advertising, marketing, and promotional studies.

To distribute advertisements: AncestryDNA strives to show relevant advertisements. To that end, AncestryDNA may use the information you provide to us, as well as any analyses we perform, aggregated demographic information (such as women between the ages of 45-60), anonymized data compared to data from third parties, or the placement of cookies and other tracking technologies… In these ways, AncestryDNA can display relevant ads on the AncestryDNA Website, third party websites, or elsewhere.

The privacy policy gives Ancestry permission to use its users’ genetic information for advertising purposes. When I inquired with Ancestry, they pointed to the following part of their privacy policy:

We do not provide advertisers with access to individual account information. AncestryDNA does not sell, rent or otherwise distribute the personal information you provide us to these advertisers unless you have given us your consent to do so.

However, it is not clear how your personal information can be used to display “relevant ads” unless either Ancestry operates as an ad network itself or Ancestry communicates some personal information to third party advertisers in order to target the ads. Below, I expand on concerns raised by this privacy policy:

Users may “consent” to the use of their genetic data unknowingly. The privacy policy says Ancestry can distribute users’ private information if Ancestry gets permission first. That permission could be granted by a dialog that users click through without much thought. Research has shown that users are already desensitized to privacy and security warnings.

Even if only Ancestry is using the personal information to target ads, the data might accidentally find its way to third parties. Researchers have demonstrated how it can be difficult to avoid information leakage through URLs or cookies or more sophisticated attacks. If Ancestry categorizes its users according to their genetic traits and then stores and transfers these categories in cookies and URL parameters (a common practice for the analogous “behavioral segment” categories used for many targeted ads), then the genetic data can easily leak to third parties.
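To make the leakage path concrete, here is a minimal sketch of how a category stored in a URL parameter could reach a third party through the Referer header. Everything in it is hypothetical: the domain `genealogy.example`, the `seg` parameter, and the `trait_group_7` label are invented for illustration and are not taken from Ancestry's site. It also assumes a referrer policy that sends the full URL (for example `no-referrer-when-downgrade` or `unsafe-url`).

```python
from urllib.parse import urlsplit, parse_qs

def segments_visible_to_third_party(referer: str, param: str = "seg") -> list:
    """Return the values of `param` that anyone receiving this Referer value can read."""
    query = urlsplit(referer).query
    return parse_qs(query).get(param, [])

# Hypothetical page URL that carries a genetics-derived category as a parameter.
page_url = "https://genealogy.example/results?user=123&seg=trait_group_7"

# Assumption: the browser sends the full page URL as the Referer when it
# loads a third-party resource (ad, tracker, embedded widget) on that page.
referer_header = page_url

print(segments_visible_to_third_party(referer_header))
```

The third party needs no access to the genealogy site's database here; parsing the header it already receives is enough to recover the category.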

The genetic data collected by these services may endanger the privacy of users and their families. A genome is not something easily made unlinkable. Only 33 bits of entropy are necessary to uniquely identify a person. The DNA profiles used by law enforcement in the US today take samples from 13 locations on the genome, and have about 54 bits of entropy. The test that Ancestry uses samples 700,000 locations on the genome, which will likely have much more than 33 bits of entropy. In fact, I believe this is enough entropy to compromise not only an individual’s privacy, but also the privacy of family members. With the 13 CODIS locations, law enforcement can already do familial searches for close family members. I hope to touch on the familial aspects of DNA privacy at a later date. The compromise of familial privacy is in part what makes collecting and distributing DNA even more sensitive than just collecting an individual’s full name or address.
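The 33-bit figure follows from simple arithmetic: it is the smallest number of bits that gives every person on Earth a distinct label. The sketch below assumes a world population of roughly 7.6 billion (a rough 2018 estimate, not a figure from this post) and treats each bit as fully informative, which real genetic markers are not.

```python
import math

def bits_to_identify(population: int) -> int:
    """Smallest number of bits that gives every member of `population` a distinct label."""
    return math.ceil(math.log2(population))

# Assumption: rough 2018 world population estimate.
WORLD_POPULATION = 7_600_000_000

print(bits_to_identify(WORLD_POPULATION))

# Number of distinct values a 54-bit profile could take, per person alive.
print(2 ** 54 // WORLD_POPULATION)
```

Under these assumptions a 54-bit profile has millions of possible values per living person, which is why a profile with far more than 33 bits of entropy is effectively a unique identifier.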

Genetic data can be used to discriminate against people on the basis of characteristics they cannot control. More than identity, DNA data may allow someone to infer behavior and health attributes. Major concerns about the impact of genetic information on employment and health insurance led Congress to pass the Genetic Information Nondiscrimination Act, which makes it illegal to use genetics to decide hiring or health insurance pricing. However, GINA may not effectively deter people who 1) are not employers or insurers (e.g., landlords discriminating in their choice of tenants, which is prohibited by California state law but not by the federal provisions in GINA); 2) do not believe they will be caught; or 3) are not aware that they are discriminating, as discussed next.

Unintentional discrimination may occur. The big data report from the White House warns that the “increasing use of algorithms to make eligibility decisions must be carefully monitored for potential discriminatory outcomes for disadvantaged groups, even absent discriminatory intent.” An algorithm that takes genetic information as an input likely will lead to results that differ based on genes. This outcome already discriminates on the basis of genetics, and because genes are correlated with other sensitive attributes, it can also discriminate on the basis of characteristics such as race or health status. The discrimination occurs whether or not the algorithm’s user intended it.

Comments

  1. Very interesting article. I’m fascinated by the lightning fast morphing transformation of our DNA data gathering around the globe and its potential detrimental uses. More please.

  2. I think you’re reaching quite a bit here. Furthermore, I think there is a gathering cottage industry of fear-mongering over “DNA” going on around the web, and I call it fear-mongering because I have yet to see someone give an example of a person who really has been aggrieved through one of these genealogy DNA tests in the manner given in the scary scenarios.

    The only thing that has happened to date is people discovering they have relatives they never thought they had. Of course the actual DNA test-taker has a “discovery risk” (e.g., not being the child of the person they thought was their biological parent), but that is a pretty clear risk up front to whoever wants to dive into this kind of DNA testing.

    As any user of ancestry.com can attest, ancestry.com is always sending out marketing emails in an attempt to get you to use one of their associated products (like Fold3.) That is true of AncestryDNA too, as ancestry.com will try to get you to buy additional tests if you have already bought one (I know first hand.) AncestryDNA does not provide a back-end API to anyone’s data, so there is no risk of disclosure of genotype data. I have yet to even knowingly experience AncestryDNA selling my name to some other company who wants to market additional products (in contrast to, say, being inundated by Facebook promotions or Google targeting adverts to me.)

    • “AncestryDNA does not provide a back-end API to anyone’s data, so there is no risk of disclosure of genotype data.”

      The same could be said: “Ashley Madison does not provide a back-end API to anyone’s data, so there is no risk of disclosure of adulterous data.”

      And yet how absurd would that statement be today? I distrust “targeting advertising” but that is generally the least of my worries. It’s more of the insecurity of just handing over tons of data. I would NOT choose to do the DNA thing because of the insecurity of their data. And let’s face it; in reality they are NOT any more secure than AM was; and likely even less so; because they don’t think about the implications of the data they collect. Their privacy statement is a boilerplate statement; one designed by lawyers to keep them from having legal problems over their use of private information. It really grants them the “authority” (or “legal immunity”) to do anything they want which is whatever makes them money.

      As for the stated purpose of advertising, yes, right now they only try to sell their products to users; because that makes them more money. That is, until, just like doctors and insurance companies, they find they can make even more money if they sell that data to others. Even though it is supposedly “against the law” for doctors or insurance companies to do it; there are loopholes in the laws that allow them to so long as they put it in their (in healthcare parlance) HIPAA Disclosure Statement; which in the computer industry is the same as a “Privacy Policy.”

      I signed up for either Ancestry.com or a sister site; to help my mother print off some stuff (since web developers forget that many older folks like hard copies); and I know how to get around coders’ inability to create printable web pages. Even though all I provided was my name and e-mail address I commonly get advertisements from them to buy things; eventually I expect to have to expunge that e-mail address when I start to get advertisements from other groups; as will happen eventually. Might take a few years; but as soon as Ancestry.com realizes there are no legal penalties for sale of private and personal information they WILL sell it. They already say they may; even while they claim they won’t.

      But the real concern is what people with nefarious desires will do when they get hold of the data. Ancestry.com is not even as secure as they claim.

    • P.S. I want to clarify two sentences of my post. “in reality they are NOT any more secure than AM was; and likely even less so; because they don’t think about the implications of the data they collect. Their privacy statement is a boilerplate statement”

      This should read:

      In reality, I can properly infer that Ancestry.com is even LESS secure than AM was. Whereas AM knew the data they held would have certain implications if it were disclosed and thus attempted to secure their data; Ancestry.com apparently does not even realize the implications of disclosure of DNA data. This inference can be made from several points; but the largest of which is that rather than detail their acknowledgement of need for privacy of DNA data in their Privacy Policy they settled for a boilerplate language used by any company that plans to make money via advertisements and marketing. When a company does not even think about privacy and security they would never bother to actually even try to secure their data.

  3. In the Details says:

    I think it’s not quite as clear as it seems – and is probably somewhere in between Paul’s view and Puzzled’s. The difference hinges on a distinction not made clear in the piece.

    AncestryDNA is saying they can use your information – including DNA information – for targeting advertising and such. They’re most likely not providing the actual DNA information to ad networks – rather they’re using your DNA information to lump you into categories for advertising purposes and letting the ad networks know which categories you belong to.

    So the key question on that issue becomes what categories they use and how finely grained they are.

  4. Jack Peterson says:

    Ancestry.com can also provide your mother’s maiden name to anyone who is interested.

    • Hence why you should never pick “My mother’s maiden name” as a supposed “security question” on any website, or at least never answer such a question with a real answer. The same goes for any other publicly available data. Most websites don’t think about the reality that most of their predetermined questions ask for information that is publicly known or publicly available.

      The PROPER way to do such security challenge and response (and as of yet, no website that I have ever run across actually does this) is to have the user create the challenge and the response (both the question and the answer); so the user can pick something that he/she knows is actually personal private information rather than something of public record.

      How come no one follows “best practices” for security in the computer industry? Same as I said above to Puzzled; because they don’t THINK about the implications of the data; rather they just assume they know what to do based on what they have seen in other places.

  5. Thank you for another magnificent post. Where else could anyone get that type of info written in such an ideal way?
    I have a presentation next week, and I’m on the lookout for such info.

  6. Guessing I’m the only one here who works in the digital ad industry. If you look at any company’s ad policy, it says the same thing the author points out here. There’s no way for anyone to use your DNA to match you to ads. Think about the big companies in advertising and data: Google, Facebook, Yahoo, Experian, etc. All this policy talks about is using personal information to target ads based on common targeting data: demographics (age, geo, gender) and behavioral signals (ads you click on, searches you make, sites you visit, etc.). If you’re that concerned about ads, don’t use Google, Yahoo or Bing. Or better yet, do a “Ron Swanson” and toss your computer out the window. Relevant link: https://www.youtube.com/watch?v=ZP7ebyqBn_M