May 25, 2017

Archives for September 2015

Freedom to Tinker on the Radio

Today on the Canadian Broadcasting Corporation’s CBC Radio show, “The Current”, a 20-minute segment about the freedom to tinker:

“Arrested, for tinkering.  Young Ahmed Mohamed likes to take things apart, cross wires, experiment… and put things back together again. It’s the kind of hobby that once led to companies like…say, Apple and Microsoft. But is a security-centric culture interfering with the freedom to tinker?”

Radio host Piya Chattopadhyay interviews three panelists:

  • Lindy Wilkins, community technologist and the co-founder of Make Friends, a monthly meet-up of makers and community organizers in Toronto,
  • Alexandra Samuel, independent technology researcher in Vancouver who is working on a book about Tinkering and education for kids,
  • Andrew Appel, Professor of Computer Science at Princeton University and blogger at Freedom-to-Tinker.

When I was Ahmed’s age, back in 1973, I read this really cool article in Scientific American’s Amateur Scientist column, about how to use TTL integrated circuit components to make, for example, a clock.  So I went to Radio Shack to buy the parts, I learned how to use a soldering iron, and I built a clock.

Didn’t get arrested.  Was that because I was white, because I went to a school where the teachers had some sense, because it was before 9/11 and mass school shootings, or all of the above?

“Private blockchain” is just a confusing name for a shared database

Banks and financial institutions seem to be all over the blockchain. It seems they agree with the Bitcoin community that the technology behind Bitcoin can provide an efficient platform for settlement and for issuing digital assets. Curiously, though, they seem to shy away from Bitcoin itself. Instead, they want something they have more control over and doesn’t require exposing transactions publicly. Besides, Bitcoin has too much of an association in the media with theft, crime, and smut — no place for serious, upstanding bankers. As a result, the buzz in the financial industry is about “private blockchains.”

But here’s the thing — “private blockchain” is just a confusing name for a shared database.

The key to Bitcoin’s security (and success) is its decentralization which comes from its innovative use of proof-of-work mining. However, if you have a blockchain where only a few companies are allowed to participate, proof-of-work doesn’t make sense any more. You’re left with a system where a set of identified (rather than pseudonymous) parties maintain a shared ledger, keeping tabs on each other so that no single party controls the database. What is it about a blockchain that makes this any better than using a regular replicated database?

Supporters argue that the blockchain’s crypto, including signatures and hash pointers, is what distinguishes a private blockchain from a vanilla shared database. The crypto makes the system harder to tamper with and easier to audit. But these aspects of the blockchain weren’t Bitcoin’s innovation! In fact, Satoshi tweaked them only slightly from the earlier research that he cites in his whitepaper research by Haber and Stornetta going all the way back to 1991!

Here’s my take on what’s going on:

  • It is true that adding signatures and hash pointers makes a shared database a bit more secure. However, it’s qualitatively different from the level of security, irreversibility, and censorship-resistance you get with the public blockchain.
  • The use of these crypto techniques for building a tamper-resistant database has been known for 25 years. At first there wasn’t much impetus for Wall Street to pay attention, but gradually there has arisen a great opportunity in moving some types of financial infrastructure to an automated, cryptographically secured model.
  • For banks to go this route, they must learn about the technology, get everyone to the same table, and develop and deploy a standard. The blockchain conveniently solves these problems due to the hype around it. In my view, it’s not the novelty of blockchain technology but rather its mindshare that has gotten Wall Street to converge on it, driven by the fear of missing out. It’s acted as a focal point for standardization.
  • To build these private blockchains, banks start with the Bitcoin Core code and rip out all the parts they don’t need. It’s a bit like hammering in a thumb tack, but if a hammer is readily available and no one’s told you that thumb tacks can be pushed in by hand, there’s nothing particularly wrong with it.

Thanks to participants at the Bitcoin Pacifica gathering for helping me think through this question. can use your DNA to target ads

With the reduction in costs of genotyping technology, genetic genealogy has become accessible to more people. Various websites such as offer genetic genealogy services. Users of these services are mailed an envelope with a DNA collection kit, in which users deposit their saliva. The users then mail their kits back to the service and their samples are processed. The genealogy company will try to match the user’s DNA against other users in its genealogy and genetic database. As these services become more popular, we need more public discourse about the implications of releasing our genetic information to commercial enterprises.

Given that genetic information can be very sensitive, I found that the privacy policy of Ancestry’s DNA services has some surprising disclosures about how they could use your genetic information.

Here are some excerpts with the worrying parts in bold:

Subject to the restrictions described in this Privacy Statement and applicable law, we may use personal information for any reasonable purpose related to the business, including to communicate with you, to provide you information about Ancestry’s and AncestryDNA’s products and services, to respond to your requests, to update our product offerings, to improve the content and User experience on the AncestryDNA Website, to help you and others discover more about your family, to let you know about offers of interest from AncestryDNA or Ancestry, and to prepare and perform demographic, benchmarking, advertising, marketing, and promotional studies.

To distribute advertisements: AncestryDNA strives to show relevant advertisements. To that end, AncestryDNA may use the information you provide to us, as well as any analyses we perform, aggregated demographic information (such as women between the ages of 45-60), anonymized data compared to data from third parties, or the placement of cookies and other tracking technologies… In these ways, AncestryDNA can display relevant ads on the AncestryDNA Website, third party websites, or elsewhere.

The privacy policy gives Ancestry permission to use its users’ genetic information for advertising purposes. When I inquired with Ancestry, they pointed to the following part of their privacy policy:

We do not provide advertisers with access to individual account information. AncestryDNA does not sell, rent or otherwise distribute the personal information you provide us to these advertisers unless you have given us your consent to do so.

However, it is not clear how your personal information can be used to display “relevant ads” unless either Ancestry operates as an ad network itself or Ancestry communicates some personal information to third party advertisers in order to target the ads. Below, I expand on concerns raised by this privacy policy:

Users may “consent” to the use of their genetic data unknowingly. The privacy policy says Ancestry can distribute users’ private information if Ancestry gets permission first. That permission could be granted by a dialog that users click through without much thought. Research has shown that users are already desensitized to privacy and security warnings.

Even if only Ancestry is using the personal information to target ads, the data might accidentally find its way to third parties. Researchers have demonstrated how it can be difficult to avoid information leakage through URLs or cookies or more sophisticated attacks. If Ancestry categorizes its users according to their genetic traits and then stores and transfers these categories in cookies and URL parameters (a common practice for the analogous “behavioral segment” categories used for many targeted ads), then the genetic data can easily leak to third parties.

The genetic data collected by these services may endanger the privacy of users and their families. A genome is not something easily made unlinkable. Only 33 bits of entropy are necessary to uniquely identify a person. The DNA profiles used by law enforcement in the US today take samples from 13 location on the genome, and have about 54 bits of entropy. The test that Ancestry uses samples 700,000 locations on the genome, which will likely have much more than 33 bits of entropy. In fact, I believe this is enough entropy to compromise not only an individual’s privacy, but also the privacy of family members. With the 13 CODIS locations, law enforcement can already do familial searches for close family members. I hope to touch on the familial aspects of DNA privacy at a later date. The compromise of familial privacy is in part what makes collecting and distributing DNA even more sensitive that just collecting an individual’s full name or address.

Genetic data can be used to discriminate against people on the basis of characteristics they cannot control. More than identity, DNA data may allow someone to infer behavior and health attributes. Major concerns about the impact of genetic information on employment and health insurance led Congress to pass the Genetic Information Nondiscrimination Act, which makes it illegal to use genetics to decide hiring or health insurance pricing. However, GINA may not effectively deter people who 1) are not employers or insurers (e.g., landlords discriminating in their choice of tenants, which is prohibited by California state law but not by the federal provisions in GINA); 2) do not believe they will be caught; or 3) are not aware that they are discriminating, as discussed next.

Unintentional discrimination may occur. The big data report from the White House warns that the “increasing use of algorithms to make eligibility decisions must be carefully monitored for potential discriminatory outcomes for disadvantaged groups, even absent discriminatory intent.” An algorithm that takes genetic information as an input likely will lead to results that differ based on genes. This outcome already discriminates on the basis of genetics, and because genes are correlated with other sensitive attributes, it can also discriminate on the basis of characteristics such as race or health status. The discrimination occurs whether or not the algorithm’s user intended it.