April 24, 2024

Overstock's $1M Challenge

As reported in Fast Company, RichRelevance and Overstock.com teamed up to offer a prize of up to $1,000,000 for improving “its recommendation engine by 10 percent or more.”

If You Liked Netflix, You Might Also Like Overstock
When I first read a summary of this contest, it appeared they were following in Netflix’s footsteps right down to releasing user data sans names. This did not end well for Netflix’s users or for Netflix. Narayanan and Shmatikov were able to re-identify Netflix users using the contest dataset, and their research contributed greatly to Ohm’s work on de-anonymization. Netflix planned a second contest, then cancelled it in the face of FTC attention and a lawsuit that it settled out of court.

This time, Overstock is providing “synthetic data” to contest entrants, then testing submitted algorithms against unreleased real data. Tag line: “If you can’t bring the data to the code, bring the code to the data.” Hmm. An interesting idea, but short on details about the sharp edges that jump out as the biggest concerns. I look forward to having time to play with the system and dataset. The good news is that companies are recognizing privacy concerns and responding with something interesting and new. That is, at least, a move in the right direction.

Place your bets now on which happens first: a contest winner with a 10% boost to sales, or researchers finding ways to re-identify at least 10% of the data?

California to Consider Do Not Track Legislation

This afternoon the CA Senate Judiciary Committee gave proponents and opponents of SB 761 a brief time to speak about CA’s Do Not Track legislation. In general, the usual people said the usual things, with a few surprises along the way.

Surprise 1: repeated discussion of privacy as a Constitutional right. For those of us accustomed to privacy at the federal level, it was a good reminder that CA is a little different.

Surprise 2: TechNet compared limits on Internet tracking to Texas banning oil drilling, and claimed DNT is “not necessary” so legislation would be “particularly bad.” Is Kleiner still heavily involved in the post-Wade TechNet?

Surprise 3: the Chamber of Commerce estimated that DNT legislation would cost $4 billion in California, extrapolated from an MIT/Toronto study in the EU. Presumably they mean Goldfarb & Tucker’s Privacy Regulation and Online Advertising, which is in my queue to read. Comments on donottrack.us raise concerns. Even assuming a generous opt-out rate of 5% of CA Internet users, $4B sounds high compared with other estimates that value a user’s entire clickstream data at $5/month. I look forward to reading their paper, and to learning the Chamber’s methods of estimating CA based on Europe.
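
To make the back-of-envelope comparison concrete, here is a rough sketch of the arithmetic in Python. Only the 5% opt-out rate, the $5/month clickstream value, and the $4 billion figure come from the discussion above. The number of CA Internet users is a placeholder I picked purely for illustration, and I am assuming the $5/month applies per user and that the comparison is over one year (the Chamber did not say what period its figure covers).

```python
def annual_value_lost(internet_users: int, opt_out_percent: int,
                      value_per_user_month: int) -> int:
    """Yearly clickstream value, in dollars, tied to users who opt out."""
    opted_out_users = internet_users * opt_out_percent // 100
    return opted_out_users * value_per_user_month * 12


# ASSUMPTION: placeholder user count for illustration only, not a sourced figure.
CA_INTERNET_USERS = 25_000_000
OPT_OUT_PERCENT = 5                 # the "generous" opt-out rate from the text
VALUE_PER_USER_MONTH = 5            # dollars; assumed to be per user
CHAMBER_ESTIMATE = 4_000_000_000    # dollars; period not stated by the Chamber

estimate = annual_value_lost(CA_INTERNET_USERS, OPT_OUT_PERCENT,
                             VALUE_PER_USER_MONTH)
print(f"Back-of-envelope annual cost: ${estimate:,}")
print(f"Chamber estimate is roughly {CHAMBER_ESTIMATE / estimate:.0f}x larger")
```

With these placeholder inputs the sketch comes out to $75 million a year, far short of $4B. Swap in a better user count or a different per-user value to see how much the inputs would have to move to close that gap.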

Surprise 4: hearing about the problems of a chilling effect — for job growth, not for online use due to privacy concerns. Similarly, hearing frustrations about a text that says something “might” or “may” happen, with no idea what will actually transpire — about the text of the bill, not about the text of privacy policies.

On a 3 to 2 vote, the committee sent the bill to the next phase: the Appropriations Committee. Today’s vote was an interesting start.