June 27, 2017

Archives for May 2011

Overstock's $1M Challenge

As reported in Fast Company, RichRelevance and Overstock.com teamed up to offer up to a $1,000,000 prize for improving “its recommendation engine by 10 percent or more.”

If You Liked Netflix, You Might Also Like Overstock
When I first read a summary of this contest, it appeared they were following in Netflix’s footsteps right down to releasing user data sans names. This did not end well for Netflix’s users or for Netflix. Narayanan and Shmatikov were able to re-identify Netflix users using the contest dataset, and their research contributed greatly to Ohm’s work on de-anonymization. After running the contest a second time, Netflix terminated it early in the face of FTC attention and a lawsuit that they settled out of court.

This time, Overstock is providing “synthetic data” to contest entrants, then testing submitted algorithms against unreleased real data. Tag line: “If you can’t bring the data to the code, bring the code to the data.” Hmm. An interesting idea, but the announcement is short on details in exactly the places that raise the most concern. I look forward to getting the time to play with the system and dataset. The good news is that companies are recognizing privacy concerns and responding with something interesting and new. That is, at least, a move in the right direction.
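To make the "bring the code to the data" idea concrete, here is a minimal sketch in Python. It is not the contest's actual system: the session format, the item names, the popularity baseline, the hit-rate metric, and the function names (`make_popularity_recommender`, `hit_rate`, `evaluate_submission`) are all assumptions made up for illustration. The only point it shows is the split in roles: entrants see synthetic data, and the organizer runs their code against private data and hands back a single aggregate number.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

# A recommender maps a user's purchase history to a ranked list of item ids.
Recommender = Callable[[Sequence[str]], List[str]]

# Released to entrants: made-up records with the same shape as the real ones.
SYNTHETIC_SESSIONS: List[Dict] = [
    {"history": ["lamp"], "next_item": "rug"},
    {"history": ["sofa"], "next_item": "rug"},
    {"history": ["rug"], "next_item": "lamp"},
]

# Never released (hypothetical stand-in for the organizer's real data).
_PRIVATE_SESSIONS: List[Dict] = [
    {"history": ["sofa"], "next_item": "rug"},
    {"history": ["rug"], "next_item": "sofa"},
    {"history": ["lamp"], "next_item": "rug"},
    {"history": ["rug", "lamp"], "next_item": "sofa"},
]


def make_popularity_recommender(training_sessions: List[Dict]) -> Recommender:
    """Entrant side: build a trivial baseline from the synthetic data only."""
    counts: Counter = Counter()
    for session in training_sessions:
        counts.update(session["history"])
        counts[session["next_item"]] += 1
    ranked = [item for item, _ in counts.most_common()]

    def recommend(history: Sequence[str]) -> List[str]:
        seen = set(history)
        # Recommend the most popular items the user has not already bought.
        return [item for item in ranked if item not in seen]

    return recommend


def hit_rate(recommend: Recommender, sessions: List[Dict], k: int = 1) -> float:
    """Fraction of sessions whose actual next item is in the top-k list."""
    if not sessions:
        return 0.0
    hits = sum(
        1 for s in sessions if s["next_item"] in recommend(s["history"])[:k]
    )
    return hits / len(sessions)


def evaluate_submission(recommend: Recommender, k: int = 1) -> float:
    """Organizer side: run submitted code on private data, return one score."""
    return round(hit_rate(recommend, _PRIVATE_SESSIONS, k), 2)


if __name__ == "__main__":
    submission = make_popularity_recommender(SYNTHETIC_SESSIONS)
    print("score on private data:", evaluate_submission(submission))
```

Even in this toy version the sharp edges are visible: the returned score is itself a function of the private data, so an entrant who can submit many carefully chosen "recommenders" may be able to probe what the hidden records contain, and how closely the synthetic data mirrors the real data matters just as much.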

Place your bets now on which happens first: a contest winner with a 10% boost to sales, or researchers finding ways to re-identify at least 10% of the data?

Debugging Legislation: PROTECT IP

There’s more than a hint of theatrics in the draft PROTECT IP bill (pdf, via dontcensortheinternet ) that has emerged as son-of-COICA, starting with the ungainly acronym of a name. Given its roots in the entertainment industry, that low drama comes as no surprise. Each section name is worse than the last: “Eliminating the Financial Incentive to Steal Intellectual Property Online” (Sec. 4) gives way to “Voluntary action for Taking Action Against Websites Stealing American Intellectual Property” (Sec. 5).

Techdirt gives a good overview of the bill, so I’ll just pick some details:

  • Infringing activities. In defining “infringing activities,” the draft explicitly includes circumvention devices (“offering goods or services in violation of section 1201 of title 17”), as well as copyright infringement and trademark counterfeiting. Yet that definition also brackets the possibility of “no [substantial/significant] use other than ….” Substantial could incorporate the “merely capable of substantial non-infringing use” test of Betamax.
  • Blocking non-domestic sites. Sec. 3 gives the Attorney General a right of action over “nondomestic domain names”, including the right to demand remedies from (A) domain name system server operators, (B) financial transaction providers, (C) Internet advertising services, and (D) “an interactive computer service (def. from 230(f)) shall take technically feasible and reasonable measures … to remove or disable access to the Internet site associated with the domain name set forth in the order, or a hypertext link to such Internet site.”
  • Private right of action. Sec. 3 and Sec. 4 appear to be near duplicates (I say appear, because unlike computer code, we don’t have a macro function to replace the plaintiff, so the whole text is repeated with no diff), replacing nondomestic domain with “domain” and permitting private plaintiffs: “a holder of an intellectual property right harmed by the activities of an Internet site dedicated to infringing activities occurring on that Internet site.” Oddly, the statute doesn’t say the simpler “one whose rights are infringed,” so the definition must be broader. Could a movie studio claim to be hurt by the infringement of others’ rights, or MPAA enforce on behalf of all its members? Sec. 4 is missing (d)(2)(D).
  • WHOIS. The “applicable publicly accessible database of registrations” gets a new role as source of notice for the domain registrant, “to the extent such addresses are reasonably available.” (c)(1)
  • Remedies. The bill specifies injunctive relief only, not money damages, but threat of an injunction can be backed by the unspecified threat of contempt for violating one.
  • Voluntary action. Finally the bill leaves room for “voluntary action” by financial transaction providers and advertising services, immunizing them from liability to anyone if they choose to stop providing service, notwithstanding any agreements to the contrary. This provision jeopardizes the security of online businesses, making them unable to contract for financial services against the possibility that someone will wrongly accuse them of infringement. 5(a) We’ve already seen that it takes little to convince service providers to kick users off, in the face of pressure short of full legal process (see everyone vs Wikileaks, Facebook booting activists, and numerous misfired DMCA takedowns); this provision insulates that insecurity further.

In short, rather than “protecting” intellectual and creative industry, this bill would make it less secure, giving the U.S. a competitive disadvantage in online business. (Sorry, Harlan, that we still can’t debug the US Code as true code.)

Don't love the cyber bomb, but don't ignore it either

Cybersecurity is overblown – or not

A recent report by Jerry Brito and Tate Watkins of George Mason University titled “Loving The Cyber Bomb? The Dangers Of Threat Inflation In Cybersecurity Policy” has gotten a bit of press. This is an important topic worthy of debate, but I believe their conclusions are incorrect. In this posting, I’ll summarize their report and explain why I think they’re wrong.

Brito & Watkins (henceforth B&W) argue that the cyber threat is exaggerated, and that it’s being driven by private industry anxious to feed at the public trough, in a manner similar to the creation of the military industrial complex in the second half of the 20th century as an outgrowth of the Cold War.

The paper starts by describing how deliberate misinformation in the run-up to the Iraq war is an example of how public opinion can be manipulated by policy makers and private industry trying to sell a threat. My opinion of the Iraq war is not relevant to this discussion, but I believe they’re using it to create a strawman which they then knock down.

Next, B&W use the CSIS Commission Report on Cybersecurity for the 44th Presidency and Richard Clarke’s “Cyber War” to argue that the threat of cyber conflict has been overblown. With regard to the former, they criticize the confusion of probes (port scans) with real attacks, and argue that probes are not evidence of an attack or breach but more akin to doorknob rattling. While that’s certainly true (and an analogy that’s been made for years), if your doorknob is rattled thousands of times a day it’s a strong indication that you’re living in a bad neighborhood! They then note that there’s little unclassified proof of real threats, and hence the call for regulation by CSIS (and others) is inappropriate. Unfortunately, quantitative proof is hard to come by, but there are enough incidents that there can be little doubt as to the severity of the threat. Requiring quantitative data before we move to protection would be akin to demanding an open and accurate assessment of the number of foreign spies and the damage they do before we fund the CIA! Instead, we rely on experts in spycraft to assess the threat, and help define appropriate defenses. In the same way, we should rely on cybersecurity experts to provide an assessment of the risks and appropriate actions. I certainly agree with both CSIS and B&W that overclassification of the threats works to our detriment – if the public is unable to see the threat, it becomes hard to justify spending to defend against it. I’ve personally seen this in the commercial software industry, where the inability to provide hard data about cyber threats to senior management results in that threat being discounted, with consequent risk to businesses. But again, the problems with overclassification do not mean the problem doesn’t exist.

Regarding Clarke’s book, there’s been plenty of criticism of both technical inaccuracies and the somewhat hysterical tone. Those notwithstanding, Clarke generally has a good understanding of the types of threats and the risks. B&W’s claim that the only verifiable attacks are DDoS is simply untrue – there have been verified attacks against infrastructure like water systems, although some of the claimed attacks are other types of failures that could have been cyber-related, but aren’t. As an example, while Clarke claims that the northeast power blackout of 2003 was cyber-related, there’s adequate evidence that it was not – but there’s also adequate evidence that such an accidental failure could be caused by a deliberate attack. Similarly, the NYSE “flash crash” was not caused by a cyber attack, but demonstrates the fragility of modern highly computerized systems, and shows that a cyber attack could cause similar symptoms. That which can happen by accident can also happen intentionally, if an adversary desires.

As for B&W’s analogy to the military industrial complex that President Eisenhower so famously feared, and the increasing influence of cyberpork, I must reluctantly agree. Large defense contractors have, in recent years, flocked to cyber as it has become trendy and large budgets have become attractive, frequently more concerned with revenue than with solving problems. However, the problems existed (and were being discussed by researchers and practitioners) long before the influx of government contractors. The fact that they’re trying to make money off the problem doesn’t mean the problem doesn’t exist.

The final section of the paper, covering regulatory issues, has some good points, but it is so poisoned by the assumptions in the earlier sections of the paper that it’s hard to take seriously.

To summarize, we should distinguish between the existence of the problem (which is real and growing) and the desire of some government contractors to cash in – the fact that the latter is occurring does not negate the reality of the former.