June 25, 2018

Archives for March 2006

NYU/Princeton Spyware Workshop Liveblog

Today I’m at the NYU/Princeton spyware workshop. I’ll be liveblogging the workshop here. I won’t give you copious notes on what each speaker says, just a list of things that strike me as interesting. Videos of the presentations will be available on the net eventually.

I gave a basic tutorial on spyware last night, to kick off the workshop.

The first panel today is officially about the nature of the spyware problem, but it’s shaping up as the law enforcement panel. The first speaker is Mark Eckenwiler from the U.S. Department of Justice. He is summarizing the various Federal statutes that can be used against spyware purveyors, including statutes against wiretapping and computer intrusions. One issue I hadn’t heard before involves how to prove that a particular spyware purveyor caused harm, if the victim’s computer was also infected with lots of other spyware from other sources.

Second speaker is Eileen Harrington of the Federal Trade Commission. The FTC has two main roles here: to enforce laws, especially relating to unfair and deceptive business practices, and to run hearings and study issues. In 1995 the FTC ran a series of hearing on online consumer protection, which identified privacy as important but didn’t identify spam or spyware. In recent years their focus has shifted more toward spyware. FTC enforcement is based on three principles: the computer belongs to the consumer; disclosure can’t be buried in a EULA; and software must be reasonably removable. These seem sensible to me. She recommends a consumer education website created by the FTC and other government agencies.

Third speaker is Justin Brookman of the New York Attorney General’s office. To them, consent is the biggest issue. He is skeptical of state spyware laws, saying they are often too narrow and require high level of intent to be proven for civil liability. Instead, they enforce based on laws against deceptive business practices and false advertising, and on trespass to chattels. They focus on the consumer experience, and don’t always need to dig very deeply into all of the technical details. He says music lyric sites are often spyware-laden. In one case, a screen saver came with a 188-page EULA, which mentioned the included adware on page 131. He raises the issue of when companies are responsible for what their “affiliates” do.

Final speaker of the first panel is Ari Schwartz of CDT, who runs the Anti-Spyware Coalition. ASC is a big coalition of public-interest groups, companies, and others to build consensus around a definition of spyware and principles for dealing with it. The definition problem is both harder and more important than you might think. The goal was to create a broadly accepted definition, to short-circuit debates about whether particular pieces of unpleasant software are or are not spyware. He says that many of the harms caused by software are well addressed by existing law (identity theft, extortion, corporate espionage, etc.), but general privacy invasions are not. In what looks like a recurring theme for the workshop, he talks about how spyware purveyors use intermediaries (“affiliates”) to create plausible deniability. He shows a hair-raising chain of emails obtained in discovery in an FTC case against Sanford Wallace and associates. This was apparently an extortion-type scheme, where extreme spyware was locked on to a user’s computer, and the antidote was sold to users for $30.

Question to the panel about what happens if the perpetrator is overseas. Eileen Harrington says that if there are money flows, they can freeze assets or sometimes get money repatriated for overseas. The FTC wants statutory changes to foster information exchange with other governments. Ari Schwartz says advertisers, ad agencies, and adware makers are mostly in the U.S. Distribution of software is sometimes from the U.S., sometimes from Eastern Europe, former Soviet Union, or Asia.

Q&A discussion of how spyware programs attack each other. Justin Brookman talks about a case where one spyware company sued another spyware company over this.

The second panel is on “motives, incentives, and causes”. It’s two engineers and two lawyers. First is Eric Allred, an engineer from Microsoft’s antispyware group. “Why is this going on? For the money.”

Eric talks about game programs that use spyware tactics to fight cheating code, e.g. the “warden” in World of Warcraft. He talks about products that check quality of service or performance provided by, e.g., network software, by tracking some behaviors. He thinks this is okay with adequate notice and consent.

He takes a poll of the room. Only a few people admit to having their machines infected by spyware – I’ll bet people are underreporting. Most people say that friends have caught spyware.

Second speaker is Markus Jakobsson, an engineer from Indiana University and RavenWhite. He is interested in phishing and pharming, and the means by which sites can gather information about you. As a demonstration, he says his home page tells you where you do your online banking.

He describes an experiment they did that simulated phishing against IU students. Lots of people fell for it. Interestingly, people with political views on the far left or far right were more likely to fall for it than people with more moderate views. The experimental subjects were really mad (but the experiment had proper institutional review board approval).

“My conclusion is that user education does not work.”

Third is Paul Ohm, a law professor at Colorado. He was previously a prosecutor at the DOJ. He talks about the “myth of the superuser”. (I would have said “superattacker”.) He argues that Internet crime policy is wrongly aimed to stop the superuser.

What happens? Congress writes prohibitions that are broad and vague. Prosecutors and civil litigants use the broad language to pursue novel theories. Innocent people get swept in.

He conjectures that most spyware purveyors aren’t technological superuser. In general, he argues that legislation should focus on non-superuser methods and harms.

He talks about the SPYBLOCK Act language, which bans certain actions, if done with certain bad intent. “The FBI agent stops reading after the list of actions.”

Fourth is Marc Rotenberg from EPIC. His talk is structured as a list of observations, presented in random order. I’ll repeat some of them here. (1) People tend to behave opportunistically online – extract information if you can. (2) “Spyware is a crime of architectural opportunity.” (3) Motivations for spyware: money, control, exploitation, investigation.

He argues that cookies are spyware. This is a controversial view. He argues for reimagining cookies or how users can control them.

Q&A session begins. Alex asks Paul Ohm whether it makes sense in the long run to focus on attackers who aren’t super, given that attackers can adapt. Paul says, first, that he hopes technologists will help stop the superattackers. (The myth of the super-defender?) He advocates a more incremental and adaptive approach to drafting the statutes; aim at the 80% case, then adjust every few years.

Question to Marc Rotenberg about what can be done about cookies. Marc says that originally cookies contained, legibly, the information they represented, such as your zip code. But before long cookies morphed into unique identifiers, opaque to the user. Eric Allred points out that the cookies can be strongly, cryptographically opaque to users.

The final session is on solutions. Ben Edelman speaks first. He shows a series of examples of unsavory practices, relating to installation without full consent and to revenue sources for adware.

He shows a scenario where a NetFlix popup ad appears when a user visits blockbuster.com. This happened through a series of intermediaries – seven HTTP redirects – to pop up the ad. Netflix paid LinkShare, LinkShare paid Azoogle, Azoogle paid MyGeek, and MyGeek paid DirectRevenue. He’s got lots of examples like this, from different mainstream ad services.

He shows an example of Google AdSense ads popping up in 180solutions adware popup windows. He says he found 4600+ URLs where this happened (as of last June).

Orin Kerr speaks next. “The purpose of my talk is to suggest that there are no good ways for the law to handle the spyware problem.” He suggests that technical solutions are a better idea. A pattern today: lawyers want to rely more on technical solutions, technologists want to rely more on law.

He says criminal law works best when the person being prosecuted is clearly evil, even to a juror who doesn’t understand much about what happened. He says that spyware purveyors more often operate in a hazy gray area – so criminal prosecution doesn’t look like the right tool.

He says civil suits by private parties may not work, because defendants don’t have deep enough pockets to make serious suits worthwhile.

He says civil suits by government (e.g., the FTC) may not work, because they have weaker investigative powers than criminal investigators, especially against fly-by-night companies.

It seems to me that his arguments mostly rely on the shady, elusive nature of spyware companies. Civil actions may work against large companies that portray themselves as legitimate. So they may have the benefit of driving spyware vendors underground, which could make it harder for them to sell to some advertisers.

Ira Rubinstein of Microsoft is next. His title is “Code Signing As a Spyware Solution”. He describes (the 64-bit version of) Windows Vista, which will require any kernel-mode software to be digitally signed. This is aimed to stop rootkits and other kernel-mode exploits. It sounds quite similar to AuthentiCode, Microsoft’s longstanding signing infrastructure for ActiveX controls.

Mark Miller of HP is the last speaker. His talk starts with an End-User Listening Agreement, in which everyone in the audience must agree that he can read our minds and redistribute what he learns. He says that we’re not concerned about this because it’s infeasible for him to install hostile code into our brains.

He points out that the Solitaire program has the power to read, analyze or transmit any data on the computer. Any other program can do the same. He argues that we need to obey the principle of least privilege. It seems to me that we already have all the tools to do this, but people don’t do it.

He shows an example of how to stop a browser from leaking your secrets, by either not letting it connect to the Net, or not letting it read any local files. But even a simple browser needs to do both. This is not a convincing demo.

In the Q&A, Ben Edelman recommend’s Eric Howes’s web site as a list of which antispyware tools are legit and which are bogus or dangerous.

Orin Kerr is asked whether we should just give up on using the law. He says no, we should use the law to chip away at the problem, but we shouldn’t expect it to solve the problem entirely. Justin Brookman challenges Orin, saying that civil subpoenia power seems to work for Justin’s group at the NY AG office. Orin backtracks slightly but sticks to his basic point that spyware vendors will adapt or evolve into forms more resistant to enforcement.

Alex asks Orin how law and technology might work together to attack the problem. Orin says he doesn’t see a grand solution, just incremental chipping away at the problem. Ira Rubinstein says that law can adjust incentives, to foster adoption of better technology approaches.

And our day draws to a close. All in all, it was a very interesting and thought-provoking discussion. I wish it had been longer – which I rarely say at the end of this kind of event.

RFID Virus Predicted

Melanie Rieback, Bruno Crispo, and Andy Tanenbaum have a new paper describing how RFID tags might be used to propagate computer viruses. This has garnered press coverage, including a John Markoff story in today’s New York Times.

The underlying technical argument is pretty simple. An RFID tag is a tiny device, often affixed to a product of some sort, that carries a relatively small amount of data. An RFID reader is a larger device, often stationary, that can use radio signals to read and/or modify the contents of RFID tags. In a retail application, a store might affix an RFID tag to each item in stock, and have an RFID reader at each checkout stand. A customer could wheel a shopping cart full of items up to the checkout stand, and the RFID reader would determine which items were in the cart and would charge the customer and adjust the store’s inventory database accordingly.

Simple RFID tags are quite simple and only carry data that can be read or modified by readers. Tags cannot themselves be infected by viruses. But they can act as carriers, as I’ll describe below.

RFID readers, on the other hand, are often quite complicated and interact with networked databases. In our retail example, each RFID reader can connect to the store’s backend databases, in order to update the store’s inventory records. If RFID readers run complicated software, then they will inevitably have bugs.

One common class of bugs involves bad handling of unexpected or diabolical input values. For example, web browsers have had bugs in their URL-handling code, which caused the browsers to either crash or be hijacked when they encountered diabolically constructed URLs. When such a bug existed, an attacker who could present an evil URL to the browser (for example, by getting the user to navigate to it) could seize control of the browser.

Suppose that some subset of the world’s RFID readers had an input-processing bug of this general type, so that whenever one of these readers scanned an RFID tag containing diabolically constructed input, the reader would be hijacked and would execute some command contained in that input. If this were the case, an RFID-carried virus would be possible.

A virus attack might start with a single RFID tag carrying evil data. When a vulnerable reader scanned that tag, the reader’s bug would be triggered, causing the reader to execute a command specified by that tag. The command would reconfigure the reader to make it write copies of the evil data onto tags that it saw in the future. This would spread the evil data onto more tags. When any of those tags came in contact with a vulnerable reader, that reader would be infected, turning it into a factory for making more infected tags. The infection would spread from readers to new tags, and from tags to new readers. Before long many tags and readers would be infected.

To demonstrate the plausibility of this scenario, the researchers wrote their own RFID reader, giving it a common type of bug called an SQL injection vulnerability. They then constructed the precise diabolical data needed to exploit that vulnerability, and demonstrated that it would spread automatically as described. In light of this demo, it’s clear that RFID viruses can exist, if RFID readers have certain types of bugs.

Do such bugs exist in real RFID readers? We don’t know – the researchers don’t point to any – but it is at least plausible that such bugs will exist. Our experience with Web and Internet software is not encouraging in this regard. Bugs can be avoided by very careful engineering. But will engineers be so careful? Not always. We don’t know how common RFID viruses will be, but it seems likely they will exist in the wild, eventually.

Designers of RFID-based systems will have to engineer their systems much more carefully than we had previously thought necessary.

Discrimination, Congestion, and Cooperation

I’ve been writing lately about the nuts and bolts of network discrimination. Today I want to continue that discussion by talking about how the Internet responds to congestion, and how network discrimination might affect that response. As usual, I’ll simplify the story a bit to spare you a lengthy dissertation on network management, but I won’t mislead you about the fundamental issues.

I described previously how network congestion causes Internet routers to discard some data packets. When data packets arrive at a router faster than the router can forward those packets along the appropriate outgoing links, the packets will pile up in the router’s memory. Eventually the memory will fill up, and the router will have to drop (i.e., discard) some packets.

Every dropped packet has some computer at the edge of the network waiting for it. Eventually the waiting computer and its communication partner will figure out that the packet must have been dropped. From this, they will deduce that the network is congested. So they will re-send the dropped packet, but in response to the probable congestion they will slow down the rate at which they transmit data. Once enough packets are dropped, and enough computers slow down their packet transmission, the congestion will clear up.

This is a very indirect way of coping with congestion – drop packets, wait for endpoint computers to notice the missing packets and respond by slowing down – but it works pretty well.

One interesting aspect of this system is that it is voluntary – the system relies on endpoint computers to slow down when they see congestion, but nothing forces them to do so. We can think of this as a kind of deal between endpoint computers, in which each one promises to slow down if its packets are dropped.

But there is an incentive to defect from this deal. Suppose that you defect – when your packets are dropped you keep on sending packets as fast as you can – but everybody else keeps the deal. When your packets are dropped, the congestion will continue. Then other people’s packets will be dropped, until enough of them slow down and the congestion eases. By ignoring the congestion signals you are getting more than your fair share of the network.

Despite the incentive to defect, most people keep the deal by using networking software that slows down as expected in response to congestion. Why is this? We could debate the reasons, but it seems safe to say that there is a sort of social contract by which users cooperate with their peers, and software vendors cooperate by writing software that causes users to keep the deal.

One of the reasons users comply, I think, is a sense of fairness. If I believe that the burdens of congestion control fall pretty equally on everybody, at least in the long run, then it seems fair to me to slow down my own transmissions when my turn comes. One time I might be the one whose packets get dropped, so I will slow down. Another time, by chance, somebody else’s packets may be dropped, so it will be their turn to slow down. Everybody gets their turn.

(Note: I’m not claiming that the average user has thought through these issues carefully. But many software providers have made decisions about what to do, and those decisions factor in users’ wants and needs. Software developers act as proxies for users in making these decisions. Perhaps this point will get more discussion in the comments.)

But now suppose that the network starts singling out some people and dropping their packets first. Now the burden of congestion control falls heavily on them – they have to slow down and others can just keep going. Suddenly the I’ll-slow-down-if-you-do deal doesn’t seem so fair, and the designated victims are more likely to defect from the deal and just keep sending data even when the network tells them to slow down.

The implications for network discrimination should now be pretty clear. If the network discriminates by sending misleading signals about congestion, and sending them preferentially to certain machines or certain applications, the incentive for those machines and applications to stick to the social contract and do their share to control congestion, will weaken. Will this lead to a wave of defections that destroys the Net? Probably not, but I can’t be sure. I do think this is something we should think about.

We should also listen to the broader lesson of this analysis. If the network discriminates, users and applications will react by changing their behavior. Discrimination will have secondary effects, and we had better think carefully about what they will be.

[Note for networking geeks: Yes, I know about RED, and congestion signaling, non-TCP protocols that don’t do backoff, and so on. I hope you’ll agree that despite all of those real-world complications, the basic argument in this post is valid.]