June 29, 2022

Archives for May 2005

Dissecting the Witty Worm

A clever new paper by Abhishek Kumar, Vern Paxson, and Nick Weaver analyzes the Witty Worm, which infected computers running certain security software in March 2004. By analyzing the spray of random packets Witty sent around the Internet, they managed to learn a huge amount about Witty’s spread, including exactly where the worm was injected into the net, and which sites might have been targeted specially by the attacker. They can track, with some precision, who infected whom as Witty spread.

They did this using data from a “network telescope”. The “telescope” sits in a dark region of the Internet: a region containing 0.4% of all valid IP addresses but no real computers. The telescope records every single network packet that shows up in this dark region. Since there are no ordinary computers in the region, any packets that do show up must be sent in error, or sent to a randomly chosen address.

Witty, like many worms, spread by sending copies of itself to randomly chosen IP addresses. An infected machine would send 20,000 copies of the worm to random addresses, then do a little damage to the local machine, then send 20,000 more copies of the worm to random addresses, then do more local damage, and so on, until the local machine died. When one of the random packets happened to arrive at a vulnerable machine, that machine would be infected and would join the zombie army pumping out infection packets.

Whenever an infected machine happened to send an infection packet into the telescope’s space, the telescope would record the packet and the researchers could deduce that the source machine was infected. So they could figure out which machines were infected, and when each infection began.
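
To get a feel for how quickly the telescope would notice a newly infected machine, here is a rough back-of-the-envelope sketch in Python. It assumes that each infection packet goes to an address chosen uniformly at random, so that it lands in the telescope with probability 0.004 (the 0.4% figure above). The function names are made up for this illustration, and the real worm's address selection was not perfectly uniform, so treat the numbers as approximate.

```python
# Assumption: every packet independently lands in the telescope with
# probability 0.004 (the telescope's 0.4% share of addresses).
TELESCOPE_FRACTION = 0.004
BATCH_SIZE = 20_000  # packets sent between rounds of local damage


def expected_hits(packets, fraction=TELESCOPE_FRACTION):
    """Expected number of packets that fall inside the telescope."""
    return packets * fraction


def prob_unseen(packets, fraction=TELESCOPE_FRACTION):
    """Probability that none of the packets falls inside the telescope."""
    return (1.0 - fraction) ** packets


def expected_packets_until_first_hit(fraction=TELESCOPE_FRACTION):
    """Mean of a geometric distribution: packets sent up to the first hit."""
    return 1.0 / fraction


print(expected_hits(BATCH_SIZE))           # hits per batch of 20,000
print(prob_unseen(1000))                   # chance of staying invisible
print(expected_packets_until_first_hit())  # average packets before detection
```

Under these assumptions, roughly 80 packets out of every batch of 20,000 reach the telescope, the first one arrives after about 250 packets on average, and the chance that a machine has sent 1,000 packets without being seen is under 2%. That is why the first packet recorded from a machine is a good estimate of when its infection began.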

Even better, they realized that infected machines were generating the sequence of “random” addresses to attack using something called a Linear Congruential PseudoRandom Number Generator (LCPRNG). This is a deterministic procedure that is sometimes used to crank out a sequence of numbers that looks random, but isn’t really random in the sense that coin-flips are random. Indeed, an LCPRNG has the property that if you can observe its output, then you can predict which “random” numbers it will generate in the future, and you can even calculate which “random” numbers it generated in the past. Now here’s the cool part: the infection packets arriving at the telescope contained “random” addresses that were produced by an LCPRNG, so the researchers could reconstruct the exact state of the LCPRNG on each infected machine. And from that, they could reconstruct the exact sequence of infection attempts that each infected machine made.
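
Here is a minimal Python sketch of why a linear congruential generator can be run both forward and backward once you know its state. The multiplier and increment below are those of a widely used generator and are picked only for illustration; this post does not say which constants Witty used. The sketch also assumes the observer sees the generator's entire state, which is a simplification of what the researchers had to work with. The helper names are hypothetical.

```python
# Illustrative constants (assumed, not taken from the post).
M = 2**32
A = 214013
C = 2531011

# A is odd, so it has a multiplicative inverse modulo 2**32.
# pow(A, -1, M) requires Python 3.8 or later.
A_INV = pow(A, -1, M)


def step_forward(state):
    """Next state: x -> (A*x + C) mod M."""
    return (A * state + C) % M


def step_backward(state):
    """Previous state: undo the step by solving for x."""
    return (A_INV * (state - C)) % M


def run_forward(state, count):
    """The next `count` states after `state`, in order."""
    out = []
    for _ in range(count):
        state = step_forward(state)
        out.append(state)
    return out


def run_backward(state, count):
    """The `count` states before `state`, most recent first."""
    out = []
    for _ in range(count):
        state = step_backward(state)
        out.append(state)
    return out


# The "infected machine" generates five values from a secret seed.
secret_seed = 0x12345678
states = [secret_seed] + run_forward(secret_seed, 4)

# The observer sees only the middle value, yet recovers the rest.
observed = states[2]
print(run_forward(observed, 2) == states[3:5])               # the future
print(run_backward(observed, 2) == [states[1], states[0]])   # the past
```

Both comparisons print True: a single observed state is enough to recover everything the generator produced before and after it, which is the property the researchers exploited.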

Now they knew pretty much everything there was to know about the spread of the Witty worm. They could even reconstruct the detailed history of which machine infected which other machine, and when. This allowed them to trace the infection back to its initial source, “Patient Zero”, which operated from a particular IP address owned by a “European retail ISP”. They observed that Patient Zero did not follow the usual infection rules, meaning that it was running special code designed to launch the worm, apparently by spreading the worm to a “hit list” of machines suspected to be vulnerable. A cluster of machines on the hit list happened to be at a certain U.S. military installation, suggesting that the perpetrator had inside information that machines at that installation would be vulnerable.

The paper goes on from there, using the worm’s spread as a natural experiment on the behavior of the Internet. Researchers often fantasize about doing experiments where they launch packets from thousands of random sites on the Internet and measure the packets’ propagation, to learn about the Internet’s behavior. The worm caused many machines to send packets to the telescope, making it a kind of natural experiment that would have been very difficult to do directly. Lots of useful information about the Internet can be extracted from the infection packets, and the authors proceeded to deduce facts about the local area networks of the infected machines, how many disks they had, the speeds of various network bottlenecks, and even when each infected machine had last been rebooted before catching the worm.

This is not a world-changing paper, but it is a great example of what skilled computer scientists can do with a little bit of data and a lot of ingenuity.

On a New Server

This site is on the new server now, using WordPress. Please let me know, in the comments, if you see any problems.

About This Site

Ed Felten says:

Hi, I’m Ed Felten. In my day job, I’m a Professor of Computer Science and Public Affairs at Princeton University, and Director of Princeton’s Center for InfoTech Policy.

Alex Halderman says:

Hi, I’m J. Alex Halderman. In my afternoon and night job, I’m a graduate student in Computer Science at Princeton University.

Dan Wallach says:

Hi, I’m Dan Wallach. During the day, I’m an associate professor in the department of computer science at Rice University. Back in the day, I got my PhD at Princeton working for Ed. These days, I’m spending most of my time working on electronic voting security.

We, and the other authors listed in the sidebar, write this weblog. The focus is on issues related to legal regulation of technology, and especially on legal attempts to restrict the right of technologists and citizens to tinker with technological devices. But we reserve the right to write about anything that strikes our fancy.

Needless to say, we speak only for ourselves. Nothing we write here is endorsed by our employers, our fellow contributors on this blog, or by anyone else except the author. Even we are not too sure about some of this stuff. Posts by others, including our fellow bloggers, guest bloggers and other contributors, reflect their opinions, not necessarily ours.

We welcome comments, suggestions, and polite argumentation. If you send us an email about something we’ve written here, we’ll assume (unless you tell us otherwise) that we have your permission to quote your message on the site. Or you can post a comment to the site yourself.

Material in the Comments section is contributed by others. We can’t vouch for its accuracy and it doesn’t necessarily reflect our opinions. We reserve the right to remove comments that are clearly off-topic or highly offensive; but otherwise we’ll leave the comments alone.

(We also use automated tools to fight comment spam. When these tools see indications of spamminess in a comment – according to whatever criteria the tools’ authors chose to use – they will remove a comment or hold it for human inspection. We look at the held comments periodically and release any that are not spam. If your comments seem to disappear or be mysteriously delayed for hours, this is probably the explanation. We apologize for any inconvenience, but we have found automated anti-spam tools necessary given the volume of comment spam we face.)

Unless noted otherwise, the author of each post owns the copyright on that post. (Commenters may own the copyright on their comments – ask a copyright lawyer – but we assume that commenters give our readers permission to redistribute or use their comments under the same terms that apply to our material on which they are commenting.) Everything else that is copyrightable is copyrighted by Edward W. Felten, J. Alex Halderman, and Dan S. Wallach. Thanks to the Sonny Bono Copyright Term Extension Act of 1998, our copyrights on this site will expire early in the 22nd century.

Creative Commons License
Unless noted otherwise, material on Freedom to Tinker is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.