December 21, 2024

A Freedom-of-Speech-based Approach To Limiting Filesharing – Part II: The Block List

On Wednesday we discussed the open structure of filesharing and its resulting vulnerability to spam. While there are some similarities between e-mail and gnutella spam, the spoof files have no analogue in e-mail. When MediaDefender puts up spoofs for Rihanna’s Disturbia, unless you are using gnutella to search for Disturbia – which you cannot legally do – the spam has no effect on you. But of course, if MediaDefender were allowed to keep doing this successfully, gnutella would lose much of its appeal.

The solution that has traditionally been adopted is an IP block list. When MediaDefender puts up spoof files, they come from the IP addresses of MediaDefender’s computers. While it is possible that MediaDefender could (and doubtless would have to) get several computers to perform the spoofing, they are all accessing the internet through a single ISP. Therefore, when an ISP is found to be hosting a spoofing operation such as MediaDefender’s, the entire range of IP addresses owned by the ISP is added to the filesharing program’s IP block list. When an IP address is on the block list, other computers will refuse to connect to it, thereby preventing it from filesharing.
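To make the mechanism concrete, here is a minimal sketch of how a client might check a peer against a range-based block list. It is not taken from any real P2P client: the one-range-per-line file format, the function names `parse_block_list` and `is_blocked`, and the address ranges (reserved documentation ranges) are all assumptions made for illustration.

```python
import ipaddress


def parse_block_list(text):
    """Parse a hypothetical block list: one CIDR range per line.

    Blank lines and anything after '#' are ignored.
    """
    networks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            networks.append(ipaddress.ip_network(line))
    return networks


def is_blocked(address, networks):
    """Return True if the address falls inside any blocked range."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in networks)


# Illustrative data only: these are reserved documentation ranges,
# standing in for the address space of an ISP hosting a spoofer.
BLOCK_LIST = """
# ranges of a hypothetical ISP hosting a spoofing operation
203.0.113.0/24
198.51.100.0/24
"""

networks = parse_block_list(BLOCK_LIST)

# A client would run a check like this before accepting a connection.
for peer in ("203.0.113.45", "192.0.2.10"):
    action = "refuse" if is_blocked(peer, networks) else "accept"
    print(peer, action)
```

Blocking whole ranges rather than single addresses is what makes the list cheap to maintain, and also what makes it overbroad: every other customer in the same range is refused as well.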

Because filesharing becomes useless without something to stop spoof files, IP block lists are a common part of P2P sharing programs. Generally, they are posted on web sites and downloaded by the P2P program, at the direction of the user. The program is generally configurable to download the block list from a site of the user’s choosing, and the block list file is stored in a known location and is readable and editable by interested users. For example, this forum discussion describes how to download the block file for the P2P client eMule.

What is not broadly appreciated is the role that LimeWire the corporation plays in the gnutella network. LimeWire is not merely a provider of software (there are non-LimeWire gnutella clients, though none is as popular as LimeWire). LimeWire’s client software, aside from supporting the gnutella protocol, receives from LimeWire a cryptographically signed file, called simpp.xml. This file contains a number of parameters for the operation of the client, including its IP block list. Because of the strong cryptographic signing by LimeWire corporation, no one else may send the list. LimeWire can therefore, at its sole discretion, block hosts from sending data to essentially all of its clients. Anyone putting up files that LimeWire deems unsuitable is knocked off in a matter of hours, and, since LimeWire is by far the most popular gnutella client, the spoofer is effectively shut down.

The LimeWire P2P clients are unusual in that there is nothing configurable about the choice of block list. Moreover, unlike other programs, there is no way for anyone other than LimeWire to send it, and no way for a non-technical user to examine its contents – in fact, the typical non-technical user would not even know that blocking is going on. (The only way to turn off blocking is on an advanced configuration panel.)

(One other interesting feature is also revealed from looking at the simpp.xml file: LimeWire has added a facility that allows its server, and only its server, to contact a running LimeWire client and ask it various questions about what the client is doing. This feature allows LimeWire to phone up LimeWire clients and inspect them, thereby gathering information about its network. This feature could be used as a sort of mini-spyware, though it is not clear exactly what LimeWire does with it.)

Tomorrow we shall see one way to interpret the legal significance of these behaviors on LimeWire corporation’s part.

Comments

  1. I would say that downloading copyrighted content should be considered illegal.
    Peter from senuke guide.

  2. IP block list is a great solution, but how can they block millions of IPs?
    DD

  3. You might answer the question as to why a legal service would need to use filesharing at all, since if you are paying for content there is enough revenue to cover bandwidth costs. That is how iTunes works and it works fine. I agree with what you said.
    tiffany

    • Anonymous says

      Bandwidth is not cheap, and the electricity and hardware costs of a datacenter are also quite high. If customers must pay for something, it must work: if iTunes has a problem and the download aborts, you shouldn’t lose the file you paid for. If iTunes is down for a day, the repercussions can be far more severe than for a free service. Credit card companies also tend to charge an arm and a leg for each purchase, which means that running the pay barrier itself is expensive (iTunes has a special deal; otherwise it would be losing money on each purchase).

      However, the bulk of each iTunes purchase goes to the labels, which don’t do much that is useful.

      A P2P system, however, wouldn’t have most of those costs. Although most still pay for a datacenter and the associated costs, it is possible to make the system entirely distributed.

  4. IP block lists are notoriously ineffective (and overbroad) mechanisms for controlling spam, and there’s no reason to think they will be any better where P2P spoofing is concerned. (After all, last I checked, it was pretty easy to spot spoofs in search results thanks to the large numbers of servers and identical copies that spoofers used. And, of course, spoofing costs money, which means the labels tend to do it only for a tiny fraction of the catalog.)

    Real resistance to spoofs comes from having a hash that users are willing to vouch for. And there is no reason that this mechanism needs to be built into a file sharing application, rather than provided by independent third parties (see, e.g., the metadata collected at MusicBrainz). That’s part of the reason Bit Torrent has been so successful — the hash checking plus community vouching makes it very hard to spoof.

    I think you can expect more developments in this direction, rather than in IP block lists.

    • Controlling e-mail spam is rather different from controlling P2P spoof files, because they are done by different parties and in different ways. As I indicated in the last post, they have different goals.

      IP block lists did work for e-mail spam back in the early days, when most spam originated from open relays. In response, spammers created botnets by infecting computers with viruses. The spam was thereby distributed, defeating the block lists. MediaDefender does not do this sort of thing with its spoofs, which is why the P2P block list is so effective against it.

      Overbroadness doesn’t matter from LimeWire’s point of view. They are happy to throw ISPs off, especially small ones that host only commercial clients. The real users are not on those sorts of ISPs; they’re at college campuses and residences. LimeWire is happy to lose 1% of its users if that is what it takes to keep the spoof files off.

      Community vouching does work, so long as it is not under sustained attack from cheaters of various sorts. I think it would be an interesting research problem to build an open, scalable voting system that would be immune from botnets or other forms of ballot box stuffers. I tend to be pessimistic about it, but I’d be happy to be proven wrong. At any rate, it isn’t what is done now by LimeWire.

      • Anonymous says

        Past tense ought to be used when discussing MediaDefender

        • They are still on LimeWire’s and others’ block lists, so their service wouldn’t be too effective. However, as a company they are still in business and their site still lists them as offering the spoofing service.

  5. I almost always find this blog interesting and informative, and this investigation into a particular P2P implementation has already been especially enlightening to me. You’re not even done, and I feel much more informed about this topic. Thank you. I look forward to the rest.

  6. search for Disturbia – which you cannot legally do

    This may be a bit pedantic and IANAL, but I don’t think actually searching for unauthorized content is illegal, it’s what you do after you find it that could be. I question whether any court has found the act of downloading unauthorized content to be illegal either. All the court cases that I have read about have been against people whose computers have been uploading or sharing such content, and while the RIAA usually mention downloading in their complaints it’s the sharing that they’re actually prosecuting. They don’t seem to have been targeting downloaders who don’t share, even though it might be easier for them to find downloaders (by sharing the tracks themselves and recording the IP addresses of everyone who downloads from them). While the default setting for many P2P clients might be to automatically share everything they download, we should be careful to distinguish between the two when discussing the issue.

    • Yes, you’re right – the mere searching is not illegal. Of course, there’s little or no reason to search for something except to download it.

      As to the question of whether downloading content is in and of itself illegal, I can just say this: I am not a lawyer, so take this with a grain of salt, but here’s how I would analyze it. Everyone would certainly agree that when downloading happens, an illegal activity took place. The question is: do you think that the court is likely to hold that 100% of the liability is with the uploader and 0% with the downloader? This seems an unlikely outcome.

      It is true that when you join gnutella you are in a position to be uploading many, many times, and therefore whatever liability you have could be multiplied many times over. That would explain why the court cases inevitably involve people who are uploading.

      • See Is Downloading Illegal?.

        It is possible to download something that does not involve copying a file, e.g. an image from a webcam or Mandelbrot function.

        It is also possible to download GPL software without breaking any law.

        Also consider the curious difference between downloading that may be legally classified as ‘streaming’ and downloading that may be legally classified as ‘receiving delivery of a file’.

        • The original statement in the text was about Rihanna’s song Disturbia. This is most definitely covered by copyright (and is not GPL!)

      • Anonymous says

        This is strongly location dependent. AFAIK no western court has ever ruled that downloading a copyrighted file is itself illegal, but uploading is. In some jurisdictions downloading for personal use is perfectly legal, and in a few countries uploading for personal use is also OK.
        This has to do with exactly how the law is written and interpreted, but in most cases it’s distribution that is illegal, and downloading is usually not seen as distribution.

      • Anonymous says

        A friend of mine, as part of a university course, wrote a bot to search one of the filesharing networks and do some measurements with the results. I have searched for things in filesharing networks with no intention of downloading them, just for the curiosity of seeing if anyone had shared them and what other people had commented on the files. I would not say there is little or no reason for searching with no intention of downloading.

  7. I wouldn’t be so quick to assume that current file-sharing (distributed systems) technologies are the state of the art and therefore to make far reaching deductions from the nature of the current implementations.

    This is because the state of the art that would by now have developed cannot develop in a legal environment that is antipathetic toward it. Who will fund the development of such systems if the developers are liable to prosecution? Many of the design decisions of the current implementations are directed by legal rather than technical considerations.

    A legal file-sharing system that must check licenses or pay fees each time a file is replicated or distributed is akin to a car that must be preceded by a pedestrian carrier of a red flag.

    • Perhaps I am not understanding what you are saying but let me respond thus: my view is that the current structure of gnutella was set up *specifically* to permit the illegal exchange of files while diffusing the legal responsibility. I disagree that a legal filesharing system would be like having a person carrying a flag in front of a car. It could, for example, be implemented as a minor modification of bittorrent. (You pay some amount to get access to the tracker on a central server; the file itself comes from peers not the central server.)

      But you might answer the question as to why a legal service would need to use filesharing at all, since if you are paying for content there is enough revenue to cover bandwidth costs. That is how iTunes works and it works fine.

      The reason these things don’t exist is that the record labels have no interest in doing them (they even fought iTunes), not that they are hard to do or inefficient.

      • My hasty comment could have been clearer.

        I’m suggesting that the state of the art/knowledge/science of distributed systems is considerably in advance of current implementations, because there’s little money and a lot of danger involved in contributing to any implementation. Remove that legal/moral stigma, and the technology progresses far more rapidly. It is a testament to mankind’s insatiable pursuit of progress that it’s happening anyway, thanks to outlaws prepared to operate with far less funding and reward than they should otherwise enjoy.

        The flaws and shortcomings you observe in current implementations do not constitute intrinsic limitations of distributed systems technology per se. In other words, if there were no legal consequences of developing systems to efficiently distribute files whilst ensuring their integrity, then the idea of discrete server computers connected by a network would by now be a quaint notion akin to punched cards. The Internet would today behave and be properly perceived as a pure distributed system, with no notion of copying or downloading files or purchasing MP3s from iTunes. Instead of the anachronistic notion of paying publishers for copies we’d be paying musicians for their music.

        Mitch, there is no spoon. There is no copy. There is only art, design, knowledge.