November 27, 2020

Can P2P Vendors Block Porn or Copyrighted Content?

P2P United, a group of P2P software vendors, sent a letter to Congress last week claiming that P2P vendors are unable to redesign their software to block the transmission of pornographic or copyrighted material. Others have claimed that such blocking is possible. As a technical matter, who is right?

In this post I’ll look at what is technically possible. I’ll ignore the question of whether the law does, or should, require P2P software to be redesigned in this way. Instead, I’ll just ask whether it would be technologically possible to do so. To keep this post (relatively) short, I’ll omit some technical details.

I’ll read “blocking copyrighted works” as requiring a system to block the transmission of any particular work whose copyright owner has complained through an appropriate channel. The system would be given a “block-list” of works, and it would have to block transmissions of works that are on the list. The block-list would be lengthy and would change over time.

Blocking porn is harder than blocking copyrighted works. Copyright-blocking is looking for copies of a specific set of works, while porn-blocking is looking for a potentially infinite universe of pornographic material. Today’s image-analysis software is far, far too crude to tell a porn image from a non-porn one. Because porn-blocking is strictly harder than copyright-blocking, I’ll look only at copyright-blocking from here on. P2P United is correct when they say that they can’t block porn.

Today’s P2P systems use a decentralized architecture, with no central machine that participates in all transactions, so that any blocking strategy must be implemented by software running on end users’ computers. Retrofitting an existing P2P network with copyright-blocking would require blocking software to be installed, somehow, on the computers of that network’s users. It seems unlikely that an existing P2P software vendor would have both the right and the ability to force the necessary installation.

(The issues are different for newly created P2P protocols, where there isn’t an installed base of programs that would need to be patched. But I’ll spare you that digression, since such protocols don’t seem to be at issue in P2P United’s letter.)

This brings us to the next question: If there were some way to install blocking software on all users’ computers, would that software be able to block transmissions of works on the block-list? The answer is probably yes, but only in the short run. There are two approaches to blocking. Either you can ban searches for certain terms, such as the names of certain artists or songs, or you can scan the content of files as they are transmitted, and try to block files if their content matches one of the banned files.
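The two approaches can be sketched in a few lines. This is only an illustration, not how any real P2P client works: the names `BANNED_TERMS`, `BANNED_HASHES`, `search_allowed` and `transfer_allowed` are hypothetical, and matching files by SHA-256 digest is an assumption standing in for whatever content matching a real scanner would use.

```python
import hashlib

# Hypothetical block-list entries, for illustration only.
BANNED_TERMS = {"example artist", "example song"}
BANNED_HASHES = {hashlib.sha256(b"bytes of a listed work").hexdigest()}


def search_allowed(query: str) -> bool:
    """Approach 1: refuse searches that contain a banned term."""
    q = query.lower()
    return not any(term in q for term in BANNED_TERMS)


def transfer_allowed(data: bytes) -> bool:
    """Approach 2: refuse files whose content matches a banned file.

    Assumption: "matches" means an identical SHA-256 digest.
    """
    return hashlib.sha256(data).hexdigest() not in BANNED_HASHES


if __name__ == "__main__":
    print(search_allowed("Example Artist live"))        # False: banned term
    print(search_allowed("holiday photos"))             # True
    print(transfer_allowed(b"bytes of a listed work"))  # False: exact match
    print(transfer_allowed(b"some unrelated file"))     # True
```

Both checks would have to run on each user's own machine, since there is no central machine to run them on.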

The real problem you face in trying to use search-term banning or content-scanning is that users will adopt countermeasures to evade the blocking. If you ban certain search terms, users will deliberately misspell their search terms or replace them with agreed-upon code words. (That’s how users evaded the search-term banning that Napster used after Judge Patel’s injunction.) If you try to scan content, users will distort or encrypt files before transmission, so that the scanner doesn’t recognize the files’ content, and the receiving user will automatically restore or decrypt the files after receiving them. If you find out what users are doing, you can fight back with counter-countermeasures; but users, in turn, will react to what you have done.
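A minimal sketch of why these countermeasures are cheap, assuming a blocker that bans literal search terms and matches files by SHA-256 digest. Both are assumptions made for illustration, and every name here (`term_blocked`, `content_blocked`, `xor_mask`) is hypothetical. A real scanner might use fuzzier matching, but the same cat-and-mouse logic applies.

```python
import hashlib

# Hypothetical block-list, for illustration only.
BANNED_TERMS = {"example song"}
LISTED_WORK = b"bytes of a listed work"
BANNED_HASHES = {hashlib.sha256(LISTED_WORK).hexdigest()}


def term_blocked(query: str) -> bool:
    """Search-term banning: block queries containing a banned term."""
    return any(term in query.lower() for term in BANNED_TERMS)


def content_blocked(data: bytes) -> bool:
    """Content scanning: block files whose digest is on the list."""
    return hashlib.sha256(data).hexdigest() in BANNED_HASHES


def xor_mask(data: bytes, key: int = 0x5A) -> bytes:
    """Trivial reversible distortion: XOR every byte with a one-byte key.

    Applying it twice with the same key restores the original bytes.
    """
    return bytes(b ^ key for b in data)


if __name__ == "__main__":
    # Countermeasure 1: a deliberate misspelling slips past the term filter.
    print(term_blocked("example song"))   # True
    print(term_blocked("exampel song"))   # False

    # Countermeasure 2: distort before sending, restore after receiving.
    masked = xor_mask(LISTED_WORK)
    print(content_blocked(LISTED_WORK))      # True
    print(content_blocked(masked))           # False
    print(xor_mask(masked) == LISTED_WORK)   # True
```

The blocker can add the misspelling or the masked digest to its list, but users can then switch to a new spelling or a new key at almost no cost.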

The result is an arms race between the would-be blockers and the users. And it looks to me like an unfavorable arms race for the blockers, in the sense that users will be able to get what they want most of the time despite spending less money and effort on the arms race than the blockers do.

The bottom line: in the short run, P2P vendors may be able to make a small dent in infringement, but in the long run, users will find a way to distribute the files they want to distribute.

Comments

  1. Not only is our image processing software too crude to be able to distinguish porn from non-porn, we ourselves can’t do it to each others’ satisfaction. If you can’t even get two people to always agree on whether or not a given image or story is porn, how could we possibly hope to make software that could do the job?

  2. One of the other issues is how open the network/protocols are. If you are decentralized, it will be difficult, if not impossible, to keep non-compliant clients off the network. Of course, this also makes maintaining a money-making network difficult as well.

  3. There are really several subissues here:

    1) A plurality of data encoding schemes.
    If I send the text of a book as plain ASCII, then blocking software searching for ASCII words will be able to match strings. But if I send the file in Unicode, EBCDIC, etc., then the tests will fail.
    JPEGs can be watermarked… but sending images in either an archaic encoded form (like PCX) or a new contrived format can make recognition of the watermarks impossible. If I pick an archaic format then I don’t even need to share code with other P2P users…

    2) Fair use and false positives
    Any attempt to key off phrases (for text) or small features (for images) will result in false positives. The solution is to increase the size of the pattern that is matched against, but this bloats the end-user application and, at the extreme, amounts to distributing the copyrighted works to the very people you don’t want to have them!
    False positives will cause even people who are non-infringers to seek more open, less restrictive software… encouraging a P2P black market. (I think there’s a relevant analogy here between police radar and radar detectors.)

    3) Common carrier laws
    I’m not a lawyer, but I was under the impression that as long as a service provides open transport and doesn’t examine the content that it’s moving, then it is shielded from issues relating to its traffic. But once the service starts moderating, then anything that slips by its imperfect moderation is fair game for civil and criminal action.
    This type of screening for copyrighted content would be moderation. And while other parts of the law may resolve the carrier’s liability for contributory copyright infringement, what about the other 98% of the law? If person A allows P2P users to download a text file instructing them how to defraud a company or individual, but the text file isn’t copyrighted, is the carrier now liable because it was screening the content but didn’t block this file?

    4) Porn is mental metadata
    There is no attribute of an image that indicates porn or not porn; it’s an inferred quality that derives more from the user’s frame of reference (self and culture) than from the actual image data. A conventional “American” definition of porn would probably be far different from a Saudi definition.

  4. Cypherpunk says:

    According to http://en.wikipedia.org/wiki/FastTrack, “proprietary FastTrack clients are configured to automatically download software updates, making it easy to change the protocol.” This provides an entry point where these networks could gain considerable control over the actions of their customers, including installing filtering software. Apparently this technique has been used in the past to shut out Morpheus and giFT clients from the network.

    Another source of control is the so-called supernode. “In order to be able to initially connect to the network, a list of supernode IP numbers is hardcoded in the program.” These special supernodes would be an ideal location to install filters as they are probably used by the great majority of clients. The supernode protocol is said to still be proprietary, keeping this important resource under control of the companies which created the networks.

    As for the arms race, I used Napster in the days of misspellings, and it greatly decreased the usability of the network. I fear you are guilty of binary thinking here, that if you can’t shut the network down completely then nothing has been gained. It may well be sufficient to make the system hard to use. Once people have to resort to guessing spellings, it becomes a real chore to find what you want, especially as the misspelled songs are gradually swept from the system (as happened with Napster).

    I am disappointed that your analysis has very predictably aligned itself with the interests of the group that you support. An objective analyst would include points favorable to both sides of the issue. Is your goal in writing here to provide advocacy or unbiased commentary?

  5. As far as I can see, the experiences with Napster can’t really be extended to the general case of decentralized P2P networks, especially less proprietary ones. New P2P clients may very well just send encrypted data blocks around without any association to file content, and have the lookup service for a specific search completely independent of the P2P network itself (like torrent search engines for BitTorrent). This could already be implemented in any of the existing file sharing networks.

    The key and search metadata can be accessed through anonymizers, while the bulk file share clients can always deny direct or indirect infringement because they do not have the keys and cleartext (as opposed to the ciphertext blocks that are shared) present. I don’t see an easy way around that for the media industry. I agree that it will put a big hurdle up for the casual user, but that’s not really the group that does the damage to the industry, is it?

    Also, the newer P2P networks do not use supernode servers (think Overnet), or place them under user control (BitTorrent); therefore that part of your argument seems to be limited to the early generation of P2P networks.

    In addition, the current behavior of the media industry in trying to hinder or prevent commercial P2P endeavours looks pretty counterproductive to me. By working against these corporate efforts, the incentive to write independent/uncontrollable P2P clients is increased and the arms race places additional monetary strain on the media industry itself, instead of making use of the new distribution channel that’s offered for free. I agree with you that FastTrack could easily implement a pay-for-download scheme for legal content with a forced update, and the success of iTunes pretty much spells out that this can be profitable for both the P2P network provider and the media industry.
    But for that, I guess the latter would have to accept losing control over the strictly regulated distribution process.

  6. Cypherpunk builds up a strawman argument about one particular implementation of a peer-to-peer protocol and then asks why it was not provided. The answer is that there is no point in using strawmen. Nothing requires P2P protocols (generically) to contact a central server. The point of P2P networks is that they can be decentralized, and many are. Requiring a central server controlled by the recording industries means that local area networks could not use P2P protocols detached from the public Internet. That is a ridiculous limitation to place on a generic technology. The arguments being put forward by Prof. Felten are objective and unbiased.

  7. Both Cypherpunk and Fuzzy are correct, to a degree.

    Cypherpunk states that the circumvention schemes will impose a burden on the user, either in direct terms (having to guess misspellings) or indirect terms (having to constantly patch his client to keep up in the arms race).

    However, the law as proposed does not accept a level of grayness – if enacted, then P2P networks must, by law, be all white.

    So, even if it is difficult, even if the number of illegal sharers is reduced to 10% or even 1% of the number of users that exist now, then this will still be enough for the powers that be to bring out the heavy guns against the ISPs that host them.

    Sean.

  8. inspiration says:

    All you need to do is use Kazaa Lite K++. They can’t even see your IP, all the spyware has been taken out, and they can’t update the software remotely. K-Lite has been closed down because Kazaa is suing them, but you can still find it if you look hard enough.
    😉