June 23, 2017

Conscientious Objection in P2P

One argument made against using P2P systems like Grokster was that by using them you might participate in the distribution of bad content such as infringing files, hate speech, or child porn. If you use the Web to distribute or read content, you play no part in distributing anything you find objectionable – you only distribute a file if you choose to do so. P2P, the argument goes, is different.

Today I want to consider what you can do if you want to use P2P to access files, but you want to avoid participating in any way in the distribution of bad files. When I say a file is “bad” I mean only that you, personally, have a strong moral objection to it, so that you do not want to participate in its distribution. Different people will have different ideas about which files (if any) are bad. Saying that a file is bad is not the same as saying that it should be banned or that others should not be allowed to distribute it – choosing not to do something yourself is not the same as banning others from doing it. So this is not about censorship.

The original design of BitTorrent was friendly to those who wanted to avoid distributing bad files. You could distribute any files you liked, and by default you would automatically redistribute any file that you had downloaded. But you wouldn’t find yourself distributing any bad files (unless you downloaded bad files yourself), or even helping anybody find bad files. Others could read or publish what they wanted, but you wouldn’t help them unless you wanted to.

This is unlike Grokster or Gnutella, where your computer would (by default at least) help to construct an index that would help people find files of all types, including some bad files. You might think that’s fine and choose to participate in it, but then again you might be unhappy if the proportion of bad files that you were helping to index was too high for your taste, or their content too vile. Because BitTorrent didn’t have a built-in index, you could use it without running into this issue.

But then, about ten months ago, a new “trackerless” version of BitTorrent came along. This version had a big distributed index, provided cooperatively by the computers of everybody who was using BitTorrent. After this change, if you were using BitTorrent, you were helping to index files. (Strictly speaking, you would be providing “tracker information” for the files; I’m using “index” as shorthand.) Some of those files might be bad.

To be precise, you would be helping to index a small, and randomly chosen, subset of all the BitTorrent files in the world. And if it came to your attention that one of those files was bad, you could choose not to participate in indexing it, by simply refusing to respond to index queries about that file. Standard BitTorrent software doesn’t support this refusal tactic, but the tactic is possible given how the BitTorrent protocol is designed.

Your refusal to provide index information for a file would not, by itself, make the file unavailable. BitTorrent stores index information redundantly, so other people could answer the index queries that you refused to answer. Only if all (or too many) of the people assigned to index a file refused to do so would that file disappear.
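
The refusal tactic, and the redundancy that limits its effect, can be sketched in a few lines of Python. This is a toy model, not BitTorrent's actual code: `Node`, `responsible_nodes`, `lookup`, and `simulate` are hypothetical names, the key is a stand-in SHA-1 hash of a made-up string, the peer address is a documentation placeholder, and the replica count of five is an assumption. Picking the responsible nodes by XOR closeness follows Kademlia's notion of distance; a real client would find those nodes through iterative network queries rather than by sorting a global list.

```python
import hashlib

REPLICAS = 5  # assumption: how many nodes store each key


def node_id(name: str) -> int:
    """Hypothetical helper: derive a 160-bit ID from an arbitrary string."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class Node:
    """A toy DHT participant that may refuse to serve particular keys."""

    def __init__(self, name: str):
        self.id = node_id(name)
        self.refused = set()  # keys this node declines to store or answer for
        self.store = {}       # key -> set of peer addresses

    def announce(self, key: int, peer: str) -> bool:
        """Record that `peer` is distributing `key`, unless the key is refused."""
        if key in self.refused:
            return False
        self.store.setdefault(key, set()).add(peer)
        return True

    def get_peers(self, key: int) -> set:
        """Answer an index query, or stay silent (empty set) for refused keys."""
        if key in self.refused:
            return set()
        return set(self.store.get(key, ()))


def responsible_nodes(key: int, nodes: list, k: int = REPLICAS) -> list:
    """The k nodes whose IDs are closest to the key by XOR distance."""
    return sorted(nodes, key=lambda n: n.id ^ key)[:k]


def lookup(key: int, nodes: list) -> set:
    """Union of the answers from every node responsible for the key."""
    found = set()
    for node in responsible_nodes(key, nodes):
        found |= node.get_peers(key)
    return found


def simulate(num_refusing: int, num_nodes: int = 50) -> set:
    """Announce one peer for one key while some responsible nodes refuse."""
    nodes = [Node(f"node-{i}") for i in range(num_nodes)]
    key = node_id("example.torrent")  # stand-in for the real hash key
    jurors = responsible_nodes(key, nodes)
    for juror in jurors[:num_refusing]:
        juror.refused.add(key)
    for juror in jurors:
        juror.announce(key, "198.51.100.7:6881")  # placeholder address
    return lookup(key, nodes)
```

Under these assumptions, `simulate(4)` still returns the peer, because one willing node is enough to answer the query, while `simulate(5)` returns an empty set.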

If lots of people started refusing to index files they thought were bad, this would amount to a kind of jury system, in which each file was assigned to a random set of BitTorrent “citizens” who voted (by indexing, or refusing to do so) on whether the file should be available. If too many jurors voted to suppress a file, it would disappear.
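
To get a feel for how hard it would be for such a jury to suppress a file, here is a back-of-the-envelope calculation. It assumes, hypothetically, that each juror refuses independently with the same probability, that a file has five jurors, and that the file disappears only when at least `needed` of them refuse. None of these numbers comes from the BitTorrent specification, and `prob_suppressed` is a name made up for this sketch.

```python
from math import comb


def prob_suppressed(p_refuse: float, jurors: int = 5, needed: int = 5) -> float:
    """Probability that at least `needed` of `jurors` refuse to index a file.

    Assumes each juror refuses independently with probability `p_refuse`,
    so the number of refusals follows a binomial distribution.
    """
    return sum(
        comb(jurors, r) * p_refuse**r * (1 - p_refuse) ** (jurors - r)
        for r in range(needed, jurors + 1)
    )
```

With these assumptions, if half of all users object to a file and all five jurors must refuse, `prob_suppressed(0.5)` is 0.5 to the fifth power, or about 3 percent. If only one user in ten objects, the chance drops to one in a hundred thousand.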

By now, some of you are jumping up and down, shaking your fingers at me. This is an affront to free speech, you’re saying – every file should be available to everybody. To which I reply: don’t blame me. This is the way BitTorrent is designed. By switching to the trackerless protocol, BitTorrent’s designers created this possibility. And the problem – if you consider it one – can be fixed. How to fix it is a topic for another day.

Comments

  1. > By now, some of you are jumping up and down, shaking your fingers at me. This is an affront to free speech, you’re saying — every file should be available to everybody. To which I reply: don’t blame me.

    What??? It is no such thing. It is about some people choosing not to speak about certain subjects. The right to remain silent is just as essential as the right to speak.

  2. Ed, my understanding of the trackerless protocol, as documented at http://www.bittorrent.org/Draft_DHT_protocol.html, is that it does not “index” files in the way I would normally think of that term.

    Given a hash of a file (which you must learn because you are actively seeking that particular file) the protocol can help you find peers that have chosen to distribute that file (again only because they have chosen to do so) and you can exchange blocks of the file with those peers.

    The only unwitting behaviour is the participation in the mapping from the hash to the peers. In the earlier version of the protocol, only the tracker program would introduce peers, though it does not participate in exchanging file blocks.

    In truth, any read/write store on the net (even blog comments) could be used as a means for peers to register availability of a given hash and for others to find them, so I’m not sure any great leap has been taken.

    (NB I am affiliated with BitTorrent Inc., though I am not involved in the protocol design, so don’t take my words as authoritative.)

  3. Normally I’m pro free speech. Sometimes I even jump around.
    But some things really are bad, no qualifiers. Not very many, perhaps a very small number. I don’t want to invoke a debate as to what is in that set of bad things, but I do want to assert that it is right (and moral) to assert that set exists. I’ll even go as far as to say that I think you could have asserted that more strongly. Granted, that wasn’t what the piece was about.

    The description of the problem and the new opportunities present in BitTorrent was enlightening and well written (thank you). This is great; it gives me an excellent reference I can point my peers to for why it is a good solution.

  4. the_zapkitty says:

    The interesting part is that BitTorrent, unlike other P2P networks, was never intended for use as an illegal file channel, and has real uses independent of that.

    I myself use it to get new releases of various Linux distros. But when the distributed tracking network was added I opted out. You can too.

    It’s possible, but it is not easy or obvious from within most BT clients… and it should be.

    This opt-out isn’t a difficult option to build into clients as it doesn’t require the per-file granularity that Ed’s selective filtering would. Azureus, for example, gives menu options for switching distributed tracking off. (And for the pedantic: yes, Azureus uses a different distributed tracker process than the official BT client, but the principles involved are the same)

    An option for clients which don’t enable you to opt out is to block UDP use on the ports you use for BT… which, again, many users wouldn’t regard as easy or obvious.

  5. Jesse Weinstein says:

    Brad –
    While your clarification is useful and, I assume, quite accurate, it doesn’t really address what I understood Ed’s point to be: that prior to the trackerless protocol, a standard BitTorrent client only handled information related to the files its user was downloading or had downloaded, whereas now, with the trackerless protocol, a standard client also handles information related to files in which the user has no interest, and whose distribution the user may even oppose. From what you said, this is true.

    Possible misuses of the term “indexing” aside, Ed’s point remains.

  6. Brad,

    You’re right that I was using the word “index” sloppily. That’s what happens when I write on Friday afternoon. What I wanted was a shorthand word for the information needed to find pieces of a file.

    Anyway, here’s a more detailed and technical explanation of how it works:

    All of the BitTorrent peers cooperate to implement a distributed hash table (DHT) structure, using the Kademlia DHT algorithm. To get a file, you first get the corresponding torrent file, which contains information such as the length of the file and the hashes of the data blocks that make up the file. Using information from the torrent file, you compute a hash key. Under that key, the DHT stores a list of peers who are currently participating in distribution of the file you want. You can then contact those peers and get the file’s contents.

    I was using “index” as a shorthand for the DHT, and “index information about a file” as shorthand for the information in the DHT about which peers have parts of the file. A peer who objects to a file can refuse to store DHT entries relating to that file, assuming the peer has responsibility for storing those DHT entries. Information in the DHT is replicated (five ways, if I recall correctly), and the peers who will store replicas for a file are chosen in a randomish fashion. Hence the “random jury” effect.

  7. First, I’m curious under the trackerless model, whether it’s possible to query which computers have been assigned to index which files.
    I think this brings forward the philosophical question of anonymous speech in the BitTorrent environment. If there is an idea that no one is willing to take direct responsibility to index, you’ve clearly pointed out how that idea could be censored.
    There are valuable forms of speech (my definition of value here is based in use) which people may be willing to download and enjoy, but unwilling to distribute or index. This creates an interesting double standard in the system, where one can enforce a moral censorship without subscribing to it.
    Also, given the restrictive nature of our current copyright system, there could be legal ramifications for merely choosing not to opt out of indexing a file.

  8. the_zapkitty says:

    Brian Says:
    “First, I’m curious under the trackerless model, whether it’s possible to query which computers have been assigned to index which files.”

    I don’t think so. As was noted above, there are no top-level indexes to query.

    The distributed tracker is geared to query the nodes known to your client for a hash close to, but not necessarily identical to, the hash of the specific file being sought. Then the tracker walks those nodes to find further nodes with hashes close to or identical to the target hash.

    So, as currently implemented, you can’t use the distributed tracker to pull out a list of “suspect” PC’s… all you can do is locate individual PC’s hosting “suspect” files.

    Which the MPRIAA already does with its torrent moles and then tries to sue into oblivion.

  9. the_zapkitty says:

    I should sleep before posting…. I’m losing the war against apostrophes… 🙂

  10. Indeed, the peers are participating in the system that maps hashes of files to peers which are serving those files.

    However, I would suggest their participation is similar in nature to something like a DNS server. The operators of the major TLD DNS servers are of course, participating constantly in the finding of files which violate copyright or which they may oppose. Their action, however, is mostly viewed as a neutral one, and we can be thankful for that. Indeed, we get upset when people try to interfere with the neutrality of it.

    However, they certainly run those servers voluntarily, as do the search engines (which really do index) and it is fair to say that people who operate a DHT node should be aware of what that means. DHT directory systems have been proposed frequently, even as alternatives to DNS, and one would not find it a particularly strong argument to not participate in such a system because your computer would be pointing people to web sites you don’t like.

  11. Brad,

    I’m not making an argument about whether people should or shouldn’t opt out of full participation in the BitTorrent DHT. The arguments you make against opting out do have some force.

    I just think the ability of participants to opt out is an interesting feature of the new BitTorrent protocol, which was not present in the old protocol.

  12. Irrespective of the “moral” issues of being a tracker (via the DHT) for a torrent to which you might object, there are also incentives issues. By offering your services to the DHT to help others, in return the DHT helps you.

    It’s worth pointing out that the DHT is largely unused today (we’ve been collecting traces as part of ongoing research – nothing ready to publish yet but hopefully soon), and that clients don’t reach for the DHT unless the official tracker is unavailable. Furthermore, with most modern firewalls, NAT boxes, and so forth, the DHT will *never* work for some large number of Internet users (likewise, BitTorrent has trouble with firewalls and NATs).

    It’s also worth pointing out that there’s no enforcement to guarantee that DHT clients are also DHT participants, so a client could trivially refuse to honor requests to its node (i.e., DHT freeloading). It’s certainly possible to imagine engineering incentives into the DHT to make it an all-or-nothing game. Either you’re playing ball, or the DHT won’t honor your requests.

    It may also be worth reminding everyone that it has generally been an easy matter to opt out of the indexing duties in prior protocols (FastTrack, Gnutella), since users may simply check “do not act as a super node.” Arguably, that solves the problem for users of those protocols who do not want to contribute to the distribution of “bad” files (while still contributing storage and bandwidth for other files, namely those in their own shared directories).

    Brad makes an interesting point about the fact that the computers that make up the DNS system may be contributing to the distribution of “bad” files every bit as much as those that maintain DHTs. But I’m not sure that he is correct that everyone would agree that DNS is a “neutral” service. I’m aware of at least one instance already where a copyright owner has demanded that a particular DNS record be deleted because the domain was persistently associated with infringing activity. I expect similar efforts in the future.

  14. Fred,

    Many systems give the user a single choice to opt out of all indexing activity. The new BitTorrent lets people opt out of indexing specific files, which is more interesting. DNS is at an intermediate point: it lets servers or caches opt out of helping to find specific machines, but each such machine might have many files.