October 9, 2024

Privacy, Recording, and Deliberately Bad Crypto

One reason for the growing concern about privacy these days is the ever-decreasing cost of storing information. The cost of storing a fixed amount of data seems to be dropping at the Moore’s Law rate, that is, by a factor of two every 18 months, or equivalently a factor of about 100 every decade. When storage costs less, people will store more information. Indeed, if storage gets cheap enough, people will store even information that has no evident use, as long as there is even a tiny probability that it will turn out to be valuable later. In other words, they’ll store everything they can get their hands on. The result is that more information about our lives will be accessible to strangers.

(Some people argue that the growth in available information is on balance a good thing. I want to put that argument aside here, and ask you to accept only that technology is making more information about us available to strangers, and that an erosion of our legitimate privacy interests is among the consequences of that trend.)

By default, information that is stored can be accessed cheaply. But it turns out that there are technologies we can use to make stored information (artificially) expensive to access. For example, we can encrypt the information using a weak encryption method that can be broken by expending some predetermined amount of computation. To access the information, one would then have to buy or rent sufficient computer time to break the encryption method. The cost of access could be set to whatever value we like.

(For techies, here’s how it works. (There are fancier methods. This one is the simplest to explain.) You encrypt the data, using a strong cipher, under a randomly chosen key K. You provide a hint about the value of K (e.g. upper and lower bounds on the value of K), and then you discard K. Reconstructing the data now requires doing an exhaustive search to find K. The size of the search required depends on how precise the hint is.)
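(A minimal sketch of the trick just described, for readers who want to see it run. The function names `seal` and `open_by_search`, the `work_bits` parameter, and the key-check hash are illustrative assumptions, not part of any standard scheme. The toy SHA-256 keystream stands in for a real strong cipher such as AES, which Python's standard library does not provide.)

```python
import hashlib
import secrets

KEY_BYTES = 16  # 128-bit key; an assumption made for this sketch


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || offset) blocks.

    A stand-in for a real strong cipher; do not use it in production.
    Applying it twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


def key_check(key: bytes) -> bytes:
    """Hash of the key, stored so a searcher can recognize the right key."""
    return hashlib.sha256(b"key-check" + key).digest()


def seal(plaintext: bytes, work_bits: int):
    """Encrypt under a random key K, keep only a hint about K, discard K."""
    k = secrets.randbits(8 * KEY_BYTES)
    lower = (k >> work_bits) << work_bits   # clear the low work_bits bits
    upper = lower + (1 << work_bits) - 1    # hint: lower <= K <= upper
    key = k.to_bytes(KEY_BYTES, "big")
    ciphertext = keystream_xor(key, plaintext)
    return ciphertext, (lower, upper), key_check(key)  # K itself is not returned


def open_by_search(ciphertext: bytes, hint, check: bytes) -> bytes:
    """Recover the plaintext by exhaustive search over the hinted range."""
    lower, upper = hint
    for candidate in range(lower, upper + 1):
        key = candidate.to_bytes(KEY_BYTES, "big")
        if key_check(key) == check:
            return keystream_xor(key, ciphertext)
    raise ValueError("key not found in the hinted range")


if __name__ == "__main__":
    ct, hint, check = seal(b"snapshot taken at noon", work_bits=16)
    print(open_by_search(ct, hint, check))
```

(Each extra bit of `work_bits` doubles the size of the range to be searched, which is how the cost of access gets set. With `work_bits=16` the search covers at most 65,536 candidate keys and finishes almost instantly. A real deployment would pick a much larger value.)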

This method has many applications. For example, suppose the police want to take snapshots of public places at fixed intervals, and we want them to be able to see any serious crimes that happen in front of their cameras, but we don’t want them to be able to browse the pictures arbitrarily. (Again, I’m putting aside the question of whether it’s wise for us to impose this requirement.) We could require them to store the pictures in such a way that retrieving any one picture carried some moderate cost. Then they would be able to access photos of a few crimes being committed, but they couldn’t afford to look at everything.

One drawback of this approach is that it is subject to Moore’s Law. The price of accessing a data item is paid not in dollars but in computing cycles, a resource whose dollar cost is cut in half every 18 months. So what is expensive to access now will be relatively cheap in, say, ten years. For some applications, that’s just fine, but for others it may be a problem.
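To put rough numbers on this, here is a small sketch of the arithmetic. It assumes the 18-month halving period used above holds steadily, which is an idealization. The function names are made up for illustration.

```python
def access_cost(initial_cost: float, years: float,
                halving_months: float = 18.0) -> float:
    """Dollar cost of a fixed amount of computation after `years`,
    assuming the price halves every `halving_months` months."""
    halvings = years * 12.0 / halving_months
    return initial_cost / 2.0 ** halvings


def extra_key_bits(years: float, halving_months: float = 18.0) -> float:
    """Extra bits of key search needed to keep the dollar cost constant
    for `years`: one additional bit per halving of the compute price."""
    return years * 12.0 / halving_months


if __name__ == "__main__":
    print(access_cost(100.0, 10))   # cost in ten years of a search that costs $100 today
    print(extra_key_bits(10))       # bits to add today to offset ten years of price drops
```

Under that assumption, a search that costs $100 today costs a little under $1 in ten years. Offsetting a decade of price drops takes about 6.7 extra bits of search.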

Sometimes this drop in access cost may be just what you want. If you want to make a digital time capsule that cannot be opened now but will be easy to open 100 years from now, this method is perfect.

Comments

  1. Ed, I agree that the technology you describe has its uses. But as a computer security professional, I spend a substantial amount of my time (as do you, I suspect) gently leading people towards careful analysis of their security needs and requirements, and away from some flashy technological “silver bullet” that they’ve fallen for after some huckster’s hard-sell, or misunderstood from technical material–or latched onto after reading about it in a well-intentioned-but-incomplete popular treatment.

    Somebody (possibly Butler Lampson, Peter Neumann, or the late Roger Needham) once said something to the effect that “if you think cryptography is the solution to your problem, you either don’t understand the cryptography or you don’t understand your problem.” I wouldn’t go as far as that myself, but I do believe that cryptography is too often invoked either to dodge the difficult work of problem definition, or to force a particular not-quite-correct problem definition, or to disguise the difficulty (or even impossibility) of solving an already well-defined problem. Hence my impulse to ask tough, skeptical questions whenever I see a cryptographic technique presented in a way that some might misinterpret as attributing to it broader, more powerful or more revolutionary properties than it actually has.

  2. Dan:

    This problem — how to balance law enforcement access to information against privacy — is not the sort that can ever be solved completely.

    As often happens, using crypto lets us trade one set of problems for another set that we might like better. Here, the cost that law enforcement must pay to get access, which was previously paid in courtroom hassle and the risk of misbehavior by the human gatekeepers that hold the data, is now paid in dollars. And you give up the ability to change the access rules later (as you could do in the current system).

    Reasonable people could debate whether the change would be an improvement. I tried to put that question aside in my original post. My goal in writing the post was simply to teach non-expert readers about this cryptographic trick, which does have its uses.

  3. Okay, let’s see if I’ve understood the scheme outlined by the posting and its subsequent commentators: law enforcement authorities collect surveillance data, and agree to store it in a manner (encrypted with searchable keys) such that it can only be accessed slowly, with a fair bit of effort (searching the key space for each key). In an emergency, an authority (the key escrow official) can make it possible to access the material more quickly and easily, and methods (such as secret-sharing) can be used to require that just cause has been established for immediate access.

    This is supposed to be much better than the old, untrustworthy, insecure, non-cryptographic approach to the problem, which typically works roughly as follows: law enforcement authorities collect surveillance data, and agree to store it in a manner (locked in a controlled-access computer room) such that it can only be accessed slowly, with a fair bit of effort (obtaining a court order for each access). In an emergency, an authority (the guy with the key to the computer room) can make it possible to access the material more quickly and easily, and methods (such as judicial review) can be used to require that just cause has been established for immediate access.

    And cryptography’s contribution to improving the solution is…?

  4. Unfortunately, Ed’s scenario establishes a fundamental difference in privacy rights. For someone with a lot of money, purchasing the computing power will be a trivial expense. For someone with less money, it may be insurmountable.

    Such a difference can establish two classes of citizen — those with the ability to search for information related to other citizens and those without the ability.

  5. Another idea I like along these lines is Matt Blaze’s Oblivious Key Escrow. The idea is that data gets encrypted and shares distributed to the public at large, such that only if some hundreds or thousands of people agree can it be decrypted. In a surveillance context, if a crime were committed at a certain time and place the public would cooperate to assist in decrypting the necessary data records.

    However, I think that on this topic it is premature to focus on mechanism when policy is so much more pressing an issue. Civilians may be able to limit official police surveillance, but what about private surveillance? What limitations does society have the right to impose on what a person can record in public? That’s much less clear. If enough people engage in private surveillance and make their records available, then we either need to criminalize that activity or else accept the existence of de facto universal public surveillance. Some of David Brin’s science fiction novels depict this kind of situation, where you get the effect of universal surveillance simply by private, voluntary actions.

  6. Paul, I don’t think your solution to Moore’s Law would be completely effective. What would stop someone who for whatever reason decided to cache all the hints today, and use them in a couple of years?
    An abuser would, admittedly, have to be pre-meditating the abuse, but for a valuable target, maybe it would be worth it.

  7. Oops, that’s probably what you were alluding to when you said “there are fancier methods”!

  8. Ed’s idea and TimH’s idea are not incompatible. Instead of disposing of the key, the system could write out a fully usable copy on a token and the token can be given to a third party who is generally trusted to log its use. Need slow, unlogged access: use brute force. Need quick, logged access: ask for the token.

    There is also a solution to the Moore’s Law problem. Make the key very long and have a trusted box that emits bits of hint over time. The box emits the hints more slowly every year to compensate for the speed of the processors.

  9. The problem is that the data being concealed needs to be really well indexed to make it genuinely unnecessary for human users to feel they have to eyeball the data.

    Also, NSA/CIA/FBI etc. will either retain instant access to the raw data, or run processes that cache the raw data for them. Who would know? So the protection is really only effective against individual bad-cop-type behaviour: passing data on to media, PIs, etc.

    I would prefer to install a really secure logging scheme instead, that ensures that every data access is logged against a specific individual. It’s accountability for abuses I’m after, not an attempt to limit the abuses.