Sometimes the same idea comes flying at you from several directions at once, and you start seeing that idea everywhere. This has been happening to me lately with needle-in-a-haystack problems, a concept that is useful but often goes unrecognized.
A needle-in-a-haystack problem is a problem where the right answer is very difficult to determine in advance, but it’s easy to recognize the right answer if someone points it out to you. Faced with a big haystack, it’s hard to find the needle; but if someone tells you where the needle is, it’s easy to verify that they’re right.
This idea came up in a class discussion of Beth Noveck’s Wiki-Government paper. We were talking about how government can use crowdsourcing to learn from outside experts, and somebody pointed out that crowdsourcing can work well for needle-in-a-haystack problems. If somebody in the crowd knows the answer, they can announce it, and other participants in the discussion will recognize that the proposed answer is correct. This is the basis for the open-source maxim that “given enough eyes, all bugs are shallow”. It’s also the theory underlying the Peer to Patent project, in which distributed experts tried to find the best prior art document that would invalidate a patent application.
The idea came up again in a discussion of online passwords. A highly secure system doesn’t remember your password, because an attacker who managed to observe the stored password could impersonate you. But if the system doesn’t store your password, how can it verify that the password you enter is correct?
The answer is that the system can create a needle-in-a-haystack problem to which your password is the only right answer. By remembering this problem rather than your password, the system will be able to recognize your password when it sees it, but an attacker who learns what the problem is will have great difficulty determining your password. There are standard cryptographic tricks for generating this kind of needle-in-a-haystack problem.
The idea came up a third time when I read a paper with the eye-catching title “Why Most Published Research Findings Are False“. The paper explains why the vast majority of research findings in a field might be false, even if the researchers do everything right; and it suggests that this is the case for some areas of medical research.
To see how this might happen, imagine a needle-in-a-haystack problem that has one right answer and a million wrong answers. Suppose that our procedure for verifying the right answer when we see it is not flawless but instead is wrong 0.01% of the time. Now if we examine all of the 1,000,000 wrong answers, we will incorrectly classify 100 of them (0.01% of 1,000,000) as correct. In total, we’ll find 101 “correct” answers, only one of which is actually correct. The vast majority of “correct” answers — more than 99% of them — will be wrong.
Any research field that is looking for needles in haystacks will tend to suffer from this problem, especially if it relies on statistical studies, which by definition can never yield absolute 100% confidence.
Which brings us back to crowdsourcing. If we’re not perfect at recognizing proposed answers as correct — and no human process is perfect — then a crowdsourcing approach to needle-in-a-haystack problems could well yield wrong answers most of the time.
How can we avoid this trap? There are only two ways out: reduce the size of the haystack, or improve our procedure for evaluating candidate answer. Here crowdsourcing could be helpful. If we’re lucky, vigorous discussion within the crowd will help to smoke out the false positives. It’s not just access to experts that helps, but dialog among the experts.