August 23, 2017

The Stock-market Flash Crash: Attack, Bug, or Gamesmanship?

Andrew wrote last week about the stock market’s May 6 “flash crash”, and whether it might have been caused by a denial-of-service attack. He points to a detailed analysis by nanex.com that unpacks what happened and postulates a DoS attack as a likely cause. The nanex analysis is interesting and suggestive, but I see the situation as more complicated and even more interesting.

Before diving in, two important caveats: First, I don’t have access to raw data about what happened in the market that day, so I will accept the facts as posited by nanex. If nanex’s description is wrong or incomplete, my analysis won’t be right. Second, I am not a lawyer and am not making any claims about what is lawful or unlawful. With that out of the way …

Here’s a short version of what happened, based on the nanex data:
(1) Some market participants sent a large number of quote requests to the New York Stock Exchange (NYSE) computers.
(2) The NYSE normally puts outgoing price quotes into a queue before they are sent out. Because of the high rate of requests, this queue backed up, so that some quotes took a (relatively) long time to be sent out.
(3) A quote lists a price and a time. The NYSE determined the price at the time the quote was put into the queue, and timestamped each quote at the time it left the queue. When the queues backed up, these quotes would be “stale”, in the sense that they had an old, no-longer-accurate price — but their timestamps made them look like up-to-date quotes.
(4) These anomalous quotes confused other market participants, who falsely concluded that a stock’s price on the NYSE differed from its price on other exchanges. This misinformation destabilized the market.
(5) The faster a stock’s price changed, the more out-of-kilter the NYSE quotes would be. So instability bred more instability, and the market dropped precipitously.
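Point (5) can be made concrete with a rough first-order approximation. This is an illustration for intuition only, not something taken from the nanex data; the symbols are assumptions introduced here. Suppose a quote waits in the queue for a time \Delta t, and the true price P is moving at a roughly constant rate over that interval. Then the price carried by the quote differs from the true price at the stamped time by about:

```latex
\[
  \bigl| P_{\text{quoted}} - P(T_{\text{stamp}}) \bigr|
  \;\approx\; \left| \frac{dP}{dt} \right| \, \Delta t
\]
```

Both factors grow during a crash: prices move faster, and heavier traffic lengthens the queue delay, so the error compounds.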

The first thing to notice here is that (assuming nanex has the facts right) there appears to have been a bug in the NYSE’s system. If a quote goes out with price P and time T, recipients will assume that the price was P at time T. But the NYSE system apparently generated the price at one time (on entry to the queue) and the timestamp at another time (on exit from the queue). This is wrong: the timestamp should have been generated at the same time as the price.
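To make the discrepancy concrete, here is a minimal Python sketch of the two behaviors. It is a toy model under stated assumptions, not the NYSE's actual code: the class names, the `Quote` type, and the explicit `now` parameter are all hypothetical, chosen only to show where the timestamp is taken.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Quote:
    price: float
    timestamp: float  # the time at which recipients will assume the price held


class BuggyQuoteQueue:
    """Hypothetical model of the behavior nanex describes:
    price captured on entry, timestamp applied on exit."""

    def __init__(self):
        self._pending = deque()

    def enqueue(self, price, now):
        # Only the price is stored; the entry time 'now' is thrown away.
        self._pending.append(price)

    def dequeue(self, now):
        # The stamp reflects send time, not the time the price was valid.
        return Quote(price=self._pending.popleft(), timestamp=now)


class FixedQuoteQueue:
    """Same queue, but price and timestamp are captured together."""

    def __init__(self):
        self._pending = deque()

    def enqueue(self, price, now):
        self._pending.append(Quote(price=price, timestamp=now))

    def dequeue(self, now):
        # Send time is ignored; the quote keeps its original stamp.
        return self._pending.popleft()
```

With the buggy version, a price captured at time 0 and sent at time 5 goes out stamped 5; with the fixed version it goes out stamped 0, so a recipient can at least see that it is old.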

But notice that this kind of bug won’t cause much trouble under normal conditions, when the queue is short so that the timestamp discrepancy is small. The problem might not have been noticed in normal operation, and might not be caught in testing, unless the testing procedure takes pains to create a long queue and to check for the consistency of timestamps with prices. This looks like the kind of bug that developers dread, where the problem only manifests under unusual conditions, when the system is under a certain kind of strain. This kind of bug is an accident waiting to happen.
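A small simulation shows why ordinary testing could miss this. The sketch below is a toy model with made-up parameters (discrete "ticks", fixed arrival and send rates), not a description of the real exchange. It measures the largest gap between when a quote's price was captured and when the quote left the queue.

```python
from collections import deque


def max_staleness(arrivals_per_tick, sends_per_tick, ticks):
    """Return the largest gap, in ticks, between the moment a quote's
    price was captured and the moment the quote left the queue."""
    pending = deque()  # each entry is the tick at which a price was captured
    worst = 0
    for tick in range(ticks):
        # New quotes arrive and capture the current price.
        for _ in range(arrivals_per_tick):
            pending.append(tick)
        # The system sends out as many quotes as it can this tick.
        for _ in range(min(sends_per_tick, len(pending))):
            captured_at = pending.popleft()
            worst = max(worst, tick - captured_at)
    return worst


# Normal load: the system keeps up, so quotes leave in the tick they arrive.
normal = max_staleness(arrivals_per_tick=5, sends_per_tick=10, ticks=100)

# Overload: arrivals outpace sends, the backlog grows, and so does staleness.
overloaded = max_staleness(arrivals_per_tick=20, sends_per_tick=10, ticks=100)
```

Under normal load the gap stays at zero, so a test suite that never overloads the queue would see nothing wrong. Only when arrivals exceed the send rate does the gap open up, and it keeps widening for as long as the overload lasts.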

To see how the accident might develop and be exploited, let’s consider the behavior of three imaginary people, Alice, Bob, and Claire.

Alice knows the NYSE has this timestamping bug. She knows that if the bug triggers and the NYSE starts issuing dodgy quotes, she can make a lot of money by exploiting the fact that she is the only market participant who has an accurate view of reality. Exploiting the others’ ignorance of real market conditions—and making a ton of money—is just a matter of technique.

Alice acts to exploit her knowledge, deliberately triggering the NYSE bug by flooding the NYSE with quote requests. The nanex analysis implies that this is probably what happened on May 6. Alice’s behavior is ethically questionable, if not illegal. But, the nanex analysis notwithstanding, deliberate triggering of the bug is not the only possibility.

Bob also knows about the bug, but he doesn’t go as far as Alice. Bob programs his systems to exploit the error condition if it happens, but he does nothing to cause the condition. He just waits. If the error condition happens naturally, he will exploit it, but he’ll take care not to cause it himself. This is ethically superior to a deliberate attack (and might be more defensible legally).

(Exercise for readers: Is it ethical for Bob to deliberately refrain from reporting the bug?)

Claire doesn’t know that the NYSE has a bug, but she is a very careful programmer, so she writes code that watches other systems for anomalous behavior and ignores systems that seem to be misbehaving. When the flash crash occurs, Claire’s code detects the dodgy NYSE quotes and ignores them. Claire makes a lot of money, because she is one of the few market participants who are not fooled by the bad quotes. Claire is ethically blameless — her virtuous programming was rewarded. But Claire’s trading behavior might look a lot like Alice’s and Bob’s, so an investigator might suspect Claire of unethical or illegal behavior.
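One way Claire's defensive check might look is sketched below in Python. This is an assumption about what "watching for anomalous behavior" could mean, not a description of any real trading system: the function name, the venue labels, and the 1% tolerance are all hypothetical. The idea is to compare each venue's quote against the median across venues and ignore any venue that strays too far.

```python
from statistics import median


def trusted_quotes(quotes, tolerance=0.01):
    """Given a dict mapping venue name -> quoted price for one stock,
    return only the venues whose price is within `tolerance` (as a
    fraction) of the cross-venue median."""
    if len(quotes) < 3:
        # Too few venues to tell which one is the outlier; trust them all.
        return dict(quotes)
    mid = median(quotes.values())
    return {
        venue: price
        for venue, price in quotes.items()
        if abs(price - mid) <= tolerance * mid
    }


# Hypothetical snapshot: one venue is still quoting an old, higher price.
snapshot = {"NYSE": 40.50, "VenueA": 39.00, "VenueB": 39.02, "VenueC": 38.98}
usable = trusted_quotes(snapshot)
```

Here the stale quote is dropped and the three consistent venues are kept. Note that Claire needs no knowledge of the bug for this to work, which is exactly why her trading could resemble Alice's or Bob's from the outside. The check has limits, too: if most venues were wrong at once, the median would be wrong with them.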

Notice that even if there are no Alices or Bobs, but only virtuous Claires, the market might still have a flash crash and people might make a lot of money from it, even in the absence of a denial-of-service attack or indeed of any unethical behavior. The flood of quote requests that triggered the queue backup might have been caused by another bug somewhere, or by an unforeseen interaction between different systems. Only careful investigation will be able to untangle the causes and figure out who is to blame.

If the nanex analysis is at all correct, it has sobering implications. Financial markets are complex, and when we inject complex, buggy software into them, problems are likely to result. The May flash crash won’t be the last time a financial market gyrates due to software problems.