December 22, 2024

Source Code and Object Code

[This item is long and geeky. Sorry about that. I hope that at least some of you will find it interesting. The rest of you can skip right to the (slightly) pithier items below.]

When lawyers discuss software, they typically draw distinctions between source code and object code. These distinctions often fail to account for the nature of real code, and so are less useful than they seem. Indeed, they are often misleading.

Here’s an example of a source/object code distinction. Larry Lessig proposes that object code be subject to copyright (as it is today), but only if the associated source code is put in escrow, to be released when the copyright expires (a requirement that does not exist today).

The source/object code distinction assumes that code is developed in a particular way, which I’ll call the Standard Scenario for short. In the Standard Scenario, the programmer writes code in a textual form (“source code”), and this code is translated (by a program called a compiler) into another form (“object code”) which can be executed directly by a computer. The code is then distributed in object code form.

In the Standard Scenario, there are indeed two forms of code. Source code is human-readable but cannot be executed directly. Object code is not human-readable but can be executed by a computer. Much information is lost in the translation from source to object code, so the object code cannot be used to fully reconstruct the original source code.
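For readers who want to see the information loss for themselves, here is a minimal sketch. It uses Python's built-in compile() as a stand-in for a compiler. That is an assumption made for convenience: what Python produces is bytecode rather than object code in the Standard Scenario sense, but the effect on comments is similar. The two one-line programs and their variable names are made up for illustration.

```python
# Two versions of the same statement; only the first carries a comment
# explaining what the programmer had in mind.
COMMENTED = "total = price * quantity  # quantity is counted in cases of 12\n"
BARE = "total = price * quantity\n"

# Translate each version. The "<example>" file name is just a label.
translated_commented = compile(COMMENTED, "<example>", "exec")
translated_bare = compile(BARE, "<example>", "exec")

# co_code holds the raw instruction bytes of the translated form.
same_instructions = translated_commented.co_code == translated_bare.co_code
print(same_instructions)  # prints True: the comment left no trace
```

Nobody holding only the translated form can tell which of the two source texts it came from, so the comment cannot be recovered.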

The problem is that the Standard Scenario is becoming less and less standard.

As David Reed points out, sometimes there is only one form of code. In a scripting language like Perl, the programmer writes source code. However, rather than translating that code into object code, the code is fed directly to a program called an interpreter, which carries out the code’s instructions – but without ever translating the code into object code. So there is no object code, and the code is distributed and executed in the original source code form.
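To make the interpreter idea concrete, here is a toy sketch in Python. The four-instruction language (push, add, mul, print) and the run() function are invented for this illustration. Real interpreters such as Perl's are far more elaborate and do a good deal of internal processing, so treat this only as a picture of the basic arrangement: the program text is read and acted on as it stands, and no translated copy is ever produced or shipped.

```python
def run(source):
    """Carry out a toy stack-language program directly from its text."""
    stack = []
    output = []
    for line in source.splitlines():
        words = line.split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comments
        op = words[0]
        if op == "push":
            stack.append(int(words[1]))
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "print":
            output.append(stack[-1])  # record the value on top of the stack
        else:
            raise ValueError("unknown instruction: " + op)
    return output


# The program is distributed and executed in this one human-readable form.
PROGRAM = """\
# compute (2 + 3) * 4
push 2
push 3
add
push 4
mul
print
"""

print(run(PROGRAM))  # prints [20]
```

The only artifact that exists here is PROGRAM itself, which is both what a person would edit and what gets executed.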

What Reed doesn’t say is that sometimes there are more than two forms of code. In Java, for instance, a programmer writes code in the Java language. This code is translated by a compiler into a form called “bytecode” which, contrary to the Standard Scenario, is not object code, i.e. it cannot be executed directly by a computer. The code is distributed in bytecode form, and it can be executed in one of two ways: by feeding it directly to an interpreter, or by having the consumer translate it into object code. (The latter is now more common.) Thus, in Java there are (usually but not always) three forms of code: one for writing the program, one for distributing it, and a third one for executing it.
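Java needs a separate compiler and runtime to demonstrate, so here is a rough analogue in Python, which also has an intermediate bytecode form. This is a sketch under stated assumptions: it shows only the interpreter path, not translation to object code on the consumer's machine, and it uses the standard marshal module as a stand-in for a distribution format. The one-line program is made up for illustration.

```python
import marshal

# Form 1: the text the programmer reads and edits.
SOURCE = "result = 6 * 7\n"

# Form 2: bytecode, produced by a compilation step. The hardware cannot
# run this directly; it is meant for Python's own interpreter.
bytecode = compile(SOURCE, "<example>", "exec")

# Serialise the bytecode into plain bytes that could be shipped to a user.
distributed = marshal.dumps(bytecode)

# On the receiving side: load the bytes and execute them. The source text
# is never consulted.
loaded = marshal.loads(distributed)
namespace = {}
exec(loaded, namespace)

print(namespace["result"])  # prints 42
```

The shipped bytes are neither what the programmer edited nor something the processor can run unaided, which is the sense in which the distributed form is a third thing.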

But wait – it gets worse. Even object code can get translated into another form before being executed. For example, Intel’s popular Pentium II microprocessor, which is the heart of many PCs, takes the “object code” that it is given and translates it into another form called micro-op code. It is the micro-op code that actually gets executed by the processor. This translation happens right inside the microprocessor, and is done because the chip’s designers found it easier to execute the translated form of code than the original object code.

The upshot of all this is that the tidy assumptions made in many legal analyses often do not hold. Object code is not the only form of code that can be executed. Source code can sometimes be executed without an intervening translation to object code. Executable code can be human-readable. Code can be distributed in a form that is neither source code nor object code. Object code can be translated further. An analysis that simply says “Treat source code this way, and object code that way” is not complete.

This is not to say that the situation is hopeless. Sometimes you know you’re in the Standard Scenario, and none of these complications arise. Even if you’re in a different scenario, there are sensible distinctions to be drawn and sensible conclusions that can be reached. What you need is a rule that will apply to code in any of its myriad forms.

As an example, we can rewrite Lessig’s rule in a way that accounts for all of the different types of code. Where Lessig said, “You can copyright object code, as long as you escrow the source code,” we can say, “You can copyright any form of code, as long as you escrow that code in the form in which you customarily read and edited it.” This captures the spirit of Lessig’s rule, which is that once the copyright expires, anyone should be able to read and modify the code just as readily as the initial author could. (Lessig understands code pretty well – for a lawyer – so let’s give him the benefit of the doubt and assume that this is what he meant all along.)

Other legal rules might not fare so well. For example, a rule saying “source code is constitutionally protected speech, but executable code is not” would be incomplete and inconsistent. (It might also be wrong constitutionally, but let’s ignore that issue.) A more clearly stated rule might say “Code is constitutionally protected speech if it is human-readable,” which is essentially the same as saying “… if a human reader can extract meaning from it.” Now we have a rule that is complete and consistent, but we no longer have the illusion of a bright-line rule.

Misleading Term of the Week: "Content Owner"

Many discussions of copyright refer to “content owners.” The language of ownership is often misused in these contexts, for example by saying that Disney “owns” The Lion King, or by saying that I “own” the content on this site.

The simple fact is that I don’t own the content on this site – at least not in the same way that I own my car. All I own is the copyright on the content. The copyright gives me a certain limited bundle of rights, and leaves for you, the reader, certain other rights, whether I like it or not. Using the rhetoric of “content ownership” confuses the issue, by falsely implying that the copyright owners have more rights than the law really gives them.

(It’s relatively harmless to refer to “my book” or “my film,” as long as everybody understands that you’re not claiming ownership of the content but merely stating a relationship, just as you might refer to “my brother” or “my hometown” without implying that you own either one.)

Greece Bans Electronic Games

CNet reports that Greece has banned all electronic games, including ones that run on PCs or on mobile phones, apparently in an effort to crack down on gambling.

This is yet another example of the inflationary theory of censorship. A ban limited to gambling would be too hard to enforce, because there is no way to tell whether a person playing, say, a card game is playing it for real money. So the censorship expands to a larger boundary that is supposedly more defensible.

Sites Blocked In China

Jonathan Zittrain and Ben Edelman at Harvard have a site listing URLs that are blocked in China. In addition to some obvious sites (related to things like Chinese dissidents, the Taiwanese government, and Falun Gong), there are some curious sites on the block list, including the U.S. Federal Court system (uscourts.gov).

You can go to their site and try out any URL you like, to see if it is blocked in China.

Apple Uses DMCA Threat Against Competing Product

Declan McCullagh at news.com reports on Apple’s use of a DMCA threat to force a useful product off the market.

Apple’s iDVD application allows the user to burn DVDs – but only on Apple-brand drives. A DVD drive vendor called Other World Computing shipped its drives with a “DVD Enabler” program that modified iDVD so that it could burn DVDs on any FireWire-connected drive.

Apple was displeased, so it used various threats, including one based on the DMCA, to convince Other World to back down and yank DVD Enabler from the market. According to the story, the main reason for Other World’s quick backdown was a general desire to stay on Apple’s good side. But clearly Apple thought the DMCA threat would have some impact, or it wouldn’t have made the threat.

Apple’s use of the DMCA here has nothing to do with preventing copyright infringement, since Apple-brand drives can make infringing copies just as easily as other brands can. The real motive is to weaken competition in the market for Mac-compatible DVD drives.