January 10, 2025

More on Programs vs. Data

Karl-Friedrich Lenz reacts to my previous posting on how to distinguish programs from data, by insisting on the importance of having a simple definition of “program.” He is right about the value of a simple definition. And he is right to observe that my previous posting doesn’t argue against the existence of such a definition, although it does imply that the definition might be difficult to apply in practice. Lenz suggests a simple definition: “If the object instructs a computer to do something, it is a program. The remaining cases are data.”

Maximillian Dornseif weighs in with a thought-provoking example to show the difficulty of applying this seemingly simple definition.

Another troublesome case arises in logic programming, a style of programming that is implemented by programming languages like Prolog. In logic programming, you don’t tell the computer what to do or even how to do it. Instead, you specify the attributes of an object you want, and the computer figures out how to find or construct such an object. You state facts and relationships, and then you ask a question. At no point do you tell the computer what steps to execute or how to go about doing anything; that is all handled by a pre-packaged program called the Prolog Interpreter. Computer scientists talk about Prolog programs, but a Prolog program doesn’t seem to meet Lenz’s definition.
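To make the "facts plus a question" idea concrete, here is a minimal Python sketch. It is not Prolog and not how a real Prolog interpreter works; the names FACTS, match, and solve and the "?Name" convention for variables are all invented for illustration. The point is only that the facts and the query contain no steps to execute, and the generic solve function plays the role of the interpreter.

```python
# Facts are plain tuples. Nothing here says what steps to execute.
FACTS = [
    ("parent", "alice", "bob"),
    ("parent", "bob", "carol"),
    ("parent", "bob", "dave"),
]

def is_var(term):
    # Assumed convention for this sketch: variables start with "?".
    return term.startswith("?")

def match(pattern, fact, bindings):
    """Try to extend bindings so that pattern equals fact; None on failure."""
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            # Bind the variable, or check it against its earlier binding.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new

def solve(goals, facts, bindings=None):
    """Yield every set of variable bindings that satisfies all goals."""
    bindings = {} if bindings is None else bindings
    if not goals:
        yield bindings
        return
    first, rest = goals[0], goals[1:]
    for fact in facts:
        extended = match(first, fact, bindings)
        if extended is not None:
            yield from solve(rest, facts, extended)

# The question: "who are alice's grandchildren?"
query = [("parent", "alice", "?Child"), ("parent", "?Child", "?Grandchild")]
grandchildren = [b["?Grandchild"] for b in solve(query, FACTS)]
print(grandchildren)  # ['carol', 'dave']
```

All of the "how" lives in solve, which knows nothing about parents or grandchildren. Whether FACTS and query together count as a program is exactly the question at issue.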

Now, we might try to stretch Lenz’s definition by saying that the Prolog program, even if it is only a listing of facts, does “instruct” the computer to do something, because the programmer wrote it knowing that it would cause the computer to behave in a certain way. But such a definition is too broad. A Word document is written with the purpose of causing the computer to do something, but that doesn’t make it a program. Besides, it seems unsatisfactory to call something a program or not based on the state of mind of its author.

Still, I’m not giving up on the quest for a simple definition.

Standards, or Collusion?

John T. Mitchell at InteractionLaw writes about the potential antitrust implications of backroom deals between copyright owners and technology makers.

If a copyright holder were to agree with the manufacturers of the systems for making lawful copies and of the systems for playing them to eliminate all trade in lawful copies unless each transaction (each resale, trade, gift or rental) has the consent of the copyright holder, there is of course no doubt that such agreement would constitute a naked restraint of trade. If, instead, the copyright holder agreed with the manufacturers of copying and playing technologies to deploy a system which simply obeys the instructions of the copyright holder (including instructions which have the purpose and effect of eliminating the resale, trade, gift or rental of the copy, or of enlarging the copyright monopoly by charging for private performances), then the agreement to have technology automatically do the deed is certainly no better than the first. It is akin to a company saying to the prospective co-conspirator: “Listen, I can’t agree with you to do what you are asking because my lawyers tell me it would be illegal, so what I’ll do is program my machine to do what you tell it to do, but just don’t tell me.”

I understand that antitrust law is suspicious of backroom deals in which companies agree not to produce certain otherwise legal products, but that there are some exceptions for standard-setting. Perhaps that is why the various inter-industry groups try to dress up their agreements as “standards.” As I have written before, most of these agreements don’t look at all like technical standards, and to label them as such is misleading.

True technical standards are voluntary, and allow products to be more functional by giving them a way to interoperate (i.e., to work together). Most of the DRM “standards” are mandatory, and make products less functional by banning some kinds of interoperation.

Whether these agreements violate antitrust law is beyond my expertise, but I do know that a reasonable exemption for technical standard-setting ought not to apply to them.

Programs vs. Data

Maximillian Dornseif asks how one can draw the line between programs and data. This is important because the law often treats the two differently. He concludes that no clear line can be drawn.

This is a more difficult question than non-techies might think. A key attribute of the “von Neumann architecture” of today’s computers is that programs and data are in the same memory, so the computer can create, store, and process programs using the same facilities that are used for data. (Princetonians like to point out that this approach, named for a Princeton person, drove out an inferior alternative known as the “Harvard architecture.”)
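A toy stored-program machine shows what "programs and data are in the same memory" means in practice. The sketch below is a made-up machine, not any real instruction set: the opcodes HALT, LOAD, ADD, and STORE and the two-cell instruction layout are assumptions chosen to keep the example short.

```python
# Hypothetical opcodes for a toy machine; each instruction is two cells:
# an opcode followed by a memory address.
HALT, LOAD, ADD, STORE = 0, 1, 2, 3

def run(memory):
    """Execute the instructions stored in memory, starting at cell 0."""
    acc = 0  # single accumulator register
    pc = 0   # program counter: index of the next instruction
    while True:
        op, arg = memory[pc], memory[pc + 1]
        pc += 2
        if op == HALT:
            return memory
        elif op == LOAD:
            acc = memory[arg]
        elif op == ADD:
            acc += memory[arg]
        elif op == STORE:
            memory[arg] = acc
        else:
            raise ValueError(f"unknown opcode {op}")

# One flat list of integers holds both the instructions and the numbers
# they operate on.
memory = [
    LOAD, 8,    # cells 0-1: acc = memory[8]
    ADD, 9,     # cells 2-3: acc += memory[9]
    STORE, 10,  # cells 4-5: memory[10] = acc
    HALT, 0,    # cells 6-7: stop
    2, 3, 0,    # cells 8-10: two inputs and a slot for the result
]
run(memory)
print(memory[10])  # 5
```

Nothing in the list marks cells 0 through 7 as "program" and cells 8 through 10 as "data". A STORE aimed at cell 3 would rewrite an instruction just as easily as a STORE aimed at cell 10 writes a result.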

Some cases are easy. The English text of this paragraph is data. Microsoft Word, considered as a whole, is a program. But some cases are trickier. What about the formula typed into a cell of a spreadsheet? I would call that a program. What about the commands you type in your computer’s command-line interface? Also a program.

If Microsoft Word is a program, what about a Word document? It’s mostly data, but it may contain programs, in the form of Word macros. You can’t tell, just by double-clicking a Word document, whether it contains macros.

Adobe PostScript documents are a really interesting case, too. PostScript is the predecessor to the PDF format. PostScript describes the layout of a page by giving a computer program for drawing the page. The program might contain pieces like, “move up one inch, then draw a 3-pixel-wide, quarter-inch-long line to the right” or “move 0.1 inches to the right, then display a capital ‘A’ in 12-point Times-Roman font.” PostScript does more than this. It really is a full-fledged programming language, letting you define new commands and everything.

People did use the programming features of PostScript. For example, a calendar-printing program would do computations to figure out whether it was a leap year, and which day of the week each month started on. If you were dedicated enough, you could write a PostScript program to balance your checkbook.
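For a sense of the computation such a calendar document has to carry out, here is a rough sketch in Python rather than PostScript. The leap-year test is the standard Gregorian rule. The weekday formula is Sakamoto's method, swapped in here because it is compact; it is not necessarily what any actual PostScript calendar used, and the function names are invented for this example.

```python
def is_leap_year(year):
    """Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# Per-month offsets used by Sakamoto's day-of-week method.
_MONTH_OFFSETS = [0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4]

def first_weekday(year, month):
    """Weekday of the 1st of the month: 0 = Sunday, ..., 6 = Saturday."""
    if month < 3:
        year -= 1  # January and February count as part of the previous year
    return (year + year // 4 - year // 100 + year // 400
            + _MONTH_OFFSETS[month - 1] + 1) % 7

DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]

print(is_leap_year(1900), is_leap_year(2000))  # False True
print(DAY_NAMES[first_weekday(2000, 1)])       # Saturday
print(DAY_NAMES[first_weekday(2024, 2)])       # Thursday
```

A page that runs logic like this before drawing anything is hard to describe as passive data.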

People tend to think of PostScript documents like they think of PDF documents, as being passive data displayed on a page – and that is what they look like. But under the covers, a PostScript page is a computer program. A full account of PostScript requires that we consider PostScript documents to be both data and program. Calling them one or the other is misleading.

The lesson of this is that it is an oversimplification to say that an object must be either data or program. Like a Word document, a single object can contain both program and data. Like a PostScript document, it can be both. A naive “I know it when I see it” approach to distinguishing programs from data will not be accurate.

Wacky Biometrics

I heard a presentation today by an expert on biometric security devices. He mentioned two new biometric devices under development. The first one uses body odor, detecting the unique combination of chemicals emitted by your body. The second one fits on a chair; you sit on it and it measures the unique shape and weight distribution of your rear end. What will they think of next?

Man vs. Machine

Chess whiz Garry Kasparov has started another match against an electronic opponent. Much has been made of the man vs. machine battle, with right-thinking humanists everywhere lining up on Kasparov’s side, supporting human intellect and determination against the cold, mechanical logic of the computer.

I’m rooting for the machine.

Kasparov’s performance at the chessboard is awe-inspiring – a true triumph of the intellect. You can’t help but admire what he represents.

But if you know much of anything about his opponent, you’ll realize that it too is a monument to human achievement. The computer, Deep Junior, was built and programmed by people, not machines. People, not machines, figured out how to make a computer play brilliant chess despite the computer’s pathetic inability to mimic Kasparov’s brain. Deep Junior is the culmination of decades of work by an army of anonymous engineers and researchers, each contributing a few brilliant ideas to the technological edifice that made Deep Junior possible. To me, that story is more exciting than Kasparov’s talent, and just as human.

The standard knock on chess computers is that they are brainless and succeed only by brute force. As a one-time computer chess researcher, I can assure you that that image is misleading. Computers lack Kasparov’s intuition about chess, so they have to look far into the possible sequences of moves and countermoves. But blind exploration of all possible move sequences fails miserably due to the exponentially large number of possibilities. Instead, the best computer chess players are fantastically clever about where to apply their limited knowledge, about which move sequences to explore and in which order to explore them. The authors of these programs have collectively invented a new way to think about chess, one that focuses not on seeing as deeply as Kasparov but on knowing where to look. Considered on its own terms, this is at least as impressive as what Kasparov has done.
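One standard textbook technique in this spirit is alpha-beta pruning, which skips move sequences that cannot change the final choice. The sketch below is only an illustration on a tiny hand-made game tree; nothing here is taken from Deep Junior, whose internals the post does not describe. The trees, the function names, and the leaf counter are all assumptions made for the example.

```python
import math

def minimax(node, maximizing, counter):
    """Blind search: evaluate every leaf of the tree."""
    if isinstance(node, int):
        counter[0] += 1  # count each position actually evaluated
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, maximizing, counter, alpha=-math.inf, beta=math.inf):
    """Same result as minimax, but stops exploring hopeless branches."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, counter, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent would never allow this line
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, counter, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break  # we already have a better option elsewhere
    return best

def count_leaves(search, tree):
    """Run a search from the root and return (value, leaves evaluated)."""
    counter = [0]
    value = search(tree, True, counter)
    return value, counter[0]

# Inner lists are positions with moves available; integers are leaf scores.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
# Same positions, but the last branch tries its strongest reply first.
REORDERED = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]

print(count_leaves(minimax, TREE))         # (3, 9)
print(count_leaves(alphabeta, TREE))       # (3, 7)
print(count_leaves(alphabeta, REORDERED))  # (3, 5)
```

All three searches agree on the answer, but the pruned search looks at fewer positions, and it looks at fewer still when the most promising replies are tried first. On a toy tree the savings are small; in a game with an exponentially large number of move sequences, deciding which sequences to explore and in which order is where the cleverness goes.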

If you knew the people who have worked all of this out, and if you had seen their struggle, you might just root for Deep Junior too.