January 11, 2025

Advice for New Graduate Students

[Ed Felten says: This is the time of year when professors offer advice to new students. My colleague Prof. Jennifer Rexford gave a great talk to a group of our incoming engineering Ph.D. students, about how to make the most of graduate school. Here’s what she said: ]

Those of you who know me know that I collect quotations as a hobby. (The short version of the story is that I moved around a lot as a kid. Quotations are small and very portable, making them a good hobby.) Anyway, two eminent scientists, Albert Einstein and Lewis Thomas, who were at Princeton in the 1930s, both have something interesting (and seemingly contradictory) to say about the role of the individual:

Albert Einstein (physics): “All that is valuable in human society depends upon the opportunity for development accorded the individual.”

Lewis Thomas (medicine, biology, Princeton class of ’33): “There is really no such creature as a single individual; he has no more life of his own than a cast-off cell marooned from the surface of your skin.”

These two quotations embody so much of what graduate school is all about.

Individual Development

First, graduate school is a highly individual experience. Compared to the somewhat anonymous experience of college — where you sit in large classes, do the same homeworks, and take the same tests with many other students — graduate school is highly personal. Nobody else is doing quite the same research you are doing (or at least you hope they are not), and you get direct (sometimes pointed) feedback on your individual work — from your advisor, from your peers, and from reviewers of the papers you submit and the talks you give. And when your work is good but not great, you don’t just take the A- and move on to the next assignment — you keep plugging away and get more feedback and, eventually, you nail it. This is an amazingly efficient way to learn, grow, and create great scholarship.

Yet, there is a downside. The critique of your work, however well-meaning and “good for you,” will sometimes feel relentless. It requires some toughening of the skin, and a delicate little dance to simultaneously be in love with your work (so you have the tenacity you need to always dig deeper) and yet have enough emotional distance to be able to take constructive criticism of how your work looks in its early stages. It’s not an easy balance to strike, and I’m sure all of us who do research still struggle with it. I know I do. This is one of the many ways in which grad school is as much an emotional challenge as it is an intellectual one.

Another important aspect of the “individual” in graduate school is to learn your research “taste.” You may not know it yet, but you are weird. You come to research problems with some peculiar sensibility that nobody else has. You are attracted to a certain kind of research problem — maybe a messy practical problem, or a sharply formulated (but very hard) theoretical problem, or something in between. You notice a certain kind of weakness or gap in other people’s research. You have a particular set of techniques or approaches to solving problems. Graduate school is a wonderful time to figure out what your “taste” is, so you can craft your own agenda for the technical problems you pursue in the years ahead.

So, then, graduate school really is the epitome of what Einstein called the “development of the individual.” And I hope during your time here, you get the kind of opportunities for individual development that you deserve. Experiences that will let you produce deeper scholarship that expands the base of knowledge in your fields, and become more accomplished at conveying new and sometimes complex ideas to others.

Part of a Group

Yet, for all of my blathering on about the individual, graduate school is also a collective experience. You are part of a research group, a department, a discipline, (for many of you) an engineering school, a graduate school, and student groups like GWISE.

I want to say a few words about your research group, because it is so important. Your officemates, and the other graduate students around you, are such an important part of your graduate school experience. Not only do they provide a sense of community, and a community that truly understands your experiences, though that is certainly important. But they also mentor you on topics small and large.

I had a great officemate, Jim, in graduate school. He took ten years to graduate, and had already been there seven years when I arrived. So, Jim knew everything about everything. He taught me an important lesson I value to this day — how to be efficient. He would sit at the next desk and admonishingly say, “Jen, I hear the sounds of repetitive keystrokes. Today you will learn Perl.” To be honest, it was kind of creepy at first, but Jim would watch out for me out of the corner of his eye. He taught me things that would save me time, leaving me with the time and energy I needed to tackle bigger and more interesting problems.

Your classmates will also provide wonderful moments of professional serendipity, random encounters over coffee or foosball that make you aware of a body of work you didn’t know about, or recognize a previously unappreciated connection between two disciplines. You may even become the match-maker for the faculty, bringing two professors together to collaborate because you see a connection in their research that they were unable to see. The chance encounters, the candid feedback on your research, the unplanned discussions about research taste and philosophy — these are all a great part of interacting with your group mates.

I must caution you, though, about an important enemy of this kind of informal interaction: the Internet. Okay, so my research focuses on the Internet, so it may seem strange for me to be so negative about it, but this is important so I’ll make an exception. The Internet makes it far too easy to work from home, or a cafe, or on the train, rather than in your office or lab with your peers. Your choice to work away from the office is, in fact, perfectly rational. Coming into the office has a defined cost, in terms of your time and (perhaps) having to get out of your pajamas and take a shower. And, all of this is in exchange for some vague, speculative benefit — that you might have a chance encounter that truly changes your research. And, frankly, in any one day, you probably won’t have a profound experience in your office, and your officemates may not even be in the same scholarly mood as you. But, I entreat you to go anyway.

And, I encourage you to have a broader sense of community with each other, whether in your departments, or the school of engineering, or in groups like this one. Not only for the professional serendipity — though that will happen. But for the friendship and support. Graduate school is fun but it is also hard, and sometimes frustrating, and having some balance in your life will make the whole experience more worthwhile.

In fact, for what it’s worth, I find the students in my group who are more engaged with other students and student groups often graduate sooner than the other students. They often are better at managing their time, working intensely and efficiently to leave space in their lives for their other pursuits. And, they are more comfortable reaching out to other students for help, whether for feedback on a paper or guidance on an analytical technique or a software tool. They know more about the peculiarities of the faculty, and how to work around them. And, for the students who are not native English speakers, the social interactions also have a side benefit of sharpening their English skills. Mastering a language is, frankly, pretty boring work. Socializing in English is a much more enjoyable way to learn the language than any formal study could ever be.

So, in closing, I do think that graduate school is an unusual experience, both highly individual (in your training and professional development) and highly collective (in how you are part of a research group, a discipline, and a larger community). I hope you find both aspects of your time here at Princeton rewarding, and that you also make time to give back to the next group of students who arrive at Princeton after you.

Understanding the HDCP Master Key Leak

On Monday, somebody posted online an array of numbers which purports to be the secret master key used by HDCP, a video encryption standard used in consumer electronics devices such as DVD players and TVs. I don’t know if the key is genuine, but let’s assume for the sake of discussion that it is. What does the leak imply for HDCP’s security? And what does the leak mean for the industry, and for consumers?

HDCP is used to protect high-def digital video signals “on the wire,” for example on the cable connecting your DVD player to your TV. HDCP is supposed to do two things: it encrypts the content so that it can’t be captured off the wire, and it allows each endpoint to verify that the other endpoint is an HDCP-licensed device. From a security standpoint, the key step in HDCP is the initial handshake, which establishes a shared secret key that will be used to encrypt communications between the two devices, and at the same time allows each device to verify that the other one is licensed.

As usual when crypto is involved, the starting point for understanding the system’s design is to think about the secret keys: how many there are, who knows them, and how they are used. HDCP has a single master key, which is supposed to be known only by the central HDCP authority. Each device has a public key, which isn’t a secret, and a private key, which only that device is supposed to know. There is a special key generation algorithm (“keygen” for short) that is used to generate private keys. Keygen uses the secret master key and a public key, to generate the unique private key that corresponds to that public key. Because keygen uses the secret master key, only the central authority can do keygen.

Each HDCP device (e.g., a DVD player) has baked into it a public key and the corresponding private key. To get those keys, the device’s manufacturer needs the help of the central authority, because only the central authority can do keygen to determine the device’s private key.

Now suppose that two devices, which we’ll call A and B, want to do a handshake. A sends its public key to B, and vice versa. Then each party combines its own private key with the other party’s public key, to get a shared secret key. This shared key is supposed to be secret—i.e., known only to A and B—because making the shared key requires having either A’s private key or B’s private key.

Note that A and B actually did different computations to get the shared secret. A combined A’s private key with B’s public key, while B combined B’s private key with A’s public key. If A and B did different computations, how do we know they ended up with the same value? The short answer is: because of the special mathematical properties of keygen. And the security of the scheme depends on this: if you have a private key that was made using keygen, then the HDCP handshake will “work” for you, in the sense that you’ll end up getting the same shared key as the party on the other end. But if you tried to use a random “private key” that you cooked up on your own, then the handshake won’t work: you’ll end up with a different shared key than the other device, so you won’t be able to talk to that device.
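To make the "special mathematical properties" a little more concrete, here is a toy sketch of one way such a scheme can work. It assumes a symmetric-matrix construction (often called a Blom-style scheme) in which the master key is a symmetric matrix, each public key is a short vector, and keygen is a matrix-vector product. The sizes, the modulus, and all the function names are illustrative assumptions, not details taken from the HDCP specification.

```python
import random

N = 4          # toy vector length (an assumption; a real system would be larger)
MOD = 2**16    # toy modulus for all arithmetic

def make_master_key(rng):
    """Build a random symmetric N x N matrix: master[i][j] == master[j][i]."""
    m = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i, N):
            m[i][j] = m[j][i] = rng.randrange(MOD)
    return m

def keygen(master, public):
    """Private key = master matrix times the public vector (mod MOD)."""
    return [sum(master[i][j] * public[j] for j in range(N)) % MOD
            for i in range(N)]

def shared_key(my_private, their_public):
    """Combine my private key with the other party's public key."""
    return sum(p * q for p, q in zip(my_private, their_public)) % MOD

rng = random.Random(0)
master = make_master_key(rng)          # known only to the central authority
pub_a, pub_b = [1, 0, 1, 1], [0, 1, 1, 0]
priv_a, priv_b = keygen(master, pub_a), keygen(master, pub_b)

key_at_a = shared_key(priv_a, pub_b)   # A's computation
key_at_b = shared_key(priv_b, pub_a)   # B's (different) computation

# A made-up "private key" that did not come from keygen.
fake_priv = [rng.randrange(MOD) for _ in range(N)]
key_fake = shared_key(fake_priv, pub_b)
```

In this sketch A and B always agree because the master matrix is symmetric, so the two computations are the same sum written in a different order. The made-up private key has no such relationship to the master matrix, so `key_fake` will almost certainly not match what B computes.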

Now we can understand the implications of the master key leaking. Anyone who knows the master key can do keygen, so the leak allows everyone to do keygen. And this destroys both of the security properties that HDCP is supposed to provide. HDCP encryption is no longer effective because an eavesdropper who sees the initial handshake can use keygen to determine the parties’ private keys, thereby allowing the eavesdropper to determine the encryption key that protects the communication. HDCP no longer guarantees that participating devices are licensed, because a maker of unlicensed devices can use keygen to create mathematically correct public/private key pairs. In short, HDCP is now a dead letter, as far as security is concerned.
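Both failures can be shown in a few lines. The sketch below again assumes a toy symmetric-matrix scheme with made-up sizes and hypothetical names (`leaked_master`, `pub_rogue`, and so on); it illustrates the argument and does not use the real HDCP parameters.

```python
import random

N, MOD = 4, 2**16   # toy sizes, chosen only for illustration

def keygen(master, public):
    """Private key = master matrix times the public vector (mod MOD)."""
    return [sum(master[i][j] * public[j] for j in range(N)) % MOD
            for i in range(N)]

def shared_key(private, other_public):
    """Combine one party's private key with the other party's public key."""
    return sum(p * q for p, q in zip(private, other_public)) % MOD

# The (toy) master key, which in this story is now public knowledge.
rng = random.Random(1)
leaked_master = [[0] * N for _ in range(N)]
for i in range(N):
    for j in range(i, N):
        leaked_master[i][j] = leaked_master[j][i] = rng.randrange(MOD)

# Two licensed devices; their private keys came from the central authority.
pub_a, pub_b = [1, 1, 0, 0], [0, 1, 0, 1]
priv_a = keygen(leaked_master, pub_a)
priv_b = keygen(leaked_master, pub_b)
real_key = shared_key(priv_a, pub_b)

# Eavesdropper: sees only pub_a and pub_b on the wire, but can now run keygen.
eavesdropped_key = shared_key(keygen(leaked_master, pub_a), pub_b)

# Unlicensed maker: picks a public key and runs keygen without the authority.
pub_rogue = [1, 0, 1, 0]
priv_rogue = keygen(leaked_master, pub_rogue)
rogue_side = shared_key(priv_rogue, pub_a)    # what the rogue device computes
device_side = shared_key(priv_a, pub_rogue)   # what licensed device A computes
```

Here `eavesdropped_key` equals `real_key`, so the encryption protects nothing, and `rogue_side` equals `device_side`, so the licensed device has no way to tell that it is talking to an unlicensed one.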

(It has been a dead letter, from a theoretical standpoint, for nearly a decade. A 2001 paper by Crosby et al. explained how the master secret could be reconstructed given a modest number of public/private key pairs. What Crosby predicted—a total defeat of HDCP—has now apparently come to pass.)
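The idea behind that kind of reconstruction can be sketched as a linear algebra problem. The toy below assumes a symmetric-matrix scheme in which each private key is the master matrix times the public vector, and it works modulo a prime so that ordinary matrix inversion applies. Those choices, along with the sizes and names, are simplifying assumptions for illustration and are not the paper's exact procedure.

```python
import random

P = 65537   # toy prime modulus (an assumption that keeps the algebra simple)
N = 3       # toy key length

def mat_mul(a, b):
    """Multiply two matrices mod P."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) % P
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_inv(m):
    """Invert a square matrix mod P by Gauss-Jordan elimination."""
    n = len(m)
    aug = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] % P != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)          # modular inverse (Python 3.8+)
        aug[col] = [(x * inv) % P for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(x - f * y) % P for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# The secret the attacker never sees directly: a random symmetric matrix.
rng = random.Random(2)
master = [[0] * N for _ in range(N)]
for i in range(N):
    for j in range(i, N):
        master[i][j] = master[j][i] = rng.randrange(P)

# N devices with linearly independent public keys, and their private keys.
publics = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
privates = [[sum(master[i][j] * pub[j] for j in range(N)) % P for i in range(N)]
            for pub in publics]

# Each private key is (public vector) x (master), so stacking the pairs gives
# privates = publics x master, and the master follows by inverting publics.
recovered = mat_mul(mat_inv(publics), privates)
```

With as many pairs as the key has entries, and public keys that are linearly independent, `recovered` equals `master` in this toy setting.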

The impact of HDCP’s failure on consumers will probably be minor. The main practical effect of HDCP has been to create one more way in which your electronics could fail to work properly with your TV. This is unlikely to change. Mainstream electronics makers will probably continue to take HDCP licenses and to use HDCP as they are now. There might be some differences at the margin, where manufacturers feel they can take a few more liberties to make things work for their customers. HDCP has been less a security system than a tool for shaping the consumer electronics market, and that is unlikely to change.

Why did anybody believe Haystack?

Haystack, a hyped technology that claimed to help political dissidents hide their Internet traffic from their governments, has been pulled by its promoters after independent researchers got a chance to study it and found severe problems.

This should come as a surprise to nobody. Haystack exhibited the warning signs of security snake oil: the flamboyant, self-promoting front man; the extravagant security claims; the super-sophisticated secret formula that cannot be disclosed; the avoidance of independent evaluation. What’s most interesting to me is that many in the media, and some in Washington, believed the Haystack hype, despite the apparent lack of evidence that Haystack would actually protect dissidents.

Now come the recriminations.

Jillian York summarizes the depressing line of adulatory press stories about Haystack and its front man, Austin Heap.

Evgeny Morozov at Foreign Affairs, who has been skeptical of Haystack from the beginning, calls several Internet commentators (Zittrain, Palfrey, and Zuckerman) “irresponsible” for failing to criticize Haystack earlier. Certainly, Z, P, and Z could have raised questions about the rush to hype Haystack. But the tech policy world is brimming with overhyped claims, and it’s too much to expect pundits to denounce them all. Furthermore, although Z, P, and Z know a lot about the Internet, they don’t have the expertise to evaluate the technical question of whether Haystack users can be tracked — even assuming the evidence had been available.

Nancy Scola, at TechPresident, offers a more depressing take, implying that it’s virtually impossible for reporters to cover technology responsibly.

It takes real work for reporters and editors to vet tech stories; it’s not enough to fact check quotes, figures, and events. Even “seeing a copy [of the product],” as York puts it, isn’t enough. Projects like Haystack need to be checked-out by technologists in the know, and I’d argue that before the recent rise of techno-advocates like, say, Clay Johnson or Tom Lee, there weren’t obvious knowledgeable sources for even dedicated reporters to call to help them make sense of something like Haystack, on deadline and in English.

Note the weasel-word “obvious” in the last sentence — it’s not that qualified experts don’t exist, it’s just that, in Scola’s take, reporters can’t be bothered to find out who they are.

I don’t think things are as bad as Scola implies. We need to remember that the majority of tech reporters didn’t hype Haystack. Non-expert reporters should have known to be wary about Haystack, just based on healthy journalistic skepticism about bold claims made without evidence. I’ll bet that many of the more savvy reporters shied away from Haystack stories for just this reason. The problem is that the few who did not got undeserved attention.

[Update (Tue 14 Sept 2010): Nancy Scola responds, saying that her point was that reporters’ incentives are to avoid checking up too much on enticing-if-true stories such as Haystack. Fair enough. I didn’t mean to imply that she condoned this state of affairs, just that she was pointing out its existence.]

A Software License Agreement Takes it On the Chin

[Update: This post was featured on Slashdot.]

[Update: There are two discrete ways of asking whether a court decision is “correct.” The first is to ask: is the law being applied the same way here as it has been applied in other cases? We can call this first question the “legal question.” The second is to ask: what is the relevant social or policy goal from a normative standpoint (say, technological progress) and does the court decision advance that goal? We can call this second question “the policy question.” Eric Felten, who addressed my August 31st post at length in his article in the Wall Street Journal (Video Game Tort: You Made Me Play You), is clearly addressing the policy question. He describes “[t]he proliferation of annoying and obnoxious license agreements” as having great social utility because they prevent customers from “abusing” software companies. What Mr. Felten fails to grasp, however, is that I have not weighed in on the policy question at all. My point is much simpler. My point addressed only the legal question and set forth the (apparently controversial) proposition that courts should be faithful to the law. In the case of EULAs, that means applying the same standards, the same doctrines, and the same rules as the courts have applied to analogous consumer contracts in the brick and mortar world. Is that too much to ask? Apparently it was not too much to ask of the federal court in Smallwood, because that was exactly how the court proceeded. Mr. Felten’s only discussion of why the Smallwood decision may be legally incorrect involves the question of whether or not “physical” injury occurred. Although this is an interesting factual question with respect to the plaintiff’s “Negligent Infliction of Emotional Distress” claim (count 7), the court found it irrelevant with respect to the plain-old negligence and gross negligence claims (counts 4 and 5). These were the counts that my original blog post primarily addressed. It’s hard to parse Prof. Zittrain’s precise legal reasoning from the quotes in Mr. Felten’s article, but it’s possible that the two of us would agree on the law. In any event, Mr. Felten is content to basically bypass the legal questions and merely fulminate–superficially, I might add–on the policy question.]

The case law governing software license agreements has evolved dramatically over the past 20 years as cataloged by Doug Phillips in his book The Software License Unveiled. One of the recent trends in this evolution, as correctly noted by Phillips, is that courts will often honor contractual limitations of liability which appear in these agreements, which seek to insulate the software company from various claims and categories of damages, notwithstanding the lack of bargaining power on the part of the user. The case law has been animated, in large part, by the normative economics of Judges associated with the University of Chicago. Certain courts, as a result, could be fairly criticized as being institutionally hostile to the user public at large. Phillips notes that a New York appellate court, in Moore v. Microsoft Corp., 741 N.Y.S.2d 91 (N.Y. App. Div. 2002), went so far as to hold that a contractual limitation of liability barred pursuit of claims for deceptive trade practices. Although the general rule is that deceit-based claims, as well as intentional torts, cannot be contractually waived in advance, there are various doctrines, exceptions, and findings that a court might use (or misuse) to sidestep the general rule. Such rulings are unsurprising at this point, because the user, as chronicled by Phillips, has been dying a slow death under the decisional law, with software license agreements routinely interpreted in favor of software companies on any number of issues.

It was against this backdrop that, on August 4, 2010, a software company seeking to use a contractual limitation of liability as a basis to dismiss various tort claims, met with stunning defeat. The U.S. District Court for the District of Hawaii ruled that the plaintiff’s gross negligence claims could proceed against the software company and that the contractual limitation of liability did not foreclose a potential recovery of punitive damages based on such claims. Furthermore, the matter remains in federal court in Hawaii notwithstanding a forum selection clause (section 15 of the User Agreement) in which the user apparently agreed “that any action or proceeding instituted under this Agreement shall be brought only in State courts of Travis County, State of Texas.”

The case is Smallwood v. NCsoft Corp., and involved the massively multiplayer, subscription-based online fantasy role-playing game “Lineage II.” The plaintiff, a subscriber, alleged that the software company failed to warn of the “danger of psychological dependence or addiction from continued play” and that he had suffered physically from an addiction to the game. The plaintiff reportedly played Lineage II for 20,000 hours from 2004 through 2009. (Is there any higher accolade for a gaming company?) The plaintiff also alleged that, in September of 2009, he was “locked out” and “banned” from the game. The plaintiff claimed that the software company had told him he was banned “for engaging in an elaborate scheme to create real money transfers.” The plaintiff, in his Second Amended Complaint, couched his claims against the software company in terms of 8 separate counts: (1) misrepresentation/deceit, (2) unfair and deceptive trade practices, (3) defamation/libel/slander, (4) negligence, (5) gross negligence, (6) intentional infliction of emotional distress, (7) negligent infliction of emotional distress and (8) punitive damages.

The software company undertook to stop the lawsuit dead in its tracks and filed a motion to dismiss all counts. The defendants argued, among other things, that Section 12 of the User Agreement, entitled “Limitation of Liability,” foreclosed essentially any recovery. The provision, which is common in the industry, purported to cap the amount of the software company’s liability at the amount of the user’s account fees, the price of additional features, or the amount paid by the user to the software company in the preceding six months, whichever was less. The provision also stated that it barred incidental, consequential, and punitive damages:

12. Limitation of Liability
* * *
IN NO EVENT SHALL NC INTERACTIVE . . . BE LIABLE TO YOU OR TO ANY
THIRD PARTY FOR ANY SPECIAL, INCIDENTAL, CONSEQUENTIAL,
PUNITIVE OR EXEMPLARY DAMAGES . . . REGARDLESS OF THE THEORY
OF LIABILITY (INCLUDING CONTRACT, NEGLIGENCE, OR STRICT
LIABILITY) ARISING OUT OF OR IN CONNECTION WITH THE SERVICE,
THE SOFTWARE, YOUR ACCOUNT OR THIS AGREEMENT WHICH MAY BE
INCURRED BY YOU . . . .

The Court considered the parties’ arguments and then penned a whopping 49-page decision granting the software company’s motion to dismiss, but only partially. The Court determined that the User Agreement contained a valid “choice of law” provision stating that Texas law would govern the interpretation of the contract. However, the Court then ruled that neither Texas nor Hawaii law permitted people to waive in advance their ability to make gross negligence claims. The plaintiff’s remaining negligence claims survived as well. The claims based on gross negligence remained viable for the full range of tort damages, including punitive damages, whereas the straight-up negligence-based claims would be subject to the contractually agreed on limitation on damages.

The fact that the gross negligence claims survived is significant in and of itself, but in reality having the right to sue for “gross negligence” is the functional equivalent of having the right to sue for straight-up negligence as well—thus radically broadening the scope of claims that (according to the court) cannot be waived in a User Agreement. Although it is true that negligence and gross negligence differ in theory (“negligence” = breach of the duty of ordinary care in the circumstances; “gross negligence” = conduct much worse than negligence), it is nearly impossible to pin down with precision the dividing line between the two concepts. Interestingly, Wikipedia notes that the Brits broadly distrust the concept of gross negligence and that, as far back as 1843, in Wilson v. Brett, Baron Rolfe “could see no difference between negligence and gross negligence; that it was the same thing, with the addition of a vituperative epithet.” True indeed.

The lack of a clear dividing line is an important tactical consideration. A plaintiff often pleads a single set of facts as supporting claims for both negligence and gross negligence and—in the absence of a contractual limitation on liability—expects both claims to survive a motion to dismiss, survive a motion for summary judgment, and make it to a jury. When the contractual limitation of liability is introduced into the mix, and the plaintiff is forced to give up the pure negligence claims, it hardly matters: the gross negligence claims—based on the exact same facts—cannot be waived (at least under Texas and Hawaii law) and therefore survive, at least up to the point of trial. Courts will not decide genuine factual disputes—that is the function of the jury. This is usually enough for the plaintiff, since the overwhelming majority of cases settle. Thus, a gross negligence claim, in most situations, is the functional equivalent of a negligence claim. For these reasons, the Smallwood decision, if it stands, may achieve some lasting significance in the software license wars.

Assessing PACER's Access Barriers

The U.S. Courts recently conducted a year-long assessment of their Electronic Public Access program which included a survey of PACER users. While the results of the assessment haven’t been formally published, the Third Branch Newsletter has an interview with Bankruptcy Judge J. Rich Leonard that discusses a few high-level findings of the survey. Judge Leonard has been heavily involved in shaping the evolution of PACER since its inception twenty years ago and continues to lead today.

The survey covered a wide range of PACER users—“the courts, the media, litigants, attorneys, researchers, and bulk data collectors”—and Judge Leonard claims they found “a remarkably high level of satisfaction”: around 80% of those surveyed were “satisfied” or “very satisfied” with the service.

If we compare public access before we had PACER to where we are now, there is clearly much success to celebrate. But the key question is not only whether current users are satisfied with the service but also whether PACER is reaching its entire audience of potential users. Are there artificial obstacles preventing potential PACER users—who admittedly would be difficult to poll—from using the service? The satisfaction statistic may be fine at face value, assuming that a representative sample of users were polled, but it could be misleading if it’s being used to gauge the overall success of PACER as a public access system.

One indicator of obstacles may be another statistic cited by Judge Leonard: “about 45% of PACER users also use CM/ECF,” the Courts’ electronic case management and filing system. To put it another way, nearly half of all PACER users are currently attorneys who practice federal law.

That number seems inordinately high to me and suggests that significant barriers to public access may exist. In particular, account registration requires all users to submit a valid credit card for billing (or alternatively a valid home address to receive log-in credentials and billing statements by mail). Even if users’ credit cards are never charged, this registration hurdle may already turn away many potential PACER users at the door.

The other barrier is obviously the cost itself. With a few exceptions, users are forced to pay a fee for each document they download, at a metered rate of eight cents per page. Judge Leonard asserts that “surprisingly, cost ranked way down” in the survey and that “most people thought they paid a fair price for what they got.”

But this doesn’t necessarily imply that cost isn’t a major impediment to access. It may just be that those surveyed—primarily lawyers—simply pass the cost of using PACER down to their clients and never bear the cost themselves. For the rest of PACER users who don’t have that luxury, the high cost of access can completely rule out certain kinds of legal research, or cause users to significantly ration and monitor their usage (as is the case even in the vast majority of our nation’s law libraries), or wholly deter users from ever using the service.

Judge Leonard rightly recognizes that it’s Congress that has authorized the collection of user fees, rather than using general taxpayer money, to fund the electronic public access program. But I wish the Courts would at least acknowledge that moving away from a fee-based model, to a system funded by general appropriations, would strengthen our judicial process and get us closer to securing each citizen’s right to equal protection under the law.

Rather than downplaying the barriers to public access, the Courts should work with Congress to establish a way forward to support a public access system that is truly open. They should study and report on the extent to which Congress already funds PACER indirectly, through Executive and Legislative branch PACER fee payments to the Judiciary, and re-appropriate those funds directly. If there is a funding shortfall, and I assume there will be, they should study the various options for closing that gap, such as additional direct appropriations or a slight increase in certain filing fees.

With our other two branches of government making great strides in openness and transparency with the help of technology, the Courts similarly need to transition away from a one-size-fits-all approach to information dissemination. Public access to the courts will be fundamentally transformed by a vigorous culture of civic innovation around federal court documents, and this will only happen if the Courts confront today’s access barriers head-on and break them down.

(Thanks to Daniel Schuman for pointing me to the original article.)