October 15, 2024

Apple's File Labeling: An Effective Anticopying Tool?

Recently it was revealed that Apple’s new DRM-free iTunes tracks come with the buyer’s name encoded in their headers. Randy Picker suggested that this might be designed to deter copying – if you redistribute a file you bought, your name would be all over it. It would be easy for Apple, or a copyright owner, to identify the culprit. Or so the theory goes.

Fred von Lohmann responded, suggesting that Apple should have encrypted the information, to protect privacy while still allowing Apple to identify the original buyer if necessary. Randy responded that there was a benefit to letting third parties do enforcement.

More interesting than the lack of encryption is the apparent lack of integrity checks on the data. This makes it pretty easy to change the name in a file. Fred predicts that somebody will make a tool for changing the name to “Steve Jobs” or something. Worse yet, it would be easy to change the data in a file to frame an innocent person – which makes the name information pretty much useless for enforcement.

If you’re not a crypto person, you may not realize that there are different tools for keeping information secret than for detecting tampering – in the lingo, different tools for ensuring confidentiality than for ensuring integrity.

[UPDATE (June 7): I originally wrote that Apple had apparently not put integrity checks in the files. That now appears to be wrong, so I have rewritten this post a bit.]

Apple apparently used crypto to protect the integrity of the data. Done right, this would let Apple detect whether the name information in a file was accurate. (You might worry that somebody could transplant the name header from one file to another, but proper crypto will detect that.) Whether to use this kind of integrity check is a separate question from whether to encrypt the information – you can do either, or both, or neither.
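To make the transplant point concrete, here is a minimal sketch of an integrity tag that binds a buyer's name to one particular file. It is not Apple's scheme, whose details are not public in this post. It uses HMAC from the Python standard library, which is a symmetric stand-in: only the holder of the secret key can verify a tag. The key, the function names `make_tag` and `verify_tag`, and the sample data are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret held only by the seller; not a real Apple key.
SECRET_KEY = b"example-secret-known-only-to-the-seller"


def make_tag(buyer_name: str, audio: bytes) -> bytes:
    """Compute a tag that covers both the buyer's name and the audio payload."""
    audio_digest = hashlib.sha256(audio).digest()
    name_bytes = buyer_name.encode("utf-8")
    # Length-prefix the name so the name/audio boundary is unambiguous.
    message = len(name_bytes).to_bytes(4, "big") + name_bytes + audio_digest
    return hmac.new(SECRET_KEY, message, hashlib.sha256).digest()


def verify_tag(buyer_name: str, audio: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(buyer_name, audio), tag)


# Made-up sample data: two different tracks.
track_one = b"...audio bytes of the track sold to Alice..."
track_two = b"...audio bytes of some other track..."
alice_tag = make_tag("Alice Example", track_one)

untouched_ok = verify_tag("Alice Example", track_one, alice_tag)
renamed_ok = verify_tag("Steve Jobs", track_one, alice_tag)
transplanted_ok = verify_tag("Alice Example", track_two, alice_tag)
```

Because the tag covers a digest of the audio as well as the name, editing the name fails verification, and so does moving the name and tag onto a different track. Nothing here hides the name; confidentiality would be a separate step.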

From a security standpoint, the best way to guarantee integrity in this case is to digitally sign the name data, using a key known only to Apple. There’s a separate key used for verifying that the data hasn’t been modified. Apple could choose to publish this verification key if they wanted to let third parties verify the name information in files.

But there’s another problem – and a pretty big one. All a digital signature can do is verify that a file is the same one that was sold to a particular customer. If a file is swiped from a customer’s machine and then distributed, you’ll know where the file came from but you won’t know who is at fault. This scenario is very plausible, given that as many as 10% of the machines on the Net contain bot software that could easily be directed to swipe iTunes files.

Which brings us to the usual problem with systems that try to label files and punish people whose labels appear on infringing files. If these people are punished severely, the result will be unfair and no prudent person will buy and keep the labeled files. If punishments are mild, then users might be willing to distribute their own files and claim innocence if they’re caught. It’s unlikely that we could reliably tell the difference between a scofflaw user and one victimized by malware, so there seems to be no escape from this problem.

Why So Much Attention to "What's a Website?" Judge?

One of the benefits of talking to the press is that reporters often ask thought-provoking questions. Recently Noam Cohen, a New York Times columnist, called and asked me why the Net community gets so excited when a public figure professes ignorance about the Net. It’s natural for people to chuckle at Ted “Tubes” Stevens or George “Internets” Bush; but why devote so much e-ink to them? This was the topic of Mr. Cohen’s latest column, which quotes part of our conversation.

The latest victim of Net outrage was a British high court judge, Peter Openshaw, who reportedly said during a trial, “The trouble is, I don’t understand the language. I don’t really understand what a web site is.” Predictably, the Net responded with derision.

Like most folk tales, the Technologically Ignorant Policymaker story has legs because it connects to a deeply felt concern of the community. In this case, it’s the worry of Net folk that policymakers will cluelessly cripple the Net. One ill-considered comment is not by itself a big deal, but it becomes a symbol of a broader problem.

It’s worth noting, too, that in the case of Stevens and Bush the storyline resonates with the speakers’ reputations – fairly or not, neither Stevens nor Bush is thought to be particularly curious or well-informed as policymakers go. Fewer people know about Judge Openshaw, but his comment must have resonated with concerns about judges in general.

Though cathartic for Net folk, these incidents do have a down side. The next time a judge or policymaker hears technical jargon he doesn’t understand, he’ll be a bit less likely to ask for a clarification. And it’s better to ask a question and learn the answer than to stay in the dark.

The Slingbox Pro: Information Leakage and Variable Bitrate (VBR) Fingerprints

[Today’s guest blogger is Yoshi Kohno, a Computer Science prof at University of Washington who has done interesting work on security and privacy topics including e-voting. – Ed]

If you follow technology news, you might be aware of the buzz surrounding technologies that mate the Internet with your TV. The Slingbox Pro and the Apple TV are two commercial products leading this wave. The Slingbox Pro and the Apple TV system are a bit different, but the basic idea is that they can stream videos over a network. For example, you could hook the Slingbox Pro up to your DVD player or cable TV box, and then wirelessly watch a movie on any TV in your house (via the announced Sling Catcher). Or you could watch a movie or TV show on your laptop from across the world.

Privacy is important for these technologies. For example, you probably don’t want someone sniffing at your ISP to figure out that you’re watching a pirated copy of Spiderman 3 (of course, we don’t condone piracy). You might not want your neighbor, who likes to sniff 802.11 wireless packets, to be able to figure out what channel, movie, or type of movie you’re watching. You might not want your hotel to figure out what movie you’re watching on your laptop in order to send you targeted ads. The list goes on…

To address viewer privacy, the Slingbox Pro uses encryption. But does the use of encryption fully protect the privacy of a user’s viewing habits? We studied this question at the University of Washington, and we found that the answer to this question is No – despite the use of encryption, a passive eavesdropper can still learn private information about what someone is watching via their Slingbox Pro.

The full details of our results are in our Usenix Security 2007 paper, but here are some of the highlights.

First, in order to conserve bandwidth, the Slingbox Pro uses something called variable bitrate (VBR) encoding. VBR is a standard approach for compressing streaming multimedia. At a very abstract level, the idea is to only transmit the differences between frames. This means that if a scene changes rapidly, the Slingbox Pro must still transmit a lot of data. But if the scene changes slowly, the Slingbox Pro will only have to transmit a small amount of data – a great bandwidth saver.

Now notice that different movies have different visual effects (e.g., some movies have frequent and rapid scene changes, others don’t). The use of VBR encodings therefore means that the amount of data transmitted over time can serve as a fingerprint for a movie. And, since encryption alone won’t fully conceal the number of bytes transmitted, this fingerprint can survive encryption!
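A toy model may help show why the byte counts leak. The sketch below is not how the Slingbox Pro or any real codec works. It assumes a hypothetical encoder whose cost per frame is simply the number of pixels that changed, and it uses a random XOR keystream as a stand-in for a length-preserving cipher. All names and data are made up for illustration.

```python
import os


def vbr_sizes(frames):
    """Toy VBR encoder: each frame costs one byte per pixel that changed."""
    sizes = []
    previous = [0] * len(frames[0])
    for frame in frames:
        changed = sum(1 for old, new in zip(previous, frame) if old != new)
        sizes.append(changed)
        previous = frame
    return sizes


def encrypt(payload: bytes) -> bytes:
    """Stand-in cipher: XOR with a random keystream of the same length."""
    keystream = os.urandom(len(payload))
    return bytes(p ^ k for p, k in zip(payload, keystream))


WIDTH = 8
# Slow scene: exactly one pixel changes per frame.
slow_scene = [[1 if i <= t else 0 for i in range(WIDTH)] for t in range(4)]
# Fast scene: every pixel changes in every frame.
fast_scene = [[(t + 1) * 10 + i for i in range(WIDTH)] for t in range(4)]

slow_sizes = vbr_sizes(slow_scene)
fast_sizes = vbr_sizes(fast_scene)

# What an eavesdropper sees: only the lengths of the encrypted payloads.
observed_slow = [len(encrypt(bytes(n))) for n in slow_sizes]
observed_fast = [len(encrypt(bytes(n))) for n in fast_sizes]
```

The contents of each payload are hidden, but `observed_slow` and `observed_fast` are identical to the plaintext size sequences, so the shape of the traffic over time still distinguishes the two scenes. A cipher that pads to a block size would blur this only slightly.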

We experimented with fingerprinting encrypted Slingbox Pro movie transmissions in our lab. We took 26 of our favorite movies (we tried to pick movies from the same director, or multiple movies in a series), and we played them over our Slingbox Pro. Sometimes we streamed them to a laptop attached to a wired network, and sometimes we streamed them to a laptop connected to an 802.11 wireless network. In all cases the laptop was one hop away.

We trained our system on some of those traces. We then took new query traces for these movies and tried to match them to our database. For over half of the movies, we were able to correctly identify the movie over 98% of the time. This is well above the less than 4% accuracy that one would get by random chance.
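The matching step can be sketched roughly as follows. This is not the method from the paper, which this post does not describe in detail. It is a deliberately simple nearest-match by Pearson correlation over throughput traces, and the traces, titles, and function names are invented for illustration.

```python
import math


def normalize(trace):
    """Center a trace on its mean and scale it to unit length."""
    mean = sum(trace) / len(trace)
    centered = [x - mean for x in trace]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered] if norm else centered


def similarity(a, b):
    """Pearson correlation of two equal-length traces."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def identify(query, database):
    """Return the title whose stored trace correlates best with the query."""
    return max(database, key=lambda title: similarity(query, database[title]))


# Made-up training traces: kilobytes observed per time window.
database = {
    "movie_a": [10, 80, 75, 12, 9, 60, 11, 10],
    "movie_b": [40, 42, 38, 41, 90, 88, 39, 40],
    "movie_c": [5, 6, 70, 72, 71, 6, 5, 68],
}

# A noisy new recording of movie_a, as an eavesdropper might capture it.
query = [12, 77, 79, 15, 8, 57, 14, 9]
best = identify(query, database)

# Baseline for guessing among 26 equally likely movies.
chance = 1 / 26
```

Here `best` comes out as `"movie_a"`, and `chance` is about 0.038, which is the "less than 4%" baseline mentioned above. A real system would also have to handle traces of different lengths, unknown start offsets, and network jitter.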

What does all this mean? First and foremost, this research result provides further evidence that critical information can leak out through encrypted channels; see our paper for related work. In the case of encrypted streaming multimedia, one might wonder how our results scale since we only tested 26 movies. Addressing the scalability question for our new VBR-based fingerprinting approach is a subject of future research; but, as cryptanalysts like to say, attacks only get better. Moreover, if the makers of movies wanted to, they could potentially make the VBR fingerprints for their movies even stronger and more uniquely identifying.

(This note is not meant to criticize the makers of the Slingbox Pro. In fact, we were very pleased to learn that the Slingbox Pro uses encryption, which does raise the bar against a privacy attacker. Rather, this note describes new research results and fundamental challenges for privacy and streaming multimedia.)

Finnish Court: Okay to Circumvent DVD DRM

A court in Finland ruled last week that it is not a violation of that nation’s anticircumvention law to circumvent CSS, the copy protection system in DVDs. Mikko Välimäki, one of the defense lawyers, has the best explanation I’ve seen.

Finnish law bans the circumvention of “effective” DRM (copy protection) technologies. The court ruled that CSS is not effective, because CSS-defeating tools are so widely available to consumers.

The case is an interesting illustration of the importance of word choice and definitions in lawmaking. The WIPO copyright treaty required signatory nations to pass laws providing “effective legal remedies against the circumvention of effective technological measures that are used by authors in connection with the exercise of the rights …” Reading this, one can’t help but notice that the same word “effective” describes both the remedies and the measures. The implication, to me at least, is that the legal remedies only need to be as effective as the technological measures are.

The Finnish law implementing the treaty took the same approach. In language based on an EU Copyright Directive, the Finnish law defined an effective technology as one that “achieves the protection objective” (according to Mr. Välimäki’s translation). The court ruled that that doesn’t require absolute, 100% protection, but it does require some baseline level of effectiveness against casual circumvention by ordinary users. CSS did not meet this standard, the court said, so circumvention of CSS is lawful.

U.S. law took a different approach. The Digital Millennium Copyright Act (DMCA), the U.S. law supposedly implementing the WIPO treaty, bans circumvention of effective technological measures, but defines “effective” as follows:

a technological measure ‘effectively controls access to a work’ if the measure, in the ordinary course of its operation, requires the application of information, or a process or a treatment, with the authority of the copyright owner, to gain access to the work

Some courts have read this as protecting any DRM technology, no matter how lame. It has even been held to protect CSS despite its notoriously weak design. It’s even possible that the structure of the U.S. DMCA helped to ensure the weakness of CSS – but that’s a topic for another post.

One of the tricks I’ve learned in reading draft legislation is to look closely at the definitions, for that’s often where the action is. An odd or counterintuitive definition can morph a reasonable-sounding proposal into something else entirely. The definition of a little word like “effective” might be the difference between an overreaching law and a more moderate one.

Newsweek Ranks Schools; Monkey High Still Tops

Newsweek has once again issued its list of America’s Best High Schools. They’re using the same goofy formula as before: the number of students from a school who show up for AP or IB exams, divided by the number who graduate. Just showing up for an exam raises your school’s rating; graduating lowers your school’s rating.

As before, my hypothetical Monkey High is still the best high school in the universe. Monkey High has a strict admissions policy, allowing only monkeys to enroll. The monkeys are required to attend AP and IB exams; but they learn nothing and thus fail to graduate. Monkey High has an infinite rating on Newsweek’s scale.
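For the arithmetically inclined, the formula as described above fits in a few lines. This is a sketch of the ratio as this post states it, not Newsweek's own code, and the enrollment numbers are invented.

```python
import math


def newsweek_rating(exam_takers: int, graduates: int) -> float:
    """Students who show up for AP/IB exams, divided by students who graduate."""
    if graduates == 0:
        # No graduates: the ratio is unbounded if anyone took an exam.
        return math.inf if exam_takers > 0 else math.nan
    return exam_takers / graduates


# Made-up numbers for illustration.
ordinary_high = newsweek_rating(exam_takers=300, graduates=400)
fewer_graduates = newsweek_rating(exam_takers=300, graduates=300)
monkey_high = newsweek_rating(exam_takers=500, graduates=0)
```

With these inputs the ordinary school scores 0.75, the same school with 100 fewer graduates scores 1.0, and Monkey High scores infinity. Note that pass rates and exam scores never enter the calculation.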

Also as before, Newsweek excludes selective schools whose students have high SAT scores. Several such schools appear on a special list, with the mind-bending caption “Newsweek excluded these high performers from the list of America’s Best High Schools because so many of their students score well above the average on the SAT and ACT.” Some of these schools were relegated to the same list last year – and still, they’re not even trying to lower their SAT scores!

Newsweek’s FAQ tries to defend the formula, but actually only argues that it’s good for more students to take challenging courses. True, but that’s not what Newsweek measures. They also quote some studies, which don’t support their formula [emphasis added]:

Studies by U.S. Department of Education senior researcher Clifford Adelman in 1999 and 2005 showed that the best predictors of college graduation were not good high-school grades or test scores, but whether or not a student had an intense academic experience in high school. Such experiences were produced by taking higher-level math and English courses and struggling with the demands of college-level courses like AP or IB. Two recent studies looked at more than 150,000 students in California and Texas and found if they had passing scores on AP exams they were more likely to do well academically in college.

Worst of all, if parents pay attention to the Newsweek rankings, schools will have an incentive to maximize their scores, which they can do in three ways: (1) force more students to show up for AP/IB exams, whether or not they are academically prepared, (2) avoid having high SAT scores, (3) lower the school’s graduation rate, or at least don’t try too hard to raise it.

When asked why they publish rankings at all, the FAQ’s answer includes this:

I am mildly ashamed of my reason for ranking, but I do it anyway. I want people to pay attention to this issue, because I think it is vitally important for the improvement of American high schools. Like most journalists, I learned long ago that we are tribal primates with a deep commitment to pecking orders.

As Monkey High principal, I agree wholeheartedly.