On Tuesday Judge Denny Chin rejected a proposed settlement in the Google Book Search case. My write-up for Ars Technica is here.
The question everyone is asking is what comes next. The conventional wisdom seems to be that the parties will go back to the bargaining table and hammer out a third iteration of the settlement. It’s also possible that the parties will try to appeal the rejection of the current settlement. Still, in case anyone at Google is reading this, I’d like to make a pitch for a third option: litigate!
Google has long been at the forefront of efforts to shape copyright law in ways that encourage innovation. When the authors and publishers first sued Google back in 2005, I was quick to defend the scanning of books under copyright’s fair use doctrine. And I still think that position is correct.
Unfortunately, in 2008 Google saw an opportunity to make a separate truce with the publishing industry that placed Google at the center of the book business and left everyone else out in the cold. Because of the peculiarities of class action law, the settlement would have given Google the legal right to use hundreds of thousands of “orphan” works without actually getting permission from their copyright holders. Competitors who wanted the same deal would have had no realistic way of getting one. Googlers are a smart bunch, and so they took what was obviously a good deal for them even though it was bad for fair use and online innovation.
Now the deal is no longer on the table, and it’s not clear if it can be salvaged. Judge Chin suggested that he might approve a new, “opt-in” settlement. But switching to an opt-in rule would undermine the very thing that made the deal so appealing to Google in the first place: the freedom to incorporate works whose copyright status was unclear. Take that away, and it’s not clear that Google Book Search can exist at all.
Moreover, I think the failure of the settlement may strengthen Google’s fair use argument. Fair use exists as a kind of safety valve for the copyright system, to ensure that it does not damage free speech, innovation, and other values. Although formally speaking judges are supposed to run through the famous four-factor test to determine what counts as a fair use, in practice an important factor is whether the judge perceives the defendant as having acted in good faith. Google has now spent three years looking for a way to build its Book Search project using something other than fair use, and come up empty. This underscores the stakes of the fair use fight: if Judge Chin ruled against Google’s fair use argument, it would mean that it was effectively impossible to build a book search engine as comprehensive as the one Google has built. That outcome doesn’t seem consistent with the Constitution’s command that copyright promote the progress of science and the useful arts.
In any event, Google may not have much choice. If it signs an “opt-in” settlement with the Authors Guild and the Association of American Publishers, it’s likely to face a fresh round of lawsuits from other copyright holders who aren’t members of those organizations, and they might not be as willing to settle for a token sum. So if Google thinks its fair use argument is a winner, it might as well test it now before it’s paid out any settlement money. And if it’s not, then this business might be too expensive for Google to be in at all.
The best outcome would be to allow anyone access to the digitized works.
It seems rather silly that if another company wants to get into this space, they have to replicate all of the manual labor again. I feel like the digitization of all works should be done by some organization like the LoC so we can all benefit from the work.
I don’t see what Google is trying to do as a fair use. Sure, they can make copies of the text in order to construct the index of the books, but that isn’t all they’re doing.
When you search for a string, you see entire pages of books in search results. That is a very substantial part of the text of the book. It is not for the purpose of commentary or reportage, so Google wouldn’t be able to claim fair use on those grounds. Moreover, it clearly could have an impact on the market value of the book shown.
What is the argument that what they’re doing is a fair use?
You don’t see “entire pages of books” in search results. That only happens with publisher permission. For in-copyright books for which Google hasn’t gotten permission from the copyright holder, it displays only a few lines of the text, similar to how other Google search products work.
If you have a concordance (a list of what “interesting” words appear next to what other words) rebuilding the entire text is straightforward. (This was first done with the Dead Sea Scrolls.) It’s like shotgun DNA sequencing, only with page numbers to keep you on track.
I’m surprised some information-science grad student hasn’t figured out how to automate the process for Google Books. Or perhaps they have, and it just hasn’t made news.
Google does some clever things to prevent this. IIRC, they limit the amount of text they’ll show to any given user, and they have a randomly-selected subset of the text that they never show in snippets.
But also, what would the point be? These books are all available for free in the public libraries, and the popular ones are on P2P networks. If you want a copy of the book, there are lots of ways to get one without paying that are less of a hassle than trying to extract them from Google’s search engine.
Speaking for myself, I agree with Judge Chin’s analysis. Copyright law is opt-in, not opt-out. Judge Chin’s decision provides a framework (1) to redraft the Settlement Agreement to apply only to past conduct, and (2) for Google to revise Google Books to an opt-in service.
A friend made an excellent suggestion, which is that so long as storage continues to be cheap, and so long as Google can afford to do so, Google can still scan every book out there without permission. Google would then store each book and not let that copy see the light of day until the earlier of (1) the copyright owner opts in and permits Google to make copies available for research, online reading, copying, etc., or (2) the copyright expires (even if it takes life plus 70 years).
Your comments are welcome,
Fred Wilf
Um, copyrights don’t expire anymore, as long as every 20 years the Congress retroactively extends the term of copyrights by 20 years. Those orphan works will be orphans forever, if that keeps happening.
If only there were a set of multibillion-dollar companies who would gain from their expiry, the way there’s a set of companies who gain from the indefinite extensions. Mere public interest certainly hasn’t gotten very far.
Chin did not have to rule on Google’s defense. That was not in question in this rejection of the settlement.
But I have to take issue with your premise that Google even has a decent fair use defense. Google has a risky fair use defense. It depends on the court agreeing that Google should be able to impose the copyright norms of the Web onto the real world. And it depends on the court ignoring the creation of the copy of the entire work for use in enhancing Google’s most profitable venture: search. Even after all that, the fact that Google made a copy of every work for a purpose completely outside the transformative use of book search (paying off libraries for access to collections) undermines everything. Google is sure to lose in court if it fights on. Google Books is unlikely to generate enough revenue in the short term to justify the cost of the fight, the risk to itself, and the risk to fair use.
If you truly believe in fair use, you should not urge Google to make a stand for its own interests. Even if I am wrong about fair use, Congress is sure to ride in and act at the behest of News Corp., Time Warner, Bertelsmann, etc. and correct for the ruling. Congress hates Google right now.
Despite that anti-Google bias in American politics right now, the legislature is the only answer, as unlikely as success may seem. Fair use is not designed to create policy by judicial fiat.
So the grown-up answer to this problem is that if we want stuff like this we have to mount political efforts to secure it. If we want projects like this from institutions and firms like and unlike Google, we should push for legislative changes to copyright. Short of that we are just cheating, just like Google did with the settlement.
I don’t understand what you mean by “imposing the copyright norms of the Web onto the real world.” There isn’t one body of copyright law for the web and another for everything else. We have good law saying that indexing content for use in a search engine is fair use. Obviously, the facts aren’t exactly the same, and so there’s no guarantee that the courts will come to the same conclusion about print books, but it seems like a pretty natural extension of that line of cases to me.
Also, frankly, Google is an ideal defendant for a lawsuit like this. They’re a large company with a great brand and a top-flight legal team. And the product at issue here is used by millions of people. A judge is going to be pretty reluctant to issue a ruling that forces Google to turn Google Books off. Of course they still might lose, and it’s easy for me as a non-Google shareholder to ask them to take that risk. But Google has always considered itself a company willing to take risks, and I hope they take one more.
Name-calling aside, there’s nothing childish about asking judges to use the discretion Congress has given them to make the law better. That’s how every positive development in copyright law (Betamax, Diamond, Arriba Soft, etc.) has occurred over the last half century. I’ll certainly sign up to help lobby Congress on behalf of copyright reform, but I doubt that’ll work, and in any event it’s not a reason to abandon other fronts.
Section 512 of the 1998 Digital Millennium Copyright Act created a copyright infringement exception for, among other things, search engines indexing web content as long as they allowed a standard opt-out procedure and followed certain rules when told of infringing content. Not sure if/how that section would apply to scanning of books but, on first blush, I don’t see how it could at all.
We have good law that says that indexing a copyrighted work does not infringe copyright. Period. If not, libraries could not exist.
Change the copyright laws. Require explicit renewal of copyright on all outstanding works, the way the law used to be. If no owner comes forth, provides evidence of ownership, and pays a renewal fee, the works should automatically go into the public domain.
Just turn it off.
Turn off Book Search.
Say the state of copyright law is too screwed for them to provide the service any more.
Let people demand they bring it back.
The people ought to be asking Congress for copyright reform. Having everyone work out deals in court is the wrong answer. Maybe Google can afford all this nonsense but us little people can’t.
My prediction is that the settlement will be opt-out as to scanning and searching, and opt-in as to full-text, thus protecting Google from copycat lawsuits (which would need to seize control of this case from its current counsel).