April 24, 2014

Google Print, Damages and Incentives

There’s been lots of discussion online of this week’s lawsuit filed against Google by a group of authors, over the Google Print project. Google Print is scanning in books from four large libraries, indexing the books’ contents, and letting people do Google-style searches on the books’ contents. Search results show short snippets from the books, but won’t let users extract long portions. Google will withdraw any book from the program at the request of the copyright holder. As I understand it, scanning was already underway when the suit was filed.

The authors claim that scanning the books violates their copyright. Google claims the project is fair use. Everybody agrees that Google Print is a cool project that will benefit the public – but it might be illegal anyway.

Expert commentators disagree about the merits of the case. Jonathan Band thinks Google should win. William Patry thinks the authors should win. Who am I to argue with either of them? The bottom line is that nobody knows what will happen.

So Google was taking a risk by starting the project. The risk is larger than you might think, because if Google loses, it won’t just have to reimburse the authors for the economic harm they have suffered. Instead, Google will have to pay statutory damages of up to $30,000 for every book that has been scanned. That adds up quickly! (I don’t know how many books Google has scanned so far, but I assume it’s a nontrivial number.)

You might wonder why copyright law imposes such a high penalty for an act – scanning one book – that causes relatively little harm. It’s a good question. If Google loses, it makes economic sense to make Google pay for the harm it has caused (and to impose an injunction against future scanning). This gives Google the right incentive, to weigh the expected cost of harm to the authors against the project’s overall value.

Imposing statutory damages makes technologists like Google too cautious. Even if a new technology creates great value while doing little harm, and the technologist has a strong (but not slam-dunk) fair use case, the risk of statutory damages may deter the technology’s release. That’s inefficient.
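To make the incentive argument concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it is a made-up assumption for illustration (the count of scanned books, the harm per book, the project's value, and the odds of losing); only the $30,000 per-book ceiling comes from the discussion above. The function name `expected_payoff` is likewise just an illustrative helper.

```python
# All figures below are hypothetical, chosen only to illustrate the argument.
STATUTORY_MAX_PER_WORK = 30_000  # dollars; the per-book ceiling cited in the post


def expected_payoff(value, p_lose, damages_if_lose):
    """Expected net payoff of launching: project value minus expected damages."""
    return value - p_lose * damages_if_lose


books_scanned = 100_000      # hypothetical number of books scanned
harm_per_book = 10           # hypothetical actual harm per book, in dollars
project_value = 50_000_000   # hypothetical value of the project, in dollars
p_lose = 0.3                 # hypothetical chance the fair use defense fails

# Rule 1: the loser pays only for the harm actually caused.
actual = expected_payoff(project_value, p_lose, books_scanned * harm_per_book)

# Rule 2: the loser pays the statutory maximum for every book scanned.
statutory = expected_payoff(
    project_value, p_lose, books_scanned * STATUTORY_MAX_PER_WORK
)

print(f"actual-damages rule:    {actual:,.0f}")
print(f"statutory-damages rule: {statutory:,.0f}")
```

Under these assumed numbers, the project is clearly worth launching when damages track actual harm, and clearly not worth launching when damages are set at the statutory maximum, even though the value created and the harm done are identical in both cases. Different assumptions will move the totals, but the gap between the two rules is the point.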

Some iffy technologies should be deterred, if they create relatively little value for the harm they do, or if the technologist has a weak fair use case. But statutory damages deter too many new technologies.

[Law and economics mavens may object that under some conditions it is efficient to impose higher damages. That's true, but I don't think those conditions apply here. I don't have space to address this point further, but please feel free to discuss it in the comments.]

In light of the risk Google is facing, it’s surprising that Google went ahead with the project. Maybe Google will decide now that discretion is the better part of valor, and will settle the case, stopping Google Print in exchange for the withdrawal of the lawsuit.

The good news, in the long run at least, is that this case will remind policymakers of the value of a robust fair use privilege.

Comments

  1. bill says:

    Patry has just updated his blog with a more “pro” stance. He states that Google is doing no harm, and talks a fair bit about fair use.

  2. Seth Finkelstein says:
  3. John Costello says:

    “Search results show short snippets from the books, but won’t let users extract long portions.”

    A little note about this: the search results for any particular search will only allow you to see a page or two, but as long as you can see the index or table of contents, you can just search for certain words and it will allow you to see more pages. Using that trick, you can easily read an entire book.

  4. Cog says:

    John, Google’s example search results don’t even show a complete page. Piecing together the fragments is still technically possible, but presents at least a significantly more expensive computational problem.

    Furthermore, there are straightforward countermeasures Google can take (e.g., randomly selecting a subset of every book’s content that would never be returned as a search result) that would make the assembly of complete books impossible.

    Partially-redacted books might still have some value for criminals, but at this point we’re really stretching the notion of Google’s aiding criminal behavior.

    In any case, since when should a technology’s legality be determined by the use that very clever malicious criminals might make of it? Most libraries are terribly insecure; a criminal could easily bring a portable digital scanner into a library and thereby make a copy of anything there. Should libraries be illegal for this reason?

  5. Fred von Lohmann says:

    John is incorrect. As made clear in the link provided by Ed in the main post, the search results for works where the owner has not consented will NOT show a page or two. They will be restricted to a sentence or two, and only three results will be shown for any particular work (to stop those who would just use “the” as a search term to get all the sentences, presumably).

    John is thinking about the Google Print Publisher program, for owners who have entered into agreements with Google. Those books will return more extensive results.

    This is not to say it would be impossible to reconstitute a book using some sophisticated attack on Google Print. But it would be nontrivial (and would presumably result in Google changing its protection mechanisms).

  6. Andreas Bovens says:

    “[...] you can easily read an entire book.”

    That’s not correct.
    If you try this with, say, Lawrence Lessig’s “Free Culture” (http://print.google.com/print?id=cxZp0sV3V80C), you bump into a Google Account login screen after 6 pages or so. The Google Print FAQ (http://print.google.com/googleprint/help.html#whylogin) explains: “Because many of the books in Google Print are still under copyright, you’ll see a limited amount of these books. To help us enforce these limits, some pages are available only after you log in [...]”
    Also from the FAQ: “As part of our efforts to protect a book’s copyright, a set of pages in every in-copyright book will be unavailable to all users.”

  7. John Costello says:

    Fred, I was unaware of any agreements that Google has had with any authors. I just went to print.google.com and searched a few books. My mistake.

  8. Steve Purpura says:

    Do not forget to add to the calculus of Google’s decision that they have a lot of cash and the cash can be used to influence the copyright law. Copyright law is written as a consensus of industry interests and I have little doubt that Google’s actions — and the fact that technology makes it possible — will yield a minor revolution.

  9. Karl-Friedrich Lenz says:

    I made the same mistake as John Costello above when I first blogged this lawsuit yesterday. It seems to be a mistake easily made. Even Larry Lessig confused “Google Print” and the subset “Google Print Library Project”.

    However, while it is probably true that the whole text is not displayed to individual users, it is to the totality of users.

    As Andreas Bovens reported above, there seem to be “some pages” not available for all users as a part of Google’s “efforts to protect a book’s copyright”.

    When asking about “the amount and substantiality of the portion used in relation to the copyrighted work as a whole” in this context, this is the relevant question: exactly what percentage of the pages is not displayed to any user?

    We are talking about Google’s use of the work here, not about that of individual Google users. If Google is serving all pages except a few, Google uses all those pages, even if individual users get only a couple of lines per search.

  10. Saar Drimer says:

    Even if it were technically possible to reconstruct a book (according to the comments above, it is obviously very hard, bordering on impossible), it would only need to be slightly more difficult than going to the library (and maybe photocopying the book). I think the authors just figure they can get some big bucks from Google, while knowing that the project benefits them tremendously. I would find it hard to believe that Google went ahead with this project without knowing it would have the upper hand.

  11. paul says:

    I think Karl-Friedrich Lenz has it right with respect to the arguments of “fair use” — Google’s clients may not be using the whole text (although that is what smart programming and search APIs are for) but Google certainly is, and they’re doing it for a commercial purpose (marketing now, revenue when the project comes out of beta). Given the judgements that have been rendered for photocopying of book chapters and articles, this doesn’t seem even particularly close to me. (It’s also notable that for one of the prototypical article-copying cases, the act of filing journal articles in a corporate library was considered a sufficiently commercial action to trigger sanctions, regardless of who, if anyone, subsequently read the copies or any part thereof.)

    As a writer, I’m much more sanguine about the notion of statutory damages. First, for a company with Google’s capitalization that’s chump change even if some settlement can’t be reached. Second, it’s unreasonable to expect a plaintiff to prove actual damages (where are they going to get the money to survey libraries and bookstores to estimate lost sales of each infringed work and related works, much less to determine Google’s likely revenue from the project and their share of same?). If all copyright plaintiffs had to prove actual damages, individually-owned copyright would be pretty much a dead letter.

  12. Andis Kaulins says:

    If I understand the facts of this case correctly, Google has a contract with the Library of Michigan (and the four other libraries) to scan their collections. Google provides the money and the technology. As I post at LawPundit, isn't the actual “copier”, as a matter of law, THE LIBRARY and not Google? According to these contracts, part of Google’s remuneration for making these scans and digitizing them (i.e. making copies) FOR THE LIBRARY is that Google can use the resulting database and offer it in snippet form (for copyrighted works) on its search engine. In my view, the Authors Guild has put the cart before the horse and has sued the wrong party at this stage of the game. The COPIER is, as a matter of law, the LIBRARY, because it is their collection and because copies are being made according to a contract they have made. The subsequent USER of these copies is Google. But that is a completely different legal constellation than what everyone is commenting on at the moment.

  13. Me says:

    The fact is: Google is providing copyrighted material WITHOUT PERMISSION of the copyright holder. They can hide behind the Fair Use clause of copyright law, but by broadcasting material owned by copyright holders who have not granted reprint privileges, they are entering a dangerous grey area. Broadcasting online is the same as PUBLISHING as far as we and many other authors/artists are concerned. What Google is doing is not appropriate, whether legal or not: Whatever happened to good, old-fashioned MANNERS? Can’t they ask first? Can’t they provide a mechanism for people to self-select, if they want to participate?

    They should have an OPT-IN procedure for the authors/artists who desire to have their works coughed up by Google search engines, rather than forcing us, once again, to OPT OUT of heavyhanded procedures some selfish self-serving enterprise has dreamed up. There is entirely too much OPTING OUT demanded of American citizens in our everyday life; and now to have our creative/intellectual properties OPTED INTO a search engine without our permission/knowledge is just too much.

    We love Google, but we wish they would reconsider this whole print.google.com scheme.