Books Digitization and Demand

by | Feb 26, 2019

A new paper from Nagaraj (UC Berkeley) and Reimers (Northeastern), “Digitization and the Demand for Physical Works: Evidence from the Google Books Project” (revised April 2019), is said to demonstrate that the unauthorized scanning or “digitization” of entire book collections increases demand for “physical works” (particularly specialized works), and thus to show a lack of harm and positive market benefits for authors and publishers. My post here notes that the motivation of authors and publishers in bringing suit against the Google Books Project in the 2000s was not to inhibit an e-book market, given that publishers were already actively digitizing and making e-books available, but to ensure that Google was not allowed to create its own e-book market without authorization from publishers and authors.

It is always important to start with semantics. Nagaraj and Reimers talk of “digitization,” but they mean two things: first, the initial scanning of the library book collections and the creation of digital files of the scanned books; and second, the creation of a search engine layer or index from that corpus. Their point would be clearer if they stated that what their research suggests is the value of increased awareness and discovery of somewhat specialized materials through a search engine index. While books have always been indexed by professional catalogers and publishers such as Bowker’s (Books in Print), the focus of these products has usually been on actively available and actively marketed books, and BiP is something of a specialized resource. Online resources covering rare, unique and out-of-print books have existed for more than a decade, some of which are now commercialized through Amazon, but there is no question that more information about more scholarly works is a net positive.

E-book readers first appeared in 1998 but took off significantly in the 2000s. The initial players (e.g. Sony) focused almost exclusively on current popular fiction, but Amazon and the other more specialized online resources began offering a much expanded index and catalog by the mid-2000s. As Amazon began to combine the indexes of some of the more specialized services, it became more of a rival to Google’s project, at least in terms of creating visibility for books beyond current popular trade titles. It must be said, however, that Google did not have to engage in its unilateral scanning program to create an index of books. This concept of a search and resource identification layer was, I believe, something of an unintended consequence (I was involved in some of the negotiations for a short period, representing specialized publishers). What Google was intent on creating was a proprietary online market for digitized content. In contrast, Amazon achieves much of this by negotiating and licensing rights from authors and publishers.

Authors and publishers are not for or against any one model of distribution. The e-book market has become incredibly important, and for many years it helped offset sales decreases on the print side. Ironically, print sales have been increasing in recent years, although not as much as audio book sales. In any event, it is wrong to characterize the legal concerns of authors and publishers as being about propping up print book sales; the primary concern was ensuring a level playing field for e-book distribution. Clearly, Google providing substitute e-book products for free (or at prices it would set) would be both a copyright violation (as the courts made clear when they mentioned, in their factual foundation, that Google was not actually providing the full text to users) and a major market disruption.

The legal complaints filed against Google by authors and publishers were about Google’s intent to create a permission-free market for e-book content in a Google online marketplace. The negotiations and the draft settlement agreement, which was quashed by the courts, dealt with how such a market might be organized to permit greater engagement by authors and publishers than the Google project had permitted up to that time (the “claiming” process that occupied much of the agreement), and with the creation of a Books Registry that would be run by an independent entity. When the courts made this “class action” market approach impossible, Google narrowed its approach to the three areas where it has been engaged in the past 10 years or so: working with HathiTrust and others on accessible files; providing some research capabilities online for scholars; and providing a form of discovery or awareness tool for users. It is this last function that Nagaraj and Reimers are really addressing in their paper. The death of the Google Library project has been reported elsewhere[i], but in any event it seems clear that once Google’s marketplace ambition was stymied, Google was far less interested in or committed to its Library project.

Nagaraj and Reimers assert that “Copyright holders are concerned about the possibility that digitized versions would serve as substitute for material in print…” As noted above, at the time the suit against Google was initiated, authors and publishers were heavily engaged in e-book production and in creating an e-book market. It is possible that the Google project provided more incentive for this development, but the concern was not about print; it was about unauthorized use, print or online, without compensation. The authors go on to say that “digitization might be a win-win for consumers, publishers and authors…” but again this is in the context of awareness and discovery, which no one would dispute. More consumers becoming more aware of more book titles is indeed a positive for authors and publishers, and something that authors and publishers work hard on every day.

Nagaraj and Reimers characterize the Google Books litigation as being about browsing information about books, and whether this was a fair use, but as noted, the legal complaints against Google were about making the resulting book files available as a market function.

I did find the point about awareness leading to greater sales of more specialized books an interesting observation, although this might be related to the fact that more popular books are indexed more widely and broadly through other means. Still, it is a point worth celebrating if there are increased sales of presumably more scholarly works.