The Case Against Google Books

How three East Bay librarians led the revolt against the company's plans to archive all earthly knowledge.

Google has empowered ordinary citizens beyond anything once thought possible. Thanks to Google, you can view your home from space, track flu outbreaks around the country, or figure out traffic congestion along your commute route, all instantaneously and without spending a single dime. And the company’s motto, “Don’t Be Evil,” underscores an undeniable civic-mindedness. Its nonprofit arm, Google.org, invests in green electricity start-ups. Goats graze the company’s lawns to reduce its carbon footprint. And yet, despite the company’s best efforts, there’s just something about Google that gives some people the heebie-jeebies.

There’s its history of collaborating with repressive Chinese authorities, of course. There’s its habit of tracking what you search for and storing the data for months, or photographing citizens walking the streets and posting the pictures on its Street View site. Mostly, however, there’s its sheer size and power. Google is sitting on $19 billion in cash. Its business model is predicated upon ensuring that you come to rely upon it for almost everything you do online. Google’s success has forever changed the media, among other industries. When Gmail and Google Docs crash, as they have numerous times in the last two years, businesses around the world grind to a halt. Critics of the company worry that no single entity should have that much influence over our lives.

Five years ago, Google began one of the most ambitious projects in the history of human endeavor. Working with universities around the world, including UC Berkeley, the company systematically scanned and digitally archived millions of books. Today, it has come remarkably close to preserving and organizing every single idea, fact, calumny, sonnet, and law human beings have ever committed to print in English. “Google Book Search was launched as part of our mission to take the world’s information, organize it, and make it universally accessible,” says company spokeswoman Jennie Johnson. “Most of the world’s information is not online. It’s offline, in books, on shelves. Google Books began as a project to make books as discoverable as the world is today.”

It’s a remarkable resource, one that could make the sum total of the world’s knowledge immediately available to the most isolated researcher or the simply curious. And yet …

Twelve months ago, three East Bay academics slowly began to grow uncomfortable with what Google was doing. The more they looked into the details of the Google Books project, the more they began to conclude that the country could not afford to let Google control humanity’s knowledge the way it intended to. For their own individual reasons, in their own distinctive ways, these critics — Peter Brantley, Pamela Samuelson, and linguist Geoff Nunberg — set out to stop the project, or at least fundamentally change the way it was being carried out.

Last month, they watched as the Google Books project stalled in its tracks. The Authors Guild and a group of publishers had sued Google for copyright infringement, and the three parties had worked out a settlement. As soon as a federal judge approved the settlement, the project could proceed. But in September, the Justice Department issued a key opinion arguing that elements of this settlement violated the country’s antitrust laws, seriously jeopardizing its chances of passing muster in federal court. Google is now reworking its deal with authors and publishers, and its grand scheme will almost surely be impossible without sweeping changes.

It’s impossible to know just how much Brantley, Samuelson, and Nunberg influenced the Justice Department’s opinion. But these three academics forged a powerful coalition out of the country’s academics, research libraries, consumer and privacy groups, as well as Google rivals Microsoft and Amazon.com. Along the way, they helped create a new skepticism toward Google’s once-pristine brand. And they sparked a national conversation about one of our most interesting and central questions: what is a free society obliged to do with its written words?


Google is, first and foremost, a search engine. And when founders Larry Page and Sergey Brin and CEO Eric Schmidt set out to digitize the contents of some of the English-speaking world’s greatest libraries, they apparently did so to increase the universe of knowledge their bots could scan, chop into discrete, digestible snippets, and present to their users, along with the usual targeted text ads. If along the way, entire libraries were digitally preserved, that could only benefit the world, right?

At least, that’s what the company’s library partners thought. In 2004, Google got permission from Stanford, Harvard, Oxford, the University of Michigan, and the New York Public Library system to archive their stacks. The University of California and other libraries later joined the project, and today, Google has digitally preserved a remarkable ten million tomes in its system.

The plan was simple: rather than display substantial sections of a book, which would clearly violate copyright law, Google would only display a paragraph or two, in direct response to a search query. The company’s lawyers had every reason to imagine that this was permissible under the legal doctrine known as “fair use,” which protects the public’s right to quote or excerpt copyrighted works in academic papers, news reports, legislative or judicial proceedings, and parody. But the Authors Guild disagreed. In 2005, the Guild and the Association of American Publishers sued Google, arguing that merely scanning the entire work constituted copyright infringement.

Google’s leaders still feel indignant that this grand humanitarian gesture has cost them so much. “We don’t think we should be sued in the first place,” Eric Schmidt recently told search engine guru Danny Sullivan. “I’m happy to be criticized. But the fact of the matter is, we didn’t sue them. They sued us.”

But as the years passed, and the lawsuit traveled through the meat-grinder of settlement negotiations, Google’s mission began to change. By the time the three parties finalized their deal, Google no longer would merely display snippets of works. The company would now be authorized to sell online access to scanned books in their entirety. And libraries and universities could buy subscriptions to the entire catalogue. Proceeds from the sale would be divided between Google and the plaintiffs, and a “book rights registry” would be set up to hold money on behalf of the copyright holders. Suddenly, Google was a crude sort of bookseller.


Few people know more about the future of books and libraries than El Cerrito’s Peter Brantley. He’s spent virtually his entire career working at libraries around the country, specializing in digitizing the archives and organizing them into an accessible and searchable form. As a librarian, Brantley is dedicated to the proposition that everyone should have convenient and affordable access to as much of the written word as possible.

As Brantley moved from UCSF to UC Berkeley and NYU, he began to hear more and more about the Google Books project. Colleagues talked about the sheer scale of the program, and everyone was positively giddy about the prospect. Eventually, Brantley landed at the University of California system, where he was responsible for administering the largest public digital library in the world. There, he was personally involved in negotiating the contract between the University of California and Google Books.

But as he learned more about what Google was planning to do, and talked with fellow librarians, Brantley grew a little perturbed. He read Jeffrey Toobin’s somewhat unsettling article about the project in The New Yorker, and realized that the lawsuit filed by publishers and the Authors Guild could effectively give Google exclusive control over tens of millions of digital books.

“I started looking at the contracts and reading more on intellectual property law,” Brantley said. “I became more concerned with how Google Books was evolving. I have a lot of friends in libraries and publishing, and I was informed that conversations were taking place, and they matched what Toobin had written. … It made me start thinking very seriously about what the ramifications were.”

In October 2008, Brantley took a job with the San Francisco nonprofit group the Internet Archive. One of the country’s most dedicated advocates of open information, the Internet Archive was the first organization to preserve web pages that would otherwise expire with time and vanish into the ether; journalists and researchers routinely use its “Wayback Machine” to find dead web pages. That same month, Google, the Authors Guild, and the Association of American Publishers had finalized a settlement that could forever change the future of publishing. Internet Archive founder Brewster Kahle gave Brantley his first assignment: find out who else didn’t like the deal, get them together, and try to stop it.

To understand why Brantley found the settlement so alarming, you have to understand two elements of the underlying lawsuit. There are essentially three classes of books. Those published before 1923 are no longer protected by copyright law; Google and anyone else can legally publish them to their hearts’ content. Books published after 1923 fall under copyright protection, and of those, Google will not be able to publish the tomes that are still in print. But there are tens of millions of so-called “orphan books,” or books that are under copyright, but out of print. Often, these are academic tomes whose authors or copyright holders cannot be found. These orphan books are at the heart of the controversy.

And here’s why. When the Authors Guild and the Association of American Publishers sued Google, the court granted the lawsuit class-action status. The two plaintiffs were recognized as the representatives of a class consisting of nothing less than every single copyright holder in the universe. As bizarre as it sounds, every author, publisher, or copyright holder on Earth was suddenly having its rights negotiated in court, even if most of them didn’t know it or weren’t even aware that a lawsuit was underway. With this settlement, Google got the right to sell access to tens of millions of orphan books and academic studies in one fell swoop, even if the authors didn’t know it.

Google spokeswoman Jennie Johnson said nonprofits like the Internet Archive are free to go through the same negotiation process. “Anybody can do exactly what we did,” Johnson said.

But according to Brantley, that’s essentially impossible. “Because of the nature of the settlement as a class action, Google would have a unique release of liability from the commercialization of works that went unclaimed in the process of notification of authors and publishers,” Brantley said. “So for most books published from 1923 to the end of the century but not in print, Google has a unique ability to commercialize it. … Amazon, the Internet Archive, or anyone else — they’re not included in the class action. So the only books we can work with are books whose authors we have a relationship with, or are in the public domain.”

Brantley’s campaign wasn’t the first time the Internet Archive had tried to affect the outcome of the Book Search settlement. The group had initially asked the court to include it as a defendant, so that whatever agreement bound Google would also apply to the Archive and free it from negotiating with millions of orphan book authors. But the judge said no — and he wasn’t the only one. “Google could have written the court and supported that. Instead, they wrote a letter opposing that,” said Brantley, who added that he understood Google’s position. “Why would Google want us to produce even a fraction of their comparable product, when it’s our mission to do this for free?”

So any other company or nonprofit seeking to create a similar archive will, as a practical matter, find itself having to negotiate publishing or scanning rights one book at a time, with author after author. Only Google will have negotiated these rights all at once — which means only Google will be able to assemble a digital archive of tens of millions of academic works.

Opponents of the settlement worry that anyone with a monopoly will probably abuse it sooner or later. Even Google, whose benevolence depends on its massive profit margin and the good will of Larry and Sergey. “Google’s record suggests that it will not abuse its double-barreled fiscal-legal power,” concludes Harvard library director Robert Darnton. “But what will happen if its current leaders sell the company or retire?”

No one has better articulated this than Darnton, who once supported but now opposes the Google Books project. Take academic journals, he argues. At their inception decades ago, these journals were produced solely in the spirit of free inquiry and priced accordingly. But today, Darnton noted in the New York Review of Books, a year’s subscription to the Journal of Comparative Neurology costs almost $26,000, and your average chemistry journal subscription costs almost $3,500. Despite the best intentions of professors everywhere, captive markets made academic journals one of the biggest price-gouging scams in history.


Since Google wouldn’t play ball with the Internet Archive, Brantley went to work. He slowly pieced together as many opponents as he could find into what would eventually become the Open Book Alliance. Most of his allies were fellow librarians and groups like the Council of Literary Magazines and Presses, which can’t exactly boast the heaviest legal and political arsenals. But early on Brantley — who is remarkably well-connected for a librarian — snagged an important ally: Microsoft.

It’s no secret that the leaders of Microsoft and Google loathe one another; in 2005, when Microsoft CEO Steve Ballmer learned that Google was poaching one of his top executives, he allegedly threw a chair across the room and screamed, “I’m going to fucking kill Google!” And both companies have a long and storied history of using the antitrust laws to accuse one another of running monopolies, while denying dominance in their own particular fields. Google went to the European Commission and accused Microsoft of illegally dominating the browser market, while Microsoft spent months and millions in a successful effort to persuade the Justice Department to kill a massive advertising partnership between Google and Yahoo.

Now, Brantley tapped his Microsoft connections to bring the company on board. Microsoft would ultimately help Brantley with money, lobbying, and legal expertise. (Earlier this summer, business press outlets reported that Microsoft’s DC lobbyists were holding weekly “screw Google” meetings, mostly dedicated to how federal regulators could make life difficult for their rival.)

Brantley recognized that Microsoft executives may simply want to hurt Google as part of a broader strategy. But he claimed that the company clearly saw this as a monopoly issue as well, particularly in the search market. “As a search company, they’ve got to be concerned about the book data Google would be able to obtain,” Brantley said. “And they’ve got to be concerned about a world where just one company has control over that much information.”

Oddly, Brantley’s next big ally was none other than Microsoft’s worst enemy. Gary Reback has been an antitrust attorney working in Silicon Valley for almost thirty years; his work confronting Microsoft led directly to the Justice Department’s late-1990s antitrust case against the software giant. On the advice of a friend, Brantley wrote a cold e-mail to Reback with the Google settlement attached. Fifteen minutes later, Reback was on the phone and ranting about it for an hour. “Gary, who has far more experience about class action, combined his Rolodex with the Rolodex I created, to form the Open Book Alliance.”

Brantley had to finesse Reback’s involvement with Microsoft. But within a few months, he and Reback had assembled an impressive anti-Google coalition with members ranging from the New York Library Association to Microsoft, Amazon.com, and Yahoo. They filed amicus briefs with the court denouncing the settlement and worked the media; their sheer physical proximity to Silicon Valley ensured that the tech press gave their side plenty of copy. Gradually, Google’s spin on the story started to lose traction.


Meanwhile, UC Berkeley law and information school professor Pamela Samuelson was just learning about the settlement. As the co-director of the Berkeley Center for Law and Technology, she has spent years studying how technology affects privacy, copyright, and intellectual property issues. “Most professors during the summer, they do their research, and they don’t spend their summers reading things like the Google Books settlement,” she said. “I read it. It seemed to me to be a lot more complicated than the happy story that Google was telling, which was, ‘Oh gee, everyone will have access to books, isn’t that special?’”

Like Brantley, Samuelson was considerably alarmed by the terms of the settlement, and she aired her objections in a series of influential essays in the Huffington Post and the O’Reilly technology blog. “The Book Search agreement is not really a settlement of a dispute over whether scanning books to index them is fair use,” Samuelson wrote. “It is a major restructuring of the book industry’s future without meaningful government oversight. The market for digitized orphan books could be competitive, but will not be if this settlement is approved as is.”

Even more significantly, Samuelson helped organize a major conference on the merits of the settlement at UC Berkeley in August. Google Books head Dan Clancy was there and did his best to defend the agreement. But once again, the conference aired numerous grievances about the deal in front of the technology press. Once again, Google’s public image took a beating, and the academic and legal community grew even more alarmed.

“The reason that it happened all at once is that Pam organized this conference,” said Geoff Nunberg. “It was a point obviously when this thing was in the news, people were looking for angles to this, and suddenly there are all these people whose credentials would garner attention say look, you’ve got all these problems. And I think that helped to focus people’s attention on the problems.”

Perhaps more than anyone, Nunberg used the conference to highlight some of the settlement’s problems. Like George Lakoff and John McWhorter, Nunberg is a member of that exotic and improbable species, the celebrity linguist; he’s written numerous books and has a regular guest spot on NPR’s Fresh Air. At the conference, he pointed out, in amusing and devastating detail, yet another problem with Google’s Book archive: it’s riddled with mistakes.

Nunberg doesn’t have a problem with the books themselves; they’ve been accurately scanned and present the text verbatim. His problem lies with the metadata, or information Google publishes about the books: the authors, the publisher, the date of publication, etc. In case after dismaying case, Nunberg found that Google had critically misreported key information about the books they offer users around the world. Tom Wolfe’s novel The Bonfire of the Vanities, for example, is listed as having been published in 1888, and Raymond Chandler’s novel Killer in the Rain is listed as hitting the shelves in 1899. Over and over, Google got important details wrong, potentially misleading future researchers and bookworms around the world.

“It doesn’t matter much to the average user,” Nunberg said. “But it does matter for certain scholarly purposes. For example, if you’re trying to find all the uses of a word in a certain period. Or if you’re trying to track how the United States became a singular instead of a plural. I can’t do that now, because the dates are so fucked up.”

And because Google has an effective monopoly on the world’s only digital archive, Nunberg added, researchers will come to depend on it, erroneously assuming that Google’s got the details right. “Of course people would use it instead of their local library,” he said. “Who wouldn’t? I use it all the time.” The consequences for literary and historical research could be problematic, to say the least.

As the guffaws subsided from Nunberg’s presentation, chagrined Google engineers promised to clean up their act.


This was just one of the concessions Google offered in August to mollify its critics. The company promised not to display or sell digital books that may still be in print overseas, and beefed up its privacy policies, enabling users to hide what books they bought from Google’s bots.

But it was too little, too late, as opposition mounted around the world. The European Commission began scrutinizing the deal in early September, and attorneys general from numerous states began to cast a critical eye on the settlement. The US Register of Copyrights told a congressional committee that the settlement usurped federal authority.

And on September 18, the Justice Department finally weighed in and effectively killed the deal. The royalty-sharing agreement with the plaintiffs amounted to price-fixing, Justice officials declared. Big publishing houses could use the Book Rights Registry to make the price of competing orphan books too high to be commercially viable. Most importantly, they declared that the class-action nature of the lawsuit gave Google an illegal monopoly.

“This de facto exclusivity … appears to create a dangerous probability that only Google would have the ability to market to libraries and institutions a comprehensive digital book subscription,” Justice officials wrote. “The seller of an incomplete database — i.e., one that does not include the millions of orphan works — cannot compete effectively with the seller of a comprehensive product. Foreclosure of newcomers is precisely the kind of competitive effect the Sherman Act is designed to address.” The deal, in other words, was dead in the water.

Now, Google, the Authors Guild, and the Association of American Publishers have until November 9 to revamp the entire settlement and appear before federal judge Denny Chin. But even if they manage to allay the Justice Department’s objections, it’s not at all clear that Brantley and his allies will stop their campaign. Indeed, Brantley suggests that he won’t be satisfied until the archive itself is taken out of private hands altogether.

For example, even if Google were forced to let other archiving entities be covered by the royalty agreement, Brantley argues that it would still have an illegal monopoly on digital books, because no one else will be able to do what the company has already done. “It’s going to be very difficult for us to go back to the libraries and say, ‘We’d like to copy all of your books,’” he said. “We just don’t think it’s going to be easy for anyone to create a comparable database.”

As long as Google owns the world’s only digital books archive, Brantley says, it will still have an illegal monopoly. He won’t come right out and say it, but he is effectively suggesting that the only way to preserve what many see as a public trust is to nationalize Google’s archive. “Let’s say there was one national database of scanned books, and it was held under the admin of the Library of Congress,” he said. “And under that, it would be possible for private parties and nonprofits to scan books and make a copy of them for certain limited purposes.” Even then, he conceded, a universe of copyright and intellectual property issues would have to be resolved.

So pity poor Google, if you will. Company executives may have thought they were offering a universal public good: making the world’s law, literature, history, and science instantly available for free. And they spent a fortune tediously turning page after page of millions of books to make this happen. Instead, they opened a can of worms they evidently never anticipated, intellectual property problems that no one ever thought they would have to address. Google’s brand was tarnished with suggestions that the company sought an iron grip on human knowledge, when all it may have wanted to do was make search better and sell a lot of ads on a computer screen.

“We’re talking about this volume of work that is not commercially lucrative, but that it’s very important that people have access to,” said spokeswoman Jennie Johnson when discussing the orphan books and Google’s dream. “We just haven’t seen this access to knowledge.”

Now, thanks to the demands of democracy, private property, and free inquiry, that dream is being picked over by lawyers huddling over a conference table. “We’re considering the points raised by the Department of Justice,” Johnson sighed, “and are considering amending the settlement.”
