Monday, November 28, 2016

All the Books

I just joined the Book of the Month Club. This is a throwback to my childhood, because my parents were members when I was young, and I still have some of the books they received through the club. I joined because my reading habits are narrowing, and I need someone to recommend books to me. And that brings me to "All the Books."

"All the Books" is a writing project I've had on my computer and in notes ever since Google announced that it was digitizing all the books in the world. (It did not do this.) The project was lauded in an article by Kevin Kelly in the New York Times Magazine of May 14, 2006, which he prefaced with:

"What will happen to books? Reader, take heart! Publisher, be very, very afraid. Internet search engines will set them free. A manifesto."

There are a number of things to say about All the Books. First, one would need to define "All" and "Books". (We can probably take "the" as it is.) The Google scanning projects defined this as "all the bound volumes on the shelves of certain libraries, unless they had physical problems that prevented scanning." This of course defines neither "All" nor "Books".

Next, one would need to gather the use cases for this digital corpus. Through the HathiTrust project we know that a small number of scholars are using the digital files for research into language usage over time. Others are using the files to search for specific words or names, discovering new sources of information about possibly obscure topics. As far as I can tell, no one is using these files to read books. The Open Library, on the other hand, is lending digitized books as ebooks for reading. This brings us to the statement that was made by a Questia salesperson many years ago, when there were no ebooks and screens were those flickery CRTs: "Our books are for research, not reading." Given that their audience was undergraduate students trying to finish a paper by 9:30 the next morning, this was an actual use case with actual users. But the fact that one does research in texts one does not read is, of course, not ideal from a knowledge acquisition point of view.

My biggest beef with "All the Books" is that it treats them as an undifferentiated mass, as if all the books are equal. I always come back to the fact that if you read one book every week for 60 years (which is a good pace) you will have read 3,120. Up that to two books a week and you've covered 6,240 of the estimated 200-300 million books represented in WorldCat. The problem isn't that we don't have enough books to read; the problem is finding the 3,000 to 6,000 books that will give us the knowledge we need to face life, and be a source of pleasure while we do so. "All the Books" ignores the heights of knowledge, of culture, and of art that can be found in some of the books. As with Sarah Palin's response to the question "Which newspapers form your world view?", "all of them" is inherently an anti-intellectual answer, given either by someone who doesn't read any of them or by someone who isn't able to distinguish among them.

"All the Books" is a complex concept. It includes religious identity; the effect of printing on book dissemination; the loss of Latin as a universal language for scholars; and the rise of non-textual media. I hope to hunker down and write this piece, but meanwhile, this is a taste.