Sunday, July 31, 2011

User-friendliness, a lesson

I was looking for Melvil Dewey's first published version of his classification system. My first instinct was to head to Google Book Search but I decided instead to use HathiTrust as a kind of gesture to non-commercial access. I did find what I was looking for, his 1876 pamphlet, opened it up in their reader and looked through it. I knew I'd want a copy, so I found the "download as PDF" link. That popped up a box telling me to "Login to determine whether you can download this book." The copyright is listed as "public domain in the United States." I don't see why I need to log in, and I downloaded it from GBS instead, without logging in but adding to the slime trail of my life that Google owns. The added step of logging in (to be started by creating yet another login on a system I will use only occasionally), for all that it may be no more or even less invasive of my privacy, is not user-friendly. It also didn't make sense to me at the time, and I was given nothing to convince me that logging in was beneficial ... to me.

Yes, it's all about ME, me the user, me the person at the other end of the connection. I'm also not just any user, I am an advocate of libraries, a librarian, and I made the effort to go to HathiTrust -- a site that has not shown up for me in search engines.

This seems to be such a basic lesson that I do not understand why libraries can't learn it. User-friendliness.


Ooof! It just gets worse. I decided to see what login is about. To get to login, you have to search, select a book, and click on login. On the book page, you may see that a book is "Public Domain" or it may say "Public Domain, Google-digitized". When you log in, you log in either as someone from a member institution or as a guest. The guest login form states:
Does NOT provide access to full PDF downloads of public domain & open access items where not publicly available
However, it turns out that it DOES provide access to PD books (see comment by anonymous) if the book is not digitized by Google -- but that isn't what you've been told. "... not publicly available" isn't what you see on the book page, you see "Google-digitized." The page on policies has two different categories, "Open Access" and "Open Access, Google-digitized." Nothing in the definitions of those categories mentions member and guest downloading.

Basically, HathiTrust turns out to be a tiered system with member and non-member access. You don't encounter this until you try to download something that is PD but not "publicly available." Nothing on the home page mentions that this is a member-based service, therefore you don't know that as a non-member you will encounter walls.

OK, it is resolved: from now on I will always go first to the Open Library, a site where Open means what I think it should.

Monday, July 25, 2011

RDA in XML - why not give it a shot?

Example of RDA in XML / Example2 of RDA in XML

There's a lot of talk about what we will do with RDA as data - what format we will use, how it will look to users, etc etc etc. In fact, the options are legion. The key point is that we don't have to decide on just ONE WAY to carry and store RDA data elements, as long as we follow a few rules.

As an experiment, I have coded a very simple bibliographic record using two different possible ways to encode RDA in XML. For the XML data elements I use the RDA elements from the Open Metadata Registry. These elements are defined in OWL, and therefore are compatible with semantic web applications. Their use in XML (and by that I mean non-RDF XML) may be a bit questionable, yet at the same time XML may be a good transition format from our current data to a full RDF-based implementation. I created two XML files: one in which I used text values, much as one would in MARC, and one in which I used URIs for values that have been encoded as vocabularies. Neither has a schema because creating a schema for RDA is a huge undertaking. If there is interest in this method, however, it might be worth... undertaking.

The resulting files don't fit well in a blog post, so I created a page with a side-by-side comparison. Please have a look. Feel free to comment or send me suggestions, corrections, or other ideas on how to do this better.
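For readers who want a feel for the difference between the two approaches without opening the comparison page, here is a minimal sketch in Python. It is not a copy of my example files: the namespaces are placeholders rather than the real Open Metadata Registry URIs, the element names titleProper and carrierType are used only for illustration, and the "resource" attribute that carries the URI is an assumption of this sketch, not an established convention.

```python
import xml.etree.ElementTree as ET

# Placeholder namespaces (assumptions for this sketch, NOT the real registry URIs).
RDA_NS = "http://example.org/rda/elements/"
VOCAB_NS = "http://example.org/rda/vocab/"

ET.register_namespace("rda", RDA_NS)


def build_record(title, carrier_type, use_uris=False):
    """Build a tiny record with a title and a carrier type.

    With use_uris=False the carrier type is plain text, much as in MARC.
    With use_uris=True it is a URI pointing at a vocabulary term, carried
    in a hypothetical "resource" attribute.
    """
    record = ET.Element("record")

    # Free-text elements such as titles stay as text in both versions.
    title_el = ET.SubElement(record, f"{{{RDA_NS}}}titleProper")
    title_el.text = title

    carrier_el = ET.SubElement(record, f"{{{RDA_NS}}}carrierType")
    if use_uris:
        carrier_el.set("resource", VOCAB_NS + "carrierType/" + carrier_type)
    else:
        carrier_el.text = carrier_type
    return record


text_record = build_record("An Example Title", "volume")
uri_record = build_record("An Example Title", "volume", use_uris=True)

print(ET.tostring(text_record, encoding="unicode"))
print(ET.tostring(uri_record, encoding="unicode"))
```

The only difference between the two outputs is in the carrierType element: one holds the word "volume" as text, the other holds a URI that identifies the term. The title is a free-text value in both.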

Wednesday, July 20, 2011

Unequal Access

With the recent indictment of an advocate for open information access who had set up a way to download about 4 million JSTOR articles, presumably with the intent to liberate them from their native closed access, we need to step back and look at how unequal information access is in this world. In major universities in the US, academics and students log on to their computers in their offices or at home and a whole world opens up to them. That's not some kind of accident. The prime goal of university libraries is to make good on "seek and ye shall find." The proof of the success of these libraries is that researchers are oblivious to the complexity of the system that serves them. I would guess that many members of the US university community have no idea how their access to journals is managed and controlled. They don't see the contract negotiations with information providers, the continual development of software that makes single-point searching possible, the multi-faceted delivery systems that blend (or attempt to) digital and paper resources into a single stream. And they don't think about how different it would be if they weren't members of that privileged community.

Contrast that with the access available to a member of the US public who is not part of this academic sector. Like myself. Like the majority of people in this country. There is no access to JSTOR. No OpenURL server gives me multiple access options. The local public library does have some electronic materials, but these are much less extensive (and less expensive) than the ones in academic libraries. I may have to wait weeks to get a book that isn't in my local library's collection, if I can get it at all. I am often in the embarrassing position of not being able to access articles that I would like to read or quote from, including ones that I myself have authored.

In spite of this, I know that my information access, as a mere member of the US public, is far superior to that found in other countries; countries where serious researchers struggle to participate in research because they do not have the access that many academics here take for granted. Two anecdotes:

-- When I lived in Italy in the 1970s my friends were mainly college students or recent graduates. University education was free, but it was generally accepted that the only way to complete one's final thesis was to be able to afford to go abroad for two or three months. The purpose of this trip was to spend time in a country with a good library system, since libraries in Italy were limited. This was not just for students studying foreign literatures, but even those studying sciences, history, and art. These kids were essentially "library tourists." I don't know if this continues today.

-- During the time I worked at UC I was in a conversation with someone involved in the licensing of databases. For some reason we got to talking about enforcement of contractual clauses having to do with excessive downloading and/or piracy. This person told me that all access for one of the UC campuses had recently been cut off for a few days because it was discovered that someone was systematically downloading entire journal runs. When they found the person responsible, it turned out to be a foreign graduate student who would soon be returning home. Knowing that leaving the UC system would mean losing access to the journals he would need to continue his research, he was making himself a copy to take home.

It occurs to me as I write this that the "Digital Public Library of America" could create an information revolution in this country by upgrading the access of the general public to that of an academic or student in a large college or university, without ever digitizing a single page. What makes Stanford "Stanford" or Harvard "Harvard" is not just its famed faculty but the full range of information that is shared by that community. Everything they do, every bit of research, every new idea, is facilitated by the library and its services.

The information access gap between a university researcher and the average person on the street is immense. We have an information elite that, like most elites, considers its position to be earned, just, and reasonable. Few in academia worry that the access they have isn't widely shared. If they did, they would hopefully decide that something should be done.