Posted in June 2006

Open Archives Initiative

Following the success of Open URL, the Open Archives Initiative has been one of the most promising developments in the digital library world. Tools like OAIster (pronounced oyster), the National Science Digital Library, and the IMLS Digital Collections Registry show there has been a dramatic uptake in the number of libraries and tools that have implemented it.

This relatively lightweight protocol was designed to make sharing of metadata as simple as RSS aggregation. As the number of adopters has risen, the aggregators have seen a few XML-related snags.

In short, metadata is user input. First law of programming: Never trust user input.
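To make that concrete, here is a minimal sketch of how a harvester might treat incoming records as untrusted input. It only checks that the XML is well-formed, which is a weaker test than validating against a schema, and the function name and sample records are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace, in ElementTree's {uri}tag notation.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def extract_titles(xml_text):
    """Return the non-empty dc:title values, or None if the XML is not well-formed.

    Hypothetical helper: a real harvester would also log the failure and
    probably validate against the relevant schema.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Unclosed tags, unescaped ampersands, unbound prefixes, and so on.
        return None
    titles = []
    for element in root.iter(DC_NS + "title"):
        text = (element.text or "").strip()
        if text:  # skip empty title elements rather than trusting them
            titles.append(text)
    return titles


# Sample records (invented for this sketch).
good = (
    '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title> An Example Title </dc:title><dc:title></dc:title>"
    "</oai_dc:dc>"
)
bad = "<dc:title>Unclosed & unescaped"

print(extract_titles(good))
print(extract_titles(bad))
```

The point of returning None instead of raising is that one broken record should not stop the rest of a harvest.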

Many papers at library conferences are designed to showcase a particular implementation that went better than expected. That’s great–it’s always good to see libraries succeeding. However, it takes much more courage to share lessons learned, so that pitfalls can be avoided.

The winning paper at JCDL 2006 was written by Carl Lagoze, one of the original architects of the OAI protocol. In the paper, “Metadata aggregation and ‘automated digital libraries’: A retrospective on the NSDL experience,” he shares his rude awakening that many OAI archives are stuck with XML that doesn’t validate, which makes aggregators like the NSDL subject to truckloads of autogenerated emails.

As Dorothea’s commentary put it:

“The winning non-student paper both amused and frustrated me. Carl Lagoze talked about the National Science Digital Library, and how it was believed that the Magic Metadata Fairy would use OAI-PMH to build a beautiful searchable garden of science, and how everyone ended up with an ugly, weed-choked, cracked-asphalt vacant lot instead.”

She goes on to say what few technologists want to say. People still matter.

“I’ll be blunt. The solution for NSDL’s problem is hiring cataloguers, or metadata librarians, or indexers/abstracters, or whatever you want to call ’em, to clean up the incoming garbage. Ideally, OAI-PMH would be a two-way protocol, so that nice cleaned-up metadata made its way back to the repository that had spewed the garbage in the first place. That, however (despite all the jaw-flapping about frameworks that went on during JCDL) does not seem to be in the offing. It should be.”

Catalogers still matter. Especially the new breed of catalogers.

Open World Cat

In typical OCLC style, a quiet revolution is brewing. Formerly a subscription-only database, WorldCat has begun to propagate into search engines–Google, Yahoo, and Ask in particular–and with the RLG merger, it looks like a truly spectacular interface to the union catalog could be created.

In the meantime, it’s curious that OCLC chose to use an ISBN-based permalink structure instead of OpenURL. It does showcase FRBR, but beyond that it’s not very interoperable.

The real question is, will OCLC enter the SEO (search engine optimization) business so that library results show on the first page?

Open URL

Open URL solves the appropriate copy issue, but many other questions have sprung up for library discussion.

You can learn more in Roy Tennant and Carol Tenopir’s forthcoming July columns.

  • Should Google have a list of resolvers? What about Microsoft?
  • Is it useful for OCLC to be developing a registry?
  • Why is the usability so poor? Pop-up window after pop-up window…
  • Do users want a limit to full-text programmed for them?
  • Should it be as easy as writing a weblog entry to link to library subscription resources? The inventors of COinS think so.
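For the last question, a rough sketch of what COinS asks of a weblog author: an empty HTML span with the class Z3988 whose title attribute carries an OpenURL ContextObject as URL-encoded key/value pairs. The helper name and the sample book below are assumptions for illustration, not part of any COinS tool.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(fields):
    """Build a COinS span from (key, value) pairs.

    Hypothetical helper. The ContextObject version is prepended, the pairs
    are URL-encoded, and the result is HTML-escaped for the title attribute.
    """
    pairs = [("ctx_ver", "Z39.88-2004")] + list(fields)
    kev = urlencode(pairs)
    return '<span class="Z3988" title="%s"></span>' % escape(kev, quote=True)


# Sample values, invented for this sketch.
span = coins_span([
    ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
    ("rft.btitle", "An Example Book"),
    ("rft.au", "A. Author"),
])
print(span)
```

A browser extension or resolver-aware tool can then turn that span into a link to the reader's own library resolver, which is why the span itself contains no resolver address.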

Open Source

When Linux and Apache were still familiar mainly to the sysadmin community, Dan Chudnov had the foresight to develop a site for librarians to swap tips about open source software that would be useful in libraries.

Now, with libraries contemplating the advantages of home-grown open source library catalogs anew, open source expertise has become even more valuable.

The State of Georgia has made a commitment to developing an open-source catalog, so things are starting to get interesting.

Open Search

Z39.50 has been a useful technology for searching library catalogs and individual databases for many years, but it presents certain implementation challenges–Bath or SUTRS?

The Open Search specification is interesting, since it promises much the same thing. It was initiated by a commercial entity–A9–and some libraries are starting to pay attention to it as a supplement to SR/U and Z39.50.
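Part of the appeal is how little a client has to do. A search engine publishes a URL template with placeholders such as {searchTerms}, and the client fills them in. Below is a minimal sketch of that substitution; the catalog address and the function name are hypothetical, and a real client would read the template from the site's description document.

```python
import re
from urllib.parse import quote_plus


def fill_template(template, **params):
    """Fill {name} and optional {name?} placeholders in a URL template.

    Hypothetical helper: supplied values are URL-encoded, missing optional
    parameters become empty strings, and a missing required one raises KeyError.
    """
    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in params:
            return quote_plus(str(params[name]))
        if optional:
            return ""
        raise KeyError(name)

    return re.sub(r"\{([A-Za-z:]+)(\??)\}", substitute, template)


# Made-up catalog address for illustration.
template = "http://catalog.example.org/search?q={searchTerms}&page={startPage?}"
url = fill_template(template, searchTerms="henry james")
print(url)
```

Compare that with configuring a Z39.50 client, where the attribute sets and record syntaxes have to be agreed on before a single query goes out.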

Open Content

Brewster Kahle and the Open Content Alliance are doing some interesting and credible things.

It’s especially interesting to see the open source software being made available from it, like Dojo.

Some of the scans are quite beautiful, like this Henry James book.

Open Access

It’s laudable to make the entirety of human knowledge, especially scientific, available for free. But what about that free lunch?

news @ nature.com – Open-access journal hits rocky times – Financial analysis reveals dependence on philanthropy.

The Public Library of Science (PLoS), the flagship publisher for the open-access publishing movement, faces a looming financial crisis. An analysis of the company’s accounts, obtained by Nature, shows that the company falls far short of its stated goal of quickly breaking even. In an attempt to redress its finances, PLoS will next month hike the charge for publishing in its journals from US$1,500 per article to as much as $2,500.

In the beginning, libraries were excited about the open access movement because it promised to save them money from the serials budget. However, as Phil Davis pointed out last year, libraries still face the price of print subscriptions, plus membership fees, as well as having to subsidize author submission fees. From this angle, open access looks like less of a bargain than a mechanism to subsidize research and development for new publications.

Why Open?

As I looked through my del.icio.us tags, it occurred to me that many recent library initiatives seem to center around being Open.

It could be argued that weblogs are an exercise in openness, and I’ll start by getting the themes for the week off the ground.

Open Access
Open Content
Open Search
Open Source
Open URL

After this, I’m heading down to New Orleans for ALA and will post more news from the land of the Confederacy of Dunces.

Welcome

This blog is now open for business (sorry, terrible pun, couldn’t resist).

As a new editor at LJ, I’m interested in making the content more relevant to readers, like, well, me.

As a former systems librarian, I first wanted to organize the content in a way that made sense to me. So you can browse the LJ news and Roy Tennant’s column. Ebsco subscribers can subscribe to a feed of LJ articles, too.