Open Libraries “… are signs of life and hope: They are the cornerstone of democracy”

Posts from March 2007

Posted
19 March 2007 @ 1pm

Tagged
Quotes

Kafka

“There are two cardinal human vices, from which all the others derive their being: impatience and carelessness. Impatience got people evicted from Paradise; carelessness kept them from making their way back there. Or perhaps there is one cardinal vice: impatience. Impatience got people evicted, and impatience kept them from making their way back.” –Franz Kafka, The Zurau Aphorisms



Life Archive

Many libraries, including the Greenwich Public Library (CT), have oral history collections of residents, famous and not so famous. But as the US population ages, people are starting to wonder whether what they’re creating online will survive them.

Libraries have always kept some kind of vertical file for local residents. The DeKalb Public Library (IL) has a file on author Richard Powers, which recently proved valuable when The Echo Maker won the National Book Award.

Perhaps it’s time for libraries to run their own blog aggregators, so that the next Richard Powers’ juvenilia can be preserved for posterity. Open source aggregators exist, from Gregarius (PHP) to Planet (Python) to Plagger (Perl).
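To make the aggregator idea concrete, here is a minimal sketch of the archiving step in Python, using only the standard library. It is not how Gregarius, Planet, or Plagger work internally; the function name `archive_feed`, the in-memory `archive` dict, and the sample feed are all assumptions made up for illustration. A real service would fetch feeds over the network and write to durable storage.

```python
import xml.etree.ElementTree as ET


def archive_feed(rss_xml, archive):
    """Copy each RSS <item> into `archive`, keyed by guid (or link).

    Returns the number of items that were not already archived.
    `archive` is a plain dict here; a real archive would be durable storage.
    """
    root = ET.fromstring(rss_xml)
    added = 0
    for item in root.iter("item"):
        key = item.findtext("guid") or item.findtext("link")
        if key is None or key in archive:
            continue  # nothing to key on, or already preserved
        archive[key] = {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        }
        added += 1
    return added


# Hypothetical two-post feed, inlined so the sketch runs without a network.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example weblog</title>
<item><title>First post</title><link>http://example.org/1</link><description>Hello</description></item>
<item><title>Second post</title><link>http://example.org/2</link><description>World</description></item>
</channel></rss>"""

archive = {}
new_items = archive_feed(SAMPLE_FEED, archive)
```

Running the same feed through `archive_feed` a second time adds nothing, which is the property an aggregator polling a feed every hour would depend on.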

Dave Winer, popular for an early vision of weblogs, RSS, and podcasting, among other things, wrote in a post entitled Future Archives, “When a scholar dies, he or she leaves behind a life of work, papers, unfinished manuscripts, notebooks, pictures, recordings, and nowadays computers, disks and websites. Their family and university generally don’t know what to do with them, often the problem is given to the libraries.” Winer went on to say, “Our thought is to try to anticipate the problem, while the scholar is alive, and now that our work is largely electronic, to have it future-safe at all times, leave no work for the librarian, let the families and colleagues deal with the death of a relative and colleague at a personal level, and not as a professional problem.”

Amazon’s Simple Storage Service (S3), which offers metered storage on its servers, has been discussed as one possible solution. Other internet service providers have seen this need, and offer their own solutions. Joyent has Strongspace, which promises to give “a secure place to gather, backup and share any type of file.” Dreamhost has Files Forever, which promises to “keep uploaded files private [to use] as a permanent archive.”

Solution within reach
Jon Udell, Microsoft technology evangelist and pioneer of LibraryLookup, has been thinking along the same lines, writing, “I have ventured into this confusing landscape because I think that the issues that libraries and academic publishers are wrestling with — persistent long-term storage, permanent URLs, reliable citation indexing and analysis — are ones that will matter to many businesses and individuals. As we project our corporate, professional, and personal identities onto the web, we’ll start to see that the long-term stability of those projections is valuable and worth paying for.”

In Udell’s podcast with Dan Chudnov, librarian and technologist, the two discuss possible alternatives. Chudnov went on to post a vision, drawn from a 2004 conference discussion, of what a library project dedicated to archiving weblogs would look like (see below); it has since been updated to include Atom. This service, which mirrors the journal archive service LOCKSS (Lots of Copies Keep Stuff Safe), holds promise for keeping electronic content from falling into a digital black hole.

Weblog mirroring system diagram, originally uploaded by dchud.
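The "lots of copies" idea can be illustrated with a small Python sketch: several mirrors hold the same post, and a majority vote over content digests flags any copy that has drifted. This is a deliberate simplification and not the actual LOCKSS protocol or Chudnov's design; the names `digest` and `find_damaged` and the mirror labels are assumptions for illustration only.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return a SHA-256 hex digest identifying this exact content."""
    return hashlib.sha256(content).hexdigest()


def find_damaged(copies):
    """Compare copies of one item held by several mirrors.

    `copies` maps a mirror name to the bytes it holds. Returns the digest
    held by the most mirrors and a sorted list of mirrors that disagree.
    """
    digests = {name: digest(data) for name, data in copies.items()}
    majority, _ = Counter(digests.values()).most_common(1)[0]
    damaged = sorted(name for name, d in digests.items() if d != majority)
    return majority, damaged


# Hypothetical mirrors; "library-c" holds a corrupted copy.
copies = {
    "library-a": b"My first weblog post",
    "library-b": b"My first weblog post",
    "library-c": b"My f1rst weblog post",
}
majority, damaged = find_damaged(copies)
```

A mirror listed in `damaged` could then fetch a fresh copy from one of the mirrors that agree with the majority.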



code4lib 2007

Working Code Wins
Responding to increasing consolidation in the ILS market, library developers demonstrated alternatives and supplements to library software at the second annual code4lib conference in Athens, GA, February 27-March 2, 2007. With 140 registered attendees from many states and several countries, including Canada and the United Kingdom, the conference was a hot destination for a previously isolated group of developers.

Network connectivity was a challenge for the Georgia Center for Continuing Education, but the hyperconnected group kept things interesting, and the attendees, coordinated by Roy Tennant, artfully architected workarounds and improvements as the conference progressed.

In a nice mixture of emerging conference trends, code4lib combined the flexibility of the unconference with 20-minute prepared talks, keynotes, five-minute Lightning Talks, and breakout sessions. The form was derived from Access, the Canadian library conference.

Keynotes
The conference opened with a talk from Karen Schneider, associate director for technology and research at Florida State University. She challenged the attendees to sell open source software to directors in terms of the solutions it provides, since the larger issue in libraries is saving digital information. From the stage, Schneider also debated the importance of open source software with Ben Ostrowsky, systems librarian at the Tampa Bay Library Consortium, who responded, “Isn’t that Firefox [a popular open source browser] you’re using there?”

Erik Hatcher, author of Lucene in Action, gave a keynote about using the full-text search server Apache Solr, the open-source search engine Lucene, and the faceted browser Flare to construct a new front-end to library catalog data. The previous day, Hatcher led a free preconference for 80 librarians who brought exported MARC records, including attendees from Villanova University and the University of Virginia.

Buzz
One of the best-received talks revolved around BibApp, an “institutional bibliography” written in Ruby on Rails by Nate Vack and Eric Larson, two librarians at the University of Wisconsin-Madison. The prototype application is available for download, but currently relies on citation data from engineering databases to construct a profile of popular journals, publishers, citation types, and who researchers are publishing with. “This is copywrong, which is sometimes what you have to do to construct digital library projects. Then you get money to license it,” Larson said.

More controversially, Luis Salazar gave a talk about using Linux to power public computing in the Howard County (MD) public library system. A former NSA systems administrator, he presented the pros and cons of supporting 300 staff and 400 public access computers using Groovix, a customized Linux distribution. Since the abundant number of computers serves the public without needing sign-up sheets, “patrons are able to sit down and do what they want.”

Salazar created a script for monitoring all the public computers, and described how he engaged in a dialog with a patron he dubbed “Hacker Jon,” who used the library computers to develop his nascent scripting skills. Bess Sadler, librarian and metadata services specialist at the University of Virginia, asked about the privacy implications of monitoring patrons. “Do you have a click-through agreement? Privacy Policy?” she asked. Salazar joked that “It’s Maryland, we’re like a communist country” and said he wouldn’t do anything in a public library that he wouldn’t expect to be monitored.

Casey Durfee presented a talk on “Endeca in 250 lines of code or less,” which showed a prototype of faceted searching at the Seattle Public Library. The new catalog front-end sits on top of a Horizon catalog, and uses Python and Solr to present results in an elegant display, from a Google-inspired single search box start to rich subject browse options.
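For readers unfamiliar with faceted searching, the core idea fits in a few lines of Python. The sketch below is not Durfee's code and does not use Solr; it counts facet values over a small in-memory list of hypothetical catalog records, and the names `facet_counts` and `narrow` are assumptions for illustration.

```python
from collections import Counter


def facet_counts(records, fields):
    """Count how many records carry each value of each facet field."""
    counts = {field: Counter() for field in fields}
    for record in records:
        for field in fields:
            for value in record.get(field, []):
                counts[field][value] += 1
    return counts


def narrow(records, field, value):
    """Keep only the records whose facet `field` includes `value`."""
    return [r for r in records if value in r.get(field, [])]


# Hypothetical catalog records; each facet field holds a list of values.
records = [
    {"title": "The Echo Maker", "format": ["Book"], "subject": ["Fiction"]},
    {"title": "Lucene in Action", "format": ["Book"], "subject": ["Computers"]},
    {"title": "Python tutorial", "format": ["DVD"], "subject": ["Computers"]},
]

counts = facet_counts(records, ["format", "subject"])
books = narrow(records, "format", "Book")
```

Here `counts["format"]` tallies two books and one DVD, which is the kind of number a catalog shows beside each browse link; clicking a link corresponds to calling `narrow` and recounting over the smaller result set. A search server such as Solr computes these counts itself over a full index.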

The future
This year’s sponsors included Talis, LibLime, OCLC, Logical Choice Technologies, and Oregon State University. OSU awarded two scholarships to Nicole Engard, Jenkins Law Library (2007 LJ Mover and Shaker), and Joshua Gomez, Getty Research Institute.

Next year’s conference will be held in Portland, OR.