Personal Digital Archiving

I presented “Constructing a Digital Identity Compatible with Institutional Archives” at this excellent conference at the Internet Archive.

Personal Archiving Systems and Interfaces for Institutions. What are the experiences and design decisions of institutions that have built systems for personal digital archives?

SPARC Innovation Fair

At the podium

From a brief talk given 8 November at the SPARC 2010 Digital Repositories Forum:
Hello, I’m Jay Datema, associate director at the Bern Dibner Library, Polytechnic Institute of NYU. I’m honored to be included in this year’s Innovation Fair at the SPARC conference. I have two minutes, so I’ll keep it short.

My poster is entitled “Full Circle Research: Occam’s Razor for Collection.” As many of you know, Occam’s Razor is a principle taken from the philosopher William of Ockham, who posited that “when several theories model the available facts adequately, the simplest theory is to be preferred.” This principle dates back to the 1300s, so it’s had some time to prove itself. Institutional repositories, on the other hand, are just a decade old.

Simply stated, my poster shows that research is a process that starts with an analysis of publications, which of course will then produce more publications. As Samuel Johnson said, “The greatest part of a writer’s time is spent in reading, in order to write; a man will turn over half a library to make one book.” What is the online equivalent? I suppose it would have to be endless surfing of bibliographies, databases, and PDFs. Research only ends when your attention span falters or a deadline awaits.

ALA 2007: Online Books, Copyright, and User Preferences

Ben Bunnell, Google library partnership manager, and Cliff Guren, Microsoft director of publisher evangelism, presented their view of the future to reference publishers June 22 during ALA at the Independent Reference Publishers Group meeting.

Google moves into reference
Bunnell said it was his first time presenting to publishers instead of librarians, and he gave a brief overview of the Google Books program. It has now digitized one million of 65 million books worldwide, and has added Spanish language books to its collections via partnerships with the University of Texas at Austin and the University of Madrid. Google is finding that librarians have been using Book Search for acquisitions, which is a somewhat unexpected use.

Microsoft innovates behind
Cliff Guren said Microsoft’s goal is to turn web search into information search. “The reality is that 5 percent of the world’s information is digitized, less than 1 percent of the National Archives and less than 5 percent of the Library of Congress.”

Guren described new initiatives within Live Search, first launched in April 2006, including a partnership with Ingram to store copies of digitized texts, agreements with CrossRef, HighWire, ERIC, and JSTOR for metadata, and Books in Print data. Live Academic Search currently has 40 million articles from 30,000 journals, and includes books from “out of copyright content only.” Library partners include the University of California, the University of Toronto, Cornell University, the New York Public Library, and the British Library. Technology partners include Kirtas Technologies and the Internet Archive, recently declared a library in its own right by the State of California.

New features in Live Book Search include options for publishers to retain control, including displaying percent viewable, image blocking, pages forward and back, and a page range exclusion modifier that also shows the user the number of pages allotted. The most distinctive feature shown was a view of the book page with a highlighted snippet.

Libraries negotiate collaboratively
Mark Sandler, director of CIC library initiatives, followed the sales presentations with some “inconvenient truths.” Sandler said library print legacy collections are deteriorating, some content has been lost in research libraries, and that “users prefer electronic access.”

Stating the obvious, Sandler said “we can’t sustain hybridity,” referring to overlapping print and electronic collection building. More controversially, he made the claim that “Maybe we’re not in the book business after all.”

Sandler said books take many shapes in libraries, including ebooks, database content, audiobooks, and that pricing models have shifted to include aggregate collections and “by the drink.”

With legacy collections digitized, including the American Memory Project, Making of America, Documenting the American South, Valley of the Shadow, and Wright’s American Fiction, libraries had an early start with these types of projects. But with Google’s mission of organizing all the world’s information and making it universally accessible, Sandler claimed libraries are at the point of no return vis a vis change.

With library partnerships with not only Google and Microsoft, but also Amazon, the Million Book Project (MBP), and new royalty arrangements, Sandler said there’s a world of new work for libraries to do, including using digitized texts to make transformative works with math, chemical equations, and music to archive, integrate and aggregate content.

Millennials
Lynn Silipigni Connaway, OCLC Research, and Marie Radford, Rutgers University associate professor, described their IMLS-funded grant on millennials’ research patterns. Using a somewhat ill-conceived reproduction of a chat reference interaction gone awry, Connaway and Radford talked about “screenagers” and described user frustration with current reference tools. “Libraries need to build query share,” Connaway said. Their research intends to study non-users, as well as experiential users and learners. One of the initial issues is that, since students have been taught to guard privacy online, librarians can be viewed as “psychos and internet stalkers” when they enter online environments like Facebook and MySpace.

What’s in it for us?
Reference publishers asked Google and Microsoft representatives, “What’s in it for us to collaborate with you?”

Cliff Guren said, “If I were in your business, I would be scared–your real competition is Wikipedia.” Bunnell deflected the question, saying “librarians use Google Book Search” and advised publishers to “try a few books and see what happens.” Bunnell said he had been surprised to see thesaurus content and other reference books added by publishers, as he had thought they would be outside the scope. “Yet Merriam-Webster added their synonyms dictionary, and they seem to be pleased.”

Guren said, “We think we’re adding value for independent publishers,” but “if there are 400 reference works on the history of jazz, perhaps there will only be 5 or 10 needed in the future because of the inefficiencies of the print system.” Bunnell countered this point with an example, saying, “Cambridge University Press is using Google Book stats to determine what backlist books to bring back into print.”

John Dove, Credo CEO (formerly xRefer), spoke about the real difference between facts and knowledge, and that “facts should be open to all.” Connaway said OCLC is finding that WorldCat.org referral traffic stats show 50 percent of users come from Google Book Search, 40 percent from Libraries, and 9 percent from blogs and wikis.

Future of print?
Gale Reference said they are seeing declining profits from print reference, and asked, “What’s the life of a reference book? Does it have 5 or 10 years left?” Radford answered by saying “I think the paper reference book will be disappearing.” She said all New Jersey universities will share reference collections because of lack of space and funds. Guren was more encouraging, saying “There’s still a need for what you [reference publishers] do. Reference information is needed, though perhaps a reference book is not.”

ALA 2007: Top Tech Trends

At the ALA Top Tech Trends Panel, panelists including Marshall Breeding, Roy Tennant, Karen Coombs, and John Blyberg discussed RFID, open source adoption in libraries, and the importance of privacy.

Marshall Breeding, director for innovative technologies and research at Vanderbilt University Libraries (TN), started the Top Tech Trends panel by referencing his LJ Automation Marketplace article, “An Industry Redefined,” which predicted “unprecedented disruption” in the ILS market. Breeding said 60 percent of the libraries in one state are facing a migration due to the Sirsi/Dynix product roadmap being changed, but he said “not all ILS companies are the same.”

Breeding said open source is new to the ILS world as a product, even though it’s been used as infrastructure in libraries for many years. Interest has now expanded to the decision makers. The Evergreen PINES project in Georgia, with 55 of 58 counties participating, was “mostly successful.” With the recent decision to adopt Evergreen in British Columbia, there is movement to open source solutions, though Breeding cautioned it is “still minuscule compared to most libraries.”

Questioning the switch being compared to an avalanche, Breeding said several commercial support companies have sprung up to serve the open source ILS market, including LibLime, Equinox, and CARe Affiliates. Breeding predicted an era of “new decoupled interfaces.”

John Blyberg, head of technology and digital initiatives at Darien Public Library (CT), said the “back end [in the ILS] needs to be shored up because it has a ripple effect” on other services. Blyberg said RFID is coming, and it makes sense for use in sorting and book storage, echoing Lori Ayre’s point that libraries “need to support the distribution demands of the Long Tail.” Feeling that “privacy concerns are non-starters, because RFID is essentially a barcode,” he said the RFID information is stored in a database, which should be the focus of security concerns.

Finally, Blyberg said that vendor interoperability and a democratic approach to development are needed in the age of Innovative’s Encore and Ex Libris’ Primo, both of which can be used with different ILS systems and can decouple the public catalog from the ILS. With the eXtensible Catalog (XC) and Evergreen coming along, Blyberg said there was a need for funding and partners to further enhance their development.

Walt Crawford of OCLC/RLG said the problem with RFID is the potential of having patron barcodes chipped, which could “lead to the erosion of patron privacy.” Intruders could datamine who’s reading what, which Crawford said is a serious issue.

Joan Frye Williams countered that both Blyberg and Crawford were “insisting on using logic on what is essentially a political problem.” Breeding agreed, saying that airport security could scan chips, and “my concern is that third generation RFID chips may not be readable in 30 years, much less the hundreds of years that we expect barcodes to be around for.”

Karen Coombs, head of web services at the University of Houston (TX), listed three trends:
• The end user as content contributor, which she cautioned was an issue. “What happens if YouTube goes under and people lose their memories?” Coombs pointed to the project with the National Library of Australia and its partnership with Flickr as a positive development.
• Digital as format of choice for users, pointing out iTunes for music and Joost for video. Coombs said there is currently “no way for libraries to provide this to users, especially in public libraries.” Though companies like Overdrive and Recorded Books exist to serve this need, perhaps her point was that the consumer adoption has superseded current library demand.
• A blurred line between desktop and web applications, which Coombs demonstrated with YouTube remixer and Google Gears, “which lets you read your feeds when you’re offline.”

John Blyberg responded to these trends, saying that he sees academic libraries pursuing semantic web technologies, including developing ontologies. Coombs disagreed with this assessment, saying that “libraries have lots of badly-tagged HTML pages.” Roy Tennant agreed, “If the semantic web arrives, buy yourself some ice skates, because hell will have frozen over.”

Breeding said that he longs for “SOA [services-oriented architecture] but I’m not holding my breath.” And Walt Crawford said, “Roy is right—most content providers don’t provide enough detail, and they make easy things complicated and don’t tackle the hard things.” Coombs pointed out, “People are too concerned with what things look like,” but Crawford interjected, “not too concerned.”

Roy Tennant, OCLC senior program manager, listed his trends:
• Demise of the catalog, which should push the OPAC into the back room where it belongs and elevate discovery tools like Primo and Encore, as well as OCLC WorldCat Local.
• Software as a Service (SaaS), formerly known as ASP and hosted services, which means librarians “don’t have to babysit machines, and is a great thing for lots of librarians.”
• Intense marketplace uncertainty due to the private equity buyouts of Ex Libris and SirsiDynix and the rise of Evergreen and Koha as looming options. Tennant also said he sees “WorldCat Local as a disruptive influence.” Aside from the ILS, the abstract and indexing (A&I) services are being disintermediated as Google and OCLC are going direct to publishers to license content.
Someone asked if libraries should get rid of local catalogs, and Tennant said “only when it fits local needs.”

Walt Crawford said:
• Privacy still matters. Crawford questioned if patrons really wanted libraries to turn into Amazon in an era of government data mining and inferences which could track a ten-year patron borrowing pattern.
• The slow library movement, which argues that locality is vital to libraries, mindfulness matters, and open source software should be used “where it works.”
• The role of the public library as publisher. Crawford pointed out libraries in Charlotte-Mecklenburg County, libraries in Vermont that Jessamyn West works with, and Wyoming as farther along this path, and said the “tools are good enough that it’s becoming practical.”

Blyberg said that systems “need to be more open to the data that we put in there.” Williams said that content must be “disaggregatable and remixable,” and Coombs pointed out the current difficulty of swapping out ILS modules, and said ERM was a huge issue. Tennant referenced the Talis platform, and said one of Evergreen’s innovations is its use of the XMPP (Jabber) protocol, which is “easier than SOAP web services, which are too heavyweight.”

Marshall Breeding responded to a question asking if MARC was dead, saying “I’m married to a cataloger, but we do need things in addition to MARC, which is good for books, like Dublin Core and ONIX.” Coombs pointed out that MARCXML is a mess because it’s retrofitted and doesn’t leverage the power of XML. Crawford said, “I like to give Roy [Tennant] a hard time about his phrase ‘MARC is dead,’ and for a dying format, the Moen panel was full at 8 a.m.”

Questioners asked what happens when “the one server” goes down, and Blyberg responded, “What if your T-1 line goes down?” Joan Frye Williams exhorted the audience to “examine your consciences when you ask vendors how to spend their time.” Coombs agreed, saying that her experience on user groups had exposed her to “crazy competing needs that vendors are faced with—[they] are spread way too thin.” Williams said there are natural transition points and she spoke darkly of a “pyramid scheme” and that you “get the vendors you deserve.” Coombs agreed, saying, “Feature creep and managing expectations is a fiercely difficult job, and open source developers and support staff are different people.”

Joan Frye Williams, information technology consultant, listed:
• New menu of end-user focused technologies. Williams said she worked in libraries when the typewriter was replaced by an OCLC machine, and libraries are still not using technology strategically. “Technology is not a checklist,” Williams chided, saying that the 23 Things movement of teaching new skills to library staff was insufficient.
• Ability for libraries to assume development responsibility in concert with end-users
• Have to make things more convenient, adopting artificial intelligence (AI) principles of self-organizing systems. Williams said, “If computers can learn from their mistakes, why can’t we?”

Someone asked why libraries are still using the ILS. Coombs said it’s a financial issue, and Breeding responded sharply, saying, “How can we not automate our libraries?” Walt Crawford agreed, saying, “Are we going to return to index cards?”
When the panel was asked if library home pages would disappear, Crawford and Blyberg both said they would be surprised. Williams said “the product of the [library] website is the user experience.” She said Yorba Linda Public Library (CA) is enhancing their site with a live book feed that updates “as books are checked in, a feed scrolls on the site.”

And another audience member asked why the panel didn’t cover toys and protocols. Crawford said “outcomes matter,” and Coombs agreed, saying “I’m a toy geek but it’s the user that matters.” Many participants talked about their use of Twitter, and Coombs said portable applications on a USB drive have the potential to change public computing in libraries. Tennant recommended viewing the Photosynth demo, first shown at the TED conference.
Finally, when asked how to keep up with trends, especially for new systems librarians, Coombs said, “It depends what kind of library you’re working in. Find a network—ask questions on the code4lib [IRC] channel.”

Blyberg said that constructing a “well-rounded blogroll” that includes sites from the humanities, sciences, and library and information science “will help you be a well-rounded feed reader.” Tennant recommended a “gasp—dead tree magazine, Business 2.0”; Coombs said the Gartner website has good information about technology adoptions; and Williams recommended trendwatch.com.

Links to other trends:
Karen Coombs’ Top Technology Trends
Meredith Farkas’ Top Technology Trends
3 Trends and a Baby (Jeremy Frumkin)
Some Trends from the LiB (Sarah Houghton-Jan)
“Sum” Top Tech Trends for the Summer of 2007 (Eric Lease Morgan)

And other writeups and podcasts:
Rob Styles
Ellen Ward
Chad Haefele

Presenting at ALA panel on Future of Information Retrieval

The Future of Information Retrieval

Ron Miller, Director of Product Management, HW Wilson, hosts a panel of industry leaders including:
Mike Buschman, Program Manager, Windows Live Academic, Microsoft.
R. David Lankes, PhD, Director of the Information Institute of Syracuse, and Associate Professor, School of Information Studies, Syracuse University.
Marydee Ojala, Editor, ONLINE, and contributing feature and news writer to Information Today, Searcher, EContent, Computers in Libraries, among other publications.
Jay Datema, Technology Editor, Library Journal

Add to calendar:
Monday, 25 June 2007
8-10 a.m., Room 103b
Preliminary slides and audio attached.

IDPF: Google and Harvard

Libraries And Publishers
At the 2007 International Digital Publishing Forum (IDPF) in New York on May 9, publishers and vendors discussed the future of ebooks in an age increasingly dominated by large-scale digitization projects funded by the deep pockets of Google and Microsoft.

In a departure from the other panels, which discussed digital warehouses and repositories, both planned and in production from Random House and HarperCollins, Peter Brantley, executive director of the Digital Library Federation and Dale Flecker of Harvard University Library made a passionate case for libraries in an era of information as a commodity.

Brantley began by mentioning the Library Project on Flickr, and led with a slightly ominous series of slides:
“Libraries buy books (for a while longer),” followed by “Libraries don’t always own what’s in the book, just the book (the ‘thing’ of the book).”



He then reiterated the classic rights that libraries protect: The Right to Borrow, Right to Browse, Right to Privacy, and Right to Learn, and warned that “some people may become disenfranchised in the digital world, when access to the network becomes cheaper than physical things.” Given the presentation that followed from Tom Turvey, director of the Google Book Search project, this made sense.

Brantley made two additional points, saying “Libraries must permanently hold the wealth of our many cultures to preserve fundamental Rights, and Access to books must be either free or low-cost for the world’s poor.”

 He departed from conventional thinking on access, though, when he argued that this low-cost access didn’t need to include fiction. Traditionally, libraries began as subscription libraries for those who couldn’t afford to purchase fiction in drugstores and other commercial venues.

Finally, Brantley said that books will become communities as they are integrated, multiplied, fragmented, collaborative, and shared, and publishing itself will be reinvented. Yet his conclusion contained an air of inevitability, as he said, “Libraries and publishers can change the world, or it will be transformed anyway.”



A podcast recording of his talk is available on his site.

Google Drops A Bomb
Google presented a plan to entice publishers to buy into two upcoming models for making money from Google Book Search, including a weekly rental “that resembles a library loan” and a purchase option, “much like a bookstore,” said Tom Turvey, director of Google Book Search Partnerships.

 The personal library would allow search across the books, expiration and rental, and copy and paste. No pricing was announced. Google has been previewing the program at events including the London Book Fair.

Turvey said Google Book Search is live in 70 countries and eight languages. Ten years ago, zero percent of consumers clicked before buying books online, and now $4 billion of books are purchased online. “We think that’s a market,” Turvey said, “and we think of ourselves as the switchboard.”

Turvey, who previously worked at bn.com and ebrary, said publishers receive the majority of the revenue share as well as free marketing tools, site-brandable search inside a book with restricted buy links, and fetch and push statistical reporting.

He said an iTunes for Books was unlikely, since books don’t have one device, model or user experience that works across all categories. Different verticals like fiction, reference, and science, technology and medicine (STM), require a different user experience, Turvey said.

Publishers including SparkNotes requested a way to make money from enabling a full view of their content on Google Books, as did many travel publishers. Most other books are limited to 20 percent visibility, although Turvey said there is a direct correlation between the number of pages viewed and subsequent purchases.

This program raises significant privacy questions. If Google has records that can be correlated with all the other information it stores, this is the polar opposite of what librarians have espoused about intellectual freedom and the privacy of circulation records. Additionally, the quality control questions are significant and growing, voiced by historian Robert Townsend and others.

Libraries are a large market segment for publishers. It seems reasonable for them to voice concerns about this proposal at this stage, especially those libraries that haven’t already been bought and sold.

 Others at the forum were skeptical. Jim Kennedy, vice president and director at the Associated Press, said, “The Google guy’s story is always the same: Send us your content and we’ll monetize it.”

Ebooks Ejournals And Libraries
Dale Flecker of the Harvard University Library gave a historical overview of the challenges libraries have grappled with in the era of digital information.



Instead of talking about ebooks, which he said represent only two percent of usage at Harvard, Flecker described eight challenges about ejournals, which are now “core to what libraries do” and have been in existence for 15-20 years. Library consultant October Ivins challenged this statistic about ebook usage as irrelevant, saying “Harvard isn’t typical.” She said there were 20 ebook platforms demonstrated at the 2006 Charleston Conference, though discovery is still an issue.

First, licensing is a big deal. There were several early questions: Who is a user? What can they do? Who polices behavior? What about guaranteed performance and license lapses? Flecker said that in an interesting shift, there is a move away from licenses to “shared understandings,” where content is acquired via purchase orders.



Second, archiving is a difficult issue. Harvard began in 1636, and has especially rich 18th-century print collections, so it has been aware that “libraries buy for the ages.” The sticky issues come with remote and perpetual access, and what happens when a publisher ceases publishing.

Flecker didn’t mention library projects like LOCKSS or Portico in his presentation, though they do exist to answer those needs. He did say that “DRM is a bad actor” and it’s technically challenging to archive digital content. Though there have been various initiatives from libraries, publishers, and third parties, he said “Publishers have backed out,” and there are open questions about rights, responsibilities, and who pays for what. In the question and answer period that followed, Flecker said Harvard “gives lots of money” to Portico.



Third, aggregation is common. Most ejournal content is licensed in bundles, and consortia and buying clubs are common. Aggregated platforms provide useful search options and intercontent functionality.

Fourth, statistics matter, since they show utility and value for money spent. Though the COUNTER standard is well-defined and SUSHI gives a protocol for exchange of multiple stats, everyone counts differently.

Fifth, discovery is critical. Publishers have learned that making content discoverable increases use and value. At first, metadata was perceived to be intellectual property (as it still is, apparently), but then there was a grudging acceptance and finally, enthusiastic participation. It was unclear which metadata Flecker was describing, since many publisher abstracts are still regarded as intellectual property. He said Google is now a critical part of the discovery process.

Linkage was the sixth point. Linking started with citations, when publishers and aggregators realized that many footnotes contained links to articles that were also online. Bilateral agreements came next, and finally, the Digital Object Identifier (DOI) generalized the infrastructure and helped solve the “appropriate copy” problem, along with OpenURL. With this solution came true interpublisher, interplatform, persistent and actionable links, which are now growing beyond citations.
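To make the “appropriate copy” idea concrete, here is a minimal Python sketch of how a citation can be packed into an OpenURL 1.0 (Z39.88-2004) key/value query and pointed at a library’s own link resolver. The resolver address, the `build_openurl` helper, and the sample citation are all made up for illustration; only the key names (`url_ver`, `rft_val_fmt`, `rft_id`, `rft.*`) come from the OpenURL journal format.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical resolver address; each library would substitute its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(citation, resolver_base=RESOLVER_BASE):
    """Encode a journal-article citation as an OpenURL 1.0 key/value query."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
    ]
    # A DOI, when known, travels as an identifier next to the descriptive metadata.
    if citation.get("doi"):
        params.append(("rft_id", "info:doi/" + citation["doi"]))
    for key in ("atitle", "jtitle", "issn", "volume", "spage", "date"):
        if citation.get(key):
            params.append(("rft." + key, citation[key]))
    return resolver_base + "?" + urlencode(params)


# A made-up citation, not a real article.
example = {
    "doi": "10.1234/example.5678",
    "atitle": "Linking Across Platforms",
    "jtitle": "Journal of Hypothetical Studies",
    "volume": "12",
    "spage": "34",
    "date": "2007",
}

print(build_openurl(example))
```

The same citation yields a different link at each institution simply by swapping the resolver base, which is what lets the reader’s own library decide which licensed copy to serve.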

Seventh, there are early glimpses of text mining in ejournals. Text is being used as fodder for computational analysis, not just individual reading. This has required somewhat different licenses geared for computation, and also needs a different level of technical support.

Last, there are continuing requirements for scholarly citation that is:
• Unambiguous
• Persistent
• At a meaningful level
Article-level linking in journals has proven to be sufficient, but the equivalent for books (the page? chapter? paragraph?) has not been established in an era of reflowable text.

In the previous panel, Peter Brantley asked the presenters on digital warehouses about persistent URLs to books, and if ISBNs would be used to construct those URLs. There was total silence, and then LibreDigital volunteered that redirects could be enabled at publisher request.

As WorldCat.org links have also switched from ISBN to OCLC number for permalinks, this seems like an interesting question to solve and discuss. Will the canonical URL for a book point to Amazon, Google, OCLC, or OpenLibrary?
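One practical wrinkle with ISBN-based URLs is that the same edition can carry both a 10-digit and a 13-digit ISBN, so any scheme has to pick one form as the key. The Python sketch below normalizes an ISBN-10 to its ISBN-13 equivalent using the standard check-digit rules; the `book_url` helper and the `books.example.org` address are hypothetical, not anything proposed at the forum.

```python
import re


def isbn10_to_isbn13(isbn10):
    """Validate an ISBN-10 and return the equivalent 978-prefixed ISBN-13."""
    digits = isbn10.replace("-", "").replace(" ", "").upper()
    if not re.fullmatch(r"[0-9]{9}[0-9X]", digits):
        raise ValueError("not a well-formed ISBN-10: %r" % isbn10)
    # ISBN-10 check: weights 10 down to 1, with X standing for 10, sum divisible by 11.
    total = sum((10 - i) * (10 if ch == "X" else int(ch))
                for i, ch in enumerate(digits))
    if total % 11 != 0:
        raise ValueError("bad ISBN-10 check digit: %r" % isbn10)
    # ISBN-13 check: alternate weights 1 and 3 over the first twelve digits.
    core = "978" + digits[:9]
    weighted = sum(int(ch) * (1 if i % 2 == 0 else 3)
                   for i, ch in enumerate(core))
    return core + str((10 - weighted % 10) % 10)


def book_url(isbn10, base="https://books.example.org/isbn/"):
    """Build a URL keyed on the normalized ISBN-13 (hypothetical URL pattern)."""
    return base + isbn10_to_isbn13(isbn10)


print(book_url("0-306-40615-2"))
```

Even with normalization, an ISBN identifies an edition rather than a work, which may be one reason an opaque identifier such as the OCLC number is attractive for permalinks.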

code4lib 2007

Working Code Wins
Responding to increasing consolidation in the ILS market, library developers demonstrated alternatives and supplements to library software at the second annual code4lib conference in Athens, GA, February 27-March 2, 2007. With 140 registered attendees from many states and several countries, including Canada and the United Kingdom, the conference was a hot destination for a previously isolated group of developers.

Network connectivity was a challenge for the Georgia Center for Continuing Education, but the hyperconnected group kept things interesting and the attendees coordinated by Roy Tennant artfully architected workarounds and improvements as the conference progressed.

In a nice mixture of emerging conference trends, code4lib combined the flexibility of the unconference with 20 minute prepared talks, keynotes, five minute Lightning Talks, and breakout sessions. The form was derived from Access, the Canadian library conference.

Keynotes
The conference opened with a talk from Karen Schneider, associate director for technology and research at Florida State University. She challenged the attendees to sell open source software to directors in terms of solutions it provides, since the larger issue in libraries is saving digital information. Schneider also debated Ben Ostrowsky, systems librarian at the Tampa Bay Library Consortium, about the importance of open source software from the stage, to which Ostrowsky responded, “Isn’t that Firefox [a popular open source browser] you’re using there?”

Erik Hatcher, author of Lucene in Action, gave a keynote about using the full-text search server Apache Solr, the open-source search engine Lucene, and the faceted browser Flare to construct a new front end to library catalog data. The previous day, Hatcher led a free preconference for 80 librarians who brought exported MARC records, including attendees from Villanova University and the University of Virginia.

Buzz
One of the best-received talks revolved around BibApp, an “institutional bibliography” written in Ruby on Rails by Nate Vack and Eric Larson, two librarians at the University of Wisconsin-Madison. The prototype application is available for download, but currently relies on citation data from engineering databases to construct a profile of popular journals, publishers, citation types, and who researchers are publishing with. “This is copywrong, which is sometimes what you have to do to construct digital library projects. Then you get money to license it,” Larson said.

More controversially, Luis Salazar gave a talk about using Linux to power public computing in the Howard County (MD) public library system. A former NSA systems administrator, he presented the pros and cons of supporting 300 staff and 400 public access computers using Groovix, a customized Linux distribution. Since the abundant number of computers serves the public without needing sign up sheets, “patrons are able to sit down and do what they want.”

Salazar created a script for monitoring all the public computers, and described how he engaged in a dialog with a patron he dubbed “Hacker Jon,” who used the library computers to develop his nascent scripting skills. Bess Sadler, librarian and metadata services specialist at the University of Virginia, asked about the privacy implications of monitoring patrons. “Do you have a click-through agreement? Privacy Policy?” she asked. Salazar joked that “It’s Maryland, we’re like a communist country” and said he wouldn’t do anything in a public library that he wouldn’t expect to be monitored.

Casey Durfee presented a talk on “Endeca in 250 lines of code or less,” which showed a prototype of faceted searching at the Seattle Public Library. The new catalog front-end sits on top of a Horizon catalog, and uses Python and Solr to present results in an elegant display, from a Google-inspired single search box start to rich subject browse options.
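For readers unfamiliar with faceted searching, the following Python sketch shows the core idea in a few lines: run a query, then count the values of selected fields across the hits so the user can narrow by format or subject. This is not Durfee’s code, and a real deployment would let Solr compute the counts; the records, field names, and helper functions here are invented for illustration.

```python
from collections import Counter

# Made-up catalog records standing in for exported MARC data.
RECORDS = [
    {"title": "Jazz: A History", "format": "Book",
     "subjects": ["Jazz", "Music history"]},
    {"title": "Jazz Piano Basics", "format": "Audiobook",
     "subjects": ["Jazz", "Piano"]},
    {"title": "History of the Piano", "format": "Book",
     "subjects": ["Piano", "Music history"]},
    {"title": "Seattle Jazz Scene", "format": "DVD",
     "subjects": ["Jazz"]},
]


def as_list(value):
    """Treat single-valued and multi-valued fields uniformly."""
    return value if isinstance(value, list) else [value]


def search(records, query, filters=None):
    """Return records whose title contains every query term and matches every filter."""
    terms = query.lower().split()
    hits = []
    for rec in records:
        title = rec["title"].lower()
        if not all(term in title for term in terms):
            continue
        if filters and not all(value in as_list(rec.get(field, []))
                               for field, value in filters.items()):
            continue
        hits.append(rec)
    return hits


def facet_counts(hits, fields):
    """Count how often each value of each facet field occurs in the hits."""
    counts = {field: Counter() for field in fields}
    for rec in hits:
        for field in fields:
            counts[field].update(as_list(rec.get(field, [])))
    return counts


hits = search(RECORDS, "jazz")
facets = facet_counts(hits, ["format", "subjects"])
for field, counter in facets.items():
    print(field, counter.most_common())
```

Clicking a facet value in such an interface amounts to repeating the search with an added filter, for example `search(RECORDS, "jazz", {"format": "Book"})`.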

The future
This year’s sponsors included Talis, LibLime, OCLC, Logical Choice Technologies, and Oregon State University. OSU awarded two scholarships to Nicole Engard, Jenkins Law Library (2007 LJ Mover and Shaker), and Joshua Gomez, Getty Research Institute.

Next year’s conference will be held in Portland, OR.

Taiga 2 Forum moves into Open Space

Assistant University Librarians and Assistant Directors met for the second annual Taiga Forum a day before ALA Midwinter, Seattle, to discuss the changing dynamics of academic libraries.

In a change from last year, the participants used the Open Space format to stage an unconference, where the attendees themselves chose the conversation topics.

Topics included Search, Radical Collaboration, and Google: Friend or Foe, among others. The guiding principles were, “Whoever comes is the right person, whatever happens is the only thing that could have happened, whenever it starts is the right time, and when it’s over, it’s over.” The Endangered Species conference met in an adjoining conference room.

Meg Bellinger, Yale University Associate University Librarian, said, “We came away with the sense that we don’t have all of the answers but we all share the same problems. We must spend time moving beyond the current issues towards solutions.”

The meeting was sponsored by Innovative Interfaces, Inc.

Casey Bisson named one of first winners of Mellon Award for Technology Collaboration

Casey Bisson, information architect at Plymouth State University, was presented with a $50,000 Mellon Award for Technology Collaboration by Tim Berners-Lee at the Coalition for Networked Information meeting in Washington, DC, on December 4.

His project, WP-OPAC, is seen as the first step for allowing library catalogs to integrate with WordPress, a popular open-source content management system.

The awards committee included Mitchell Baker, Mozilla; Tim Berners-Lee, W3C; Vinton Cerf, Google; Ira Fuchs, Mellon; John Gage, Sun Microsystems; Tim O’Reilly, O’Reilly Media; John Seely Brown; and Donald Waters, Mellon. Berners-Lee said, “These awards are about open source. It’s a good thing because it makes our lives easier, and the award winners used open source to solve problems.”

Library of Congress?
The revolutionary part of the announcement, however, was that Plymouth State University would use the $50,000 to purchase Library of Congress catalog records and redistribute them free under a Creative Commons Share-Alike or GNU license. OCLC has been the source for catalog records for libraries, and its license restrictions do not permit reuse or distribution. However, catalog records have been shared via Z39.50 for several years without incident.

“Libraries’ online presence is broken. We are more than study halls in the digital age. For too long, libraries have been coming up with unique solutions for common problems,” Bisson said. “Users are looking for an online presence that serves them in the way they expect.” He said, “The intention is to bring together the free or nearly-free services available to the user.”

Free download
Bisson said Plymouth State University is committed to supporting WP-OPAC, and will be offering it as a free download from its site, likely in the form of sample records plus WordPress with WP-OPAC included. “With nearly 140,000 registered users of Amazon Web Services, it’s time to use common solutions for our unique problems,” Bisson said.

The internal data structure works with iCal for calendar information and Flickr for photos, and can be used with historical records. It allows libraries to go beyond Library of Congress subject headings, Bisson said. Microformats are key to the internal data, and the OpenSearch API is used for interoperability. Bisson is looking at adding unAPI and OAI in the future.
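As a rough illustration of how OpenSearch supports interoperability, a catalog publishes a URL template with placeholders such as {searchTerms}, and any client can fill it in to run a query. The sketch below is not taken from WP-OPAC: the `catalog.example.org` address, the `s` and `paged` query parameters, and the `fill_template` helper are assumptions made for the example. Only the placeholder syntax, including the trailing `?` that marks an optional parameter, follows the OpenSearch 1.1 convention.

```python
import re
from urllib.parse import quote

# Hypothetical URL template of the kind an OpenSearch description document advertises.
TEMPLATE = "http://catalog.example.org/?s={searchTerms}&paged={startPage?}"


def fill_template(template, **params):
    """Substitute URL-encoded values into {name} and optional {name?} placeholders."""

    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in params:
            return quote(str(params[name]), safe="")
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([A-Za-z:]+)(\?)?\}", substitute, template)


first_page = fill_template(TEMPLATE, searchTerms="moby dick")
second_page = fill_template(TEMPLATE, searchTerms="moby dick", startPage=2)
```

Because the template is published rather than hard-coded, a browser search box or another library's aggregator can query the catalog without knowing anything about the software behind it.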

At this time, there is no connection to the University of Rochester project that is prototyping a new extensible catalog, though both are funded by Mellon. [see LJ Baker’s Smudges, 9/1/2006]

Other winners include: Open University (Moodle), RPI (Bedework), University of British Columbia Vancouver (Open Knowledge Project), Virginia Tech (Sakai), Yale (CAS single sign-on), University of Washington (Pine and IMAP), Internet Archive (Wayback Machine), and Humboldt State University (Moodle).

LITA National Forum 2006

“Shift Happens”
Preservation, entertainment in the library, and integrating Library 2.0 into a Web 2.0 world dominated the Library and Information Technology Association (LITA) National Forum in Nashville, TN, October 26-29, 2006.
With 378 registered attendees from 43 states and several countries, including Sweden and Trinidad, attendance held steady with previous years, though the Internet Librarian conference, held in the same week, attracted over 1300 librarians.

Free wireless has still not made it into technology conferences, though laptops were clearly visible, and the LITA blog faithfully kept up with sessions for librarians who were not able to attend.

Keynotes
The forum opened with a fascinating talk from librarians at the Country Music Hall of Fame entitled “Saving America’s Treasures.” Using Bridge Media Solutions in Nashville as a technology partner, the museum has migrated unique content from the Grand Ole Opry, including the first known radio session from October 14, 1939, as well as uncovering demos on acetate and glass from Hank Williams. The migration project uses open source software and will generate MARC records that will be submitted to OCLC.

Thom Gillespie of Indiana University described his shift from being a professor in the Library and Information Science program to launching a new program from the Telecommunications department. The MIME program for art, music, and new media has propelled students into positions at LucasArts, Microsoft, and other gaming companies. Gillespie said the program has practical value: “Eye candy was good but it’s about usability.” Saying that peering in is the first step but authoring citizen media is the future, he posed a provocative question: “What would happen if your library had a discussion of the game of the month?”

Buzz
Integration into user environments was a big topic of discussion. Peter Webster of St. Mary’s University, Halifax, Canada, spoke about how embedded toolbars are enabling libraries to enter where users search.

Annette Bailey, digital services librarian at Virginia Tech, announced that the LibX project has received funding for two years from IMLS to expand their research toolbar into Internet Explorer as well as Firefox, and will let librarians build their own test editions of toolbars online.

Presenters from the Los Alamos National Laboratory described their work with MPEG-21, a new standard from the Moving Picture Experts Group. The standard reduces some of the ambiguities of METS, and allows for unique identifiers in locally-loaded content. Material from Biosis, Thomson’s Web of Science, APS, the Institute of Physics, Elsevier, and Wiley is being integrated into cataloging operations and existing local Open Archives Initiative (OAI) repositories.

Tags and Maps
The University of Rochester has received funding for an open source catalog, which they are calling the eXtensible Catalog (xC). Using an export of 3 million records from their Voyager catalog, David Lindahl and Jeff Susczynski described how their team used User Centered Design to conduct field interviews with their users, sometimes in their dorm rooms. They have prototyped four different versions of the catalog, and CUPID 4 includes integration of several APIs, including Google, Amazon, Technorati, and OCLC’s xISBN. They are actively looking for partners for the next phase, and plan to work on issues with diacritics, incremental updates, and integrating holdings records, potentially using the NCIP protocol.

Challenge
Steven Abram, of Sirsi/Dynix and incoming SLA president, delivered the closing keynote, “Web 2.0 and Library 2.0 in our Future.” Abram and Sirsi/Dynix have conducted research on 15,000 users, which highlighted the need for community, learning, and interaction. He asked the audience, “Are you working in your comfort zone or my end user’s comfort zone?” In a somewhat controversial set of statements, Abram compared open source software to being “free like kittens” and challenged librarians about the “My OPAC sucks” meme that’s been popular this year. “Do your users want an OPAC, or do they want information?” Stating that libraries need to compete in an era when education is moving towards the distance learning model, Abram asked, “How much are we doing to serve the user when 60-80% of users are virtual?” Saying that librarians help people improve the quality of their questions, Abram said that major upcoming challenges include 50 million digitized books coming online in the next five years. “What is at risk is not the book. It’s us: librarians.”