Open Libraries “… are signs of life and hope: They are the cornerstone of democracy”

Posts Tagged Open Archives Initiative

Presenting at ALA panel on Future of Information Retrieval

The Future of Information Retrieval

Ron Miller, Director of Product Management, HW Wilson, hosts a panel of industry leaders including:
Mike Buschman, Program Manager, Windows Live Academic, Microsoft.
R. David Lankes, PhD, Director of the Information Institute of Syracuse, and Associate Professor, School of Information Studies, Syracuse University.
Marydee Ojala, Editor, ONLINE, and contributing feature and news writer to Information Today, Searcher, EContent, Computers in Libraries, among other publications.
Jay Datema, Technology Editor, Library Journal.

Monday, 25 June 2007, 8-10 a.m., Room 103b. Preliminary slides and audio attached.


 

Open source metasearch

Now there’s a new kid on the (meta)search block. LibraryFind, an open-source project funded by the State Library of Oregon, is currently live at Oregon State University. The library has just packaged up a release for anyone to download and install.

Jeremy Frumkin, Gray Chair for Innovative Library Services at OSU, said the goals were to support scholarly workflow, remove barriers between the library and Web information, and establish the digital library as a platform.

Lead developers Dan Chudnov, soon to join the Library of Congress’s Office of Strategic Initiatives, and Terry Reese, catalog librarian and developer of the popular application MarcEdit, worked with the following guiding principles: two clicks, one to find and one to get; a goal of returning results in four seconds; and known and adjustable results ranking.

Other OSU project members included Tami Herlocker, point person for interface development, and Ryan Ordway, system administrator. Frumkin said, “The Ruby on Rails platform provided easy, quick user interface development. It gives a variety of UI possibilities, and offers new interfaces for different user groups.”

The application includes collaborations on the OpenURL module from Ross Singer, library applications developer at the Georgia Tech library, and Ed Summers, Library of Congress developer. Journal coverage can be imported from a SerialsSolutions export, and more import facilities are planned in upcoming releases.

OSU is working on a contract with OCLC’s WorldCat to download data, and is looking to build greater trust relationships with vendors. “The upside for vendors is they can see how their data is used when developing new services,” Frumkin said.

Future enhancements include an information dashboard and a personal digital library. Developers are also staffing a chatroom for technical support, help, and development discussion of LibraryFind.



Casey Bisson named one of first winners of Mellon Award for Technology Collaboration

Casey Bisson, information architect at Plymouth State University, was presented with a $50,000 Mellon Award for Technology Collaboration by Tim Berners-Lee at the Coalition for Networked Information meeting in Washington, DC, on December 4.

His project, WP-OPAC, is seen as the first step toward allowing library catalogs to integrate with WordPress, a popular open-source content management system.

The awards committee included Mitchell Baker, Mozilla; Tim Berners-Lee, W3C; Vinton Cerf, Google; Ira Fuchs, Mellon; John Gage, Sun Microsystems; Tim O’Reilly, O’Reilly Media; John Seely Brown; and Donald Waters, Mellon. Berners-Lee said, “These awards are about open source. It’s a good thing because it makes our lives easier, and the award winners used open source to solve problems.”

Library of Congress?
The revolutionary part of the announcement, however, was that Plymouth State University would use the $50,000 to purchase Library of Congress catalog records and redistribute them free under a Creative Commons Share-Alike license or GNU. OCLC has been the source for catalog records for libraries, and its license restrictions do not permit reuse or distribution. However, catalog records have been shared via Z39.50 for several years without incident.

“Libraries’ online presence is broken. We are more than study halls in the digital age. For too long, libraries have been coming up with unique solutions for common problems,” Bisson said. “Users are looking for an online presence that serves them in the way they expect.” He added, “The intention is to bring together the free or nearly-free services available to the user.”

Free download
Bisson said Plymouth State University is committed to supporting WP-OPAC, and will offer it as a free download from its site, likely in the form of sample records plus WordPress with WP-OPAC included. “With nearly 140,000 registered users of Amazon Web Services, it’s time to use common solutions for our unique problems,” Bisson said.

The internal data structure works with iCal for calendar information and Flickr for photos, and can be used with historical records. It allows libraries to go beyond Library of Congress subject headings, Bisson said. Microformats are key to the internal data, and the OpenSearch API is used for interoperability. Bisson is looking at adding unAPI and OAI in the future.

At this time, there is no connection to the University of Rochester Mellon-funded project which is prototyping a new extensible catalog, though both are funded by Mellon. [see LJ Baker's Smudges, 9/1/2006]

Other winners include: Open University (Moodle), RPI (Bedework), University of British Columbia Vancouver (Open Knowledge Project), Virginia Tech (Sakai), Yale (CAS single sign-on), University of Washington (Pine and IMAP), Internet Archive (Wayback Machine), and Humboldt State University (Moodle).



LITA National Forum 2006

“Shift Happens”
Preservation, entertainment in the library, and integrating Library 2.0 into a Web 2.0 world dominated the Library and Information Technology Association (LITA) National Forum in Nashville, TN, October 26-29, 2006.
With 378 registered attendees from 43 states and several countries, including Sweden and Trinidad, attendance held steady with previous years, though the Internet Librarian conference, held in the same week, attracted over 1300 librarians.
Free wireless has still not made it into technology conferences, though laptops were clearly visible, and the LITA blog faithfully kept up with sessions for librarians who were not able to attend.
Keynotes
The forum opened with a fascinating talk from librarians at the Country Music Hall of Fame entitled “Saving America’s Treasures.” Using Bridge Media Solutions in Nashville as a technology partner, the museum has migrated unique content from the Grand Ole Opry, including the first known radio session from October 14, 1939, and has uncovered demos on acetate and glass from Hank Williams. The migration project uses open source software and will generate MARC records that will be submitted to OCLC.
Thom Gillespie of Indiana University described his shift from being a professor in the Library and Information Science program to launching a new program from the Telecommunications department. The MIME program for art, music, and new media has propelled students into positions at LucasArts, Microsoft, and other gaming companies. Gillespie said the program has practical value: “Eye candy was good but it’s about usability.” Saying that peering in is the first step but authoring citizen media is the future, he posed a provocative question: “What would happen if your library had a discussion of the game of the month?”
Buzz
Integration into user environments was a big topic of discussion. Peter Webster of St. Mary’s University, Halifax, Canada, spoke about how embedded toolbars are enabling libraries to enter where users search.
Annette Bailey, digital services librarian at Virginia Tech, announced that the LibX project has received funding for two years from IMLS to expand their research toolbar into Internet Explorer as well as Firefox, and will let librarians build their own test editions of toolbars online.
Presenters from the Los Alamos National Laboratory described their work with MPEG-21, a new standard from the Moving Picture Experts Group. The standard reduces some of the ambiguities of METS, and allows for unique identifiers in locally-loaded content. Material from Biosis, Thomson’s Web of Science, APS, the Institute of Physics, Elsevier, and Wiley is being integrated into cataloging operations and existing local Open Archives Initiative (OAI) repositories.
Tags and Maps
The University of Rochester has received funding for an open source catalog, which they are calling the eXtensible Catalog (xC). Using an export of 3 million records from their Voyager catalog, David Lindahl and Jeff Susczynski described how their team used User Centered Design to conduct field interviews with their users, sometimes in their dorm rooms. They have prototyped four different versions of the catalog, and CUPID 4 includes integration of several APIs, including Google, Amazon, Technorati, and OCLC’s xISBN. They are actively looking for partners for the next phase, and plan to work on issues with diacritics, incremental updates, and integrating holdings records, potentially using the NCIP protocol.
Challenge
Stephen Abram, of Sirsi/Dynix and incoming SLA president, delivered the closing keynote, “Web 2.0 and Library 2.0 in our Future.” Abram and Sirsi/Dynix have conducted research on 15,000 users, which highlighted the need for community, learning, and interaction. He asked the audience, “Are you working in your comfort zone or my end user’s comfort zone?” In a somewhat controversial set of statements, Abram compared open source software to being “free like kittens” and challenged librarians about the “My OPAC sucks” meme that’s been popular this year. “Do your users want an OPAC, or do they want information?” Stating that libraries need to compete in an era when education is moving towards the distance learning model, Abram asked, “How much are we doing to serve the user when 60-80% of users are virtual?” Saying that librarians help people improve the quality of their questions, Abram said that major upcoming challenges include 50 million digitized books coming online in the next five years. “What is at risk is not the book. It’s us: librarians.”



Using Drupal to put Endnote online

There is still no easy way to manage a library of references on a personal or institutional site. Librarians who want to put up a list of institutional publications, or researchers who want to share references, are held back by software limitations, privacy concerns, or technical roadblocks. This problem has been mitigated by an open source CMS with a handy bibliographic data module.

The Drupal content management system is attractive to many librarians and information scientists because of its deep use of taxonomy. Daniel Chudnov uses it to power Open Source Systems for Libraries, and his personal weblog, One Big Library. Roy Tennant uses Drupal for TechEssence.info, and the Ann Arbor Public Library uses it for user registration, resource weblogs, and the overall site.

However, the state of the art in bibliographic management and collaboration is still stuck in 1990. When a writer wants to collect articles, there are a number of client applications (all owned by Thomson ISI ResearchSoft, including Endnote, ProCite, and Reference Manager, plus WriteNote) that do a nice job of saving the references and integrating with word processors to format the citations.

Endnote is the most commonly-used program, but it was not designed to share references. Modern science is all about collaboration, from grant proposals to international research. In the worst case, sharing an Endnote library on a network server can cause corruption. In the best case, a shared Endnote library is read-only whenever another person has it open, which limits collaboration.

A version of EndnoteWeb has been in development for most of 2006, and is promised by January of next year. Early reports of integration with Web of Science tell of limited functionality and interoperability.

In 2002, a number of former Reference Manager employees waited for their non-compete agreements with ISI to expire, then founded RefWorks, an online version of the familiar bibliographic managers.

In the last two years, applications including Connotea and CiteULike have integrated bibliographic manager capabilities into their social bookmarking applications. Both allow RIS and BibTeX upload and download to systems managed at Nature Publishing Group and the University of Manchester, respectively.

At Cold Spring Harbor Laboratory, the annual reports of the institution have listed lab publications for over 100 years. These references have not been added to PubMed, which still only goes back to 1950. Thus, this unique information needed to be put into a format so that scholars could cite the early history of genetics, and the tragic misfire of eugenics research.

Many approaches were tried. One early method was programmer-centric, where the data was entered into a SQL database and a web front-end was scripted to add basic fields. While this was a promising start, it left out the rich data fields that enable bibliographic managers to capture complete citation information.

Since the library was examining digital asset management systems, Greenstone was assessed for its citation abilities. Ian Witten was able to jury-rig a solution that imported RIS information about citations, but getting them to display in a full way wasn’t simple.

As the prototyping continued, the initial database of 1800 records was exported out of the SQL database into comma separated value (CSV) format, and imported into Endnote. The archives clerk started assessing the reference types, and added new fields. For example, Institution was added so that a sort by the name could be used. A new reference type was added for non-standard reports.
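The CSV step described above could also go through a tagged format such as RIS, which Endnote and the other managers mentioned here can read. What follows is only a minimal sketch of that idea, not the process the laboratory used: the column names (authors, title, year, institution), the mapping of Institution to the RIS publisher tag, and the placeholder row are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical export: these column names and the placeholder row are
# assumptions, since the original database schema is not described.
SAMPLE_CSV = (
    "authors,title,year,institution\n"
    '"Author, A.; Writer, B.",Placeholder report title,1900,Placeholder Institution\n'
)


def row_to_ris(row):
    """Render one CSV row as the tagged lines of a single RIS record."""
    lines = ["TY  - RPRT"]  # RPRT is the RIS reference type for reports
    for author in row["authors"].split(";"):
        if author.strip():
            lines.append(f"AU  - {author.strip()}")
    lines.append(f"TI  - {row['title']}")
    lines.append(f"PY  - {row['year']}")
    # Assumption: the institution is carried in the publisher tag.
    lines.append(f"PB  - {row['institution']}")
    lines.append("ER  - ")  # every RIS record ends with an ER tag
    return lines


def csv_to_ris(csv_text):
    """Convert a whole CSV export into RIS text, one record per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = ["\n".join(row_to_ris(row)) for row in reader]
    return "\n\n".join(records) + "\n"


if __name__ == "__main__":
    print(csv_to_ris(SAMPLE_CSV))
```

Splitting the author column on semicolons keeps each name in its own AU line, which is what lets a bibliographic manager sort and format authors individually.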

In the process of adding this information, Endnote’s integration with OpenURL became useful. Using the standard bibliographic fields, it was possible to launch a search that queried the library’s subscriptions to see if a full-text version existed. And for many articles in Science magazine, a full-text scan was available.

In the short term, links to the JSTOR archive were added to Endnote. Longer-term, it would be useful to put in COinS from the web interface so that every citation could be queried via OpenURL.

Cold Spring Harbor Laboratory already had a site license for Endnote, so switching to RefWorks wasn’t feasible. In addition, the local version of Connotea isn’t exactly lightweight to deploy, requiring two MySQL databases and memcached to handle the online load. Since Nature is currently funding the open-source project, questions were raised about the continuing development of the project.
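For readers unfamiliar with COinS, the idea is to embed an OpenURL ContextObject in the title attribute of an empty span with class Z3988, so that browser tools can turn a displayed citation into a resolver query. The sketch below shows one way such a span might be generated for a journal article; the function name and the placeholder citation values are assumptions for illustration, and it is not code from the Drupal module.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(atitle, jtitle, date, volume, spage):
    """Build a COinS span describing one journal article.

    The key/value pairs follow the OpenURL (Z39.88-2004) KEV format
    for journals; the result is an empty span carrying the citation.
    """
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", atitle),
        ("rft.jtitle", jtitle),
        ("rft.date", date),
        ("rft.volume", volume),
        ("rft.spage", spage),
    ]
    # URL-encode the pairs, then HTML-escape the ampersands so the
    # string is safe inside an attribute value.
    context_object = urlencode(fields)
    return f'<span class="Z3988" title="{escape(context_object)}"></span>'


if __name__ == "__main__":
    # Placeholder citation values, not a real reference.
    print(coins_span("Placeholder title", "Science", "1920", "51", "100"))
```

Because the span is empty, it adds nothing visible to the page; only tools that look for the Z3988 class act on it.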

The archives clerk finished the authority control work on the Endnote database, which included hand-checking the references to the print version of the annual reports. Once this was completed, a need was voiced to make these references available online.

Ron Jerome of the National Research Council Canada Institute for Chemical Process and Environmental Technology wrote a Bibliography module for Drupal which allows Endnote import in .enw or XML formats. This module is currently being extended to allow Open Archives Initiative harvesting.

This module was installed, and the 2200 Cold Spring Harbor Laboratory publications from 1890-1950 were imported into MySQL. The display is clear, and the default display is citation format. All other fields were imported, but live in the database for display on demand.

This module holds great promise for archive integration, since harvesting by OAI would allow libraries to harvest the records from web resources that aren’t specifically enabled for archives management. Endnote format is a low-barrier format for scientists and researchers.
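To make the harvesting idea concrete, here is a rough sketch of what an OAI-PMH harvester does: it requests ListRecords with a metadata prefix, then reads Dublin Core fields out of the XML response. The base URL, the sample response, and the function names are assumptions for illustration; the Bibliography module's eventual OAI interface may look different.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace of unqualified Dublin Core elements, used by the oai_dc format.
DC_NS = "http://purl.org/dc/elements/1.1/"

# A hand-written sample response with a placeholder record, so the
# parsing step can be run without a live repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Placeholder report title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # optional selective-harvest start date
    return f"{base_url}?{urlencode(params)}"


def titles_from_response(xml_text):
    """Extract every dc:title value from a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(f"{{{DC_NS}}}title")]


if __name__ == "__main__":
    # Hypothetical repository address.
    print(list_records_url("http://example.org/oai"))
    print(titles_from_response(SAMPLE_RESPONSE))
```

A full harvester would also follow resumption tokens to page through large result sets, which this sketch leaves out.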

In the future, Cold Spring Harbor Laboratory hopes to integrate these early records with the other archives collections managed by Digitool. For now, other laboratories and libraries can use Drupal and the Bibliography module for easy reference sharing.


