Open Source

ALA 2007: Top Tech Trends

At the ALA Top Tech Trends Panel, panelists including Marshall Breeding, Roy Tennant, Karen Coombs, and John Blyberg discussed RFID, open source adoption in libraries, and the importance of privacy.

Marshall Breeding, director for innovative technologies and research at Vanderbilt University Libraries (TN), started the Top Tech Trends panel by referencing his LJ Automation Marketplace article, “An Industry Redefined,” which predicted “unprecedented disruption” in the ILS market. Breeding said 60 percent of the libraries in one state are facing a migration due to the Sirsi/Dynix product roadmap being changed, but he said “not all ILS companies are the same.”

Breeding said open source is new to the ILS world as a product, even though it’s been used as infrastructure in libraries for many years. Interest has now expanded to the decision makers. The Evergreen PINES project in Georgia, with 55 of 58 counties participating, was “mostly successful.” With the recent decision to adopt Evergreen in British Columbia, there is movement to open source solutions, though Breeding cautioned it is “still minuscule compared to most libraries.”

Questioning whether the switch could be compared to an avalanche, Breeding said several commercial support companies have sprung up to serve the open source ILS market, including LibLime, Equinox, and CARe Affiliates. Breeding predicted an era of “new decoupled interfaces.”

John Blyberg, head of technology and digital initiatives at Darien Public Library (CT), said the “back end [in the ILS] needs to be shored up because it has a ripple effect” on other services. Blyberg said RFID is coming, and it makes sense for use in sorting and book storage, echoing Lori Ayre’s point that libraries “need to support the distribution demands of the Long Tail.” Feeling that “privacy concerns are non-starters, because RFID is essentially a barcode,” he said the RFID information is stored in a database, which should be the focus of security concerns.

Finally, Blyberg said that vendor interoperability and a democratic approach to development are needed in the age of Innovative’s Encore and Ex Libris’ Primo, both of which can be used with different ILSs and can decouple the public catalog from the ILS. With the eXtensible Catalog (xC) and Evergreen coming along, Blyberg said there was a need for funding and partners to further enhance their development.

Walt Crawford of OCLC/RLG said the problem with RFID is the potential of having patron barcodes chipped, which could “lead to the erosion of patron privacy.” Intruders could datamine who’s reading what, which Crawford said is a serious issue.

Joan Frye Williams countered that both Blyberg and Crawford were “insisting on using logic on what is essentially a political problem.” Breeding agreed, saying that airport security could scan chips, and “my concern is that third generation RFID chips may not be readable in 30 years, much less the hundreds of years that we expect barcodes to be around for.”

Karen Coombs, head of web services at the University of Houston (TX), listed three trends:
• The end user as content contributor, which she cautioned was an issue. “What happens if YouTube goes under and people lose their memories?” Coombs pointed to the project with the National Library of Australia and its partnership with Flickr as a positive development.
• Digital as format of choice for users, pointing out iTunes for music and Joost for video. Coombs said there is currently “no way for libraries to provide this to users, especially in public libraries.” Though companies like Overdrive and Recorded Books exist to serve this need, perhaps her point was that the consumer adoption has superseded current library demand.
• A blurred line between desktop and web applications, which Coombs demonstrated with YouTube remixer and Google Gears, “which lets you read your feeds when you’re offline.”

John Blyberg responded to these trends, saying that he sees academic libraries pursuing semantic web technologies, including developing ontologies. Coombs disagreed with this assessment, saying that “libraries have lots of badly-tagged HTML pages.” Roy Tennant agreed, “If the semantic web arrives, buy yourself some ice skates, because hell will have frozen over.”

Breeding said that he longs for “SOA [services-oriented architecture] but I’m not holding my breath.” And Walt Crawford said, “Roy is right—most content providers don’t provide enough detail, and they make easy things complicated and don’t tackle the hard things.” Coombs pointed out, “People are too concerned with what things look like,” but Crawford interjected, “not too concerned.”

Roy Tennant, OCLC senior program manager, listed his trends:
• Demise of the catalog, which should push the OPAC into the back room where it belongs and elevate discovery tools like Primo and Encore, as well as OCLC WorldCat Local.
• Software as a Service (SaaS), formerly known as ASP and hosted services, which means librarians “don’t have to babysit machines, and is a great thing for lots of librarians.”
• Intense marketplace uncertainty due to the private equity buyouts of Ex Libris and SirsiDynix and the rise of Evergreen and Koha as looming options. Tennant also said he sees “WorldCat Local as a disruptive influence.” Aside from the ILS, the abstract and indexing (A&I) services are being disintermediated as Google and OCLC are going direct to publishers to license content.
Someone asked if libraries should get rid of local catalogs, and Tennant said “only when it fits local needs.”

Walt Crawford said:
• Privacy still matters. Crawford questioned whether patrons really wanted libraries to turn into Amazon in an era of government data mining and inferences that could track a ten-year patron borrowing pattern.
• The slow library movement, which argues that locality is vital to libraries, mindfulness matters, and open source software should be used “where it works.”
• The role of the public library as publisher. Crawford pointed out libraries in Charlotte-Mecklenburg County, libraries in Vermont that Jessamyn West works with, and Wyoming as farther along this path, and said the “tools are good enough that it’s becoming practical.”

Blyberg said that systems “need to be more open to the data that we put in there.” Williams said that content must be “disaggregatable and remixable,” and Coombs pointed out the current difficulty of swapping out ILS modules, and said ERM was a huge issue. Tennant referenced the Talis platform, and said one of Evergreen’s innovations is its use of the XMPP (Jabber) protocol, which is “easier than SOAP web services, which are too heavyweight.”

Marshall Breeding responded to a question asking if MARC was dead, saying “I’m married to a cataloger, but we do need things in addition to MARC, which is good for books, like Dublin Core and ONIX.” Coombs pointed out that MARCXML is a mess because it’s retrofitted and doesn’t leverage the power of XML. Crawford said, “I like to give Roy [Tennant] a hard time about his phrase ‘MARC is dead,’ and for a dying format, the Moen panel was full at 8 a.m.”
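Coombs’ complaint about MARCXML being retrofitted can be seen in a small sketch. In MARCXML, the numeric MARC tags and subfield codes travel as attribute values on generic datafield and subfield elements rather than as meaningful element names, so even a title lookup has to filter on attributes. The record below is an invented, minimal example (not taken from any real catalog), and the helper name subfield is hypothetical; only the namespace and element names come from the MARC 21 “slim” schema.

```python
import xml.etree.ElementTree as ET

# Invented sample record; the control number and leader are placeholders.
MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">example-0001</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Dreaming in code /</subfield>
    <subfield code="c">Scott Rosenberg.</subfield>
  </datafield>
</record>"""

NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def subfield(record, tag, code):
    """Return the text of the first matching subfield, or None if absent."""
    # The element names say nothing about meaning; the MARC tag ("245")
    # and subfield code ("a") must be matched as attribute values.
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = record.find(path, NS)
    return node.text if node is not None else None


record = ET.fromstring(MARCXML)
print(subfield(record, "245", "a"))  # the title statement
```

A schema designed for XML from the start would more likely expose something like a title element directly, which is roughly the contrast Coombs was drawing.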

Questioners asked what happens when “the one server” goes down, and Blyberg responded, “What if your T-1 line goes down?” Joan Frye Williams exhorted the audience to “examine your consciences when you ask vendors how to spend their time.” Coombs agreed, saying that her experience on user groups had exposed her to “crazy competing needs that vendors are faced with—[they] are spread way too thin.” Williams said there are natural transition points and she spoke darkly of a “pyramid scheme” and that you “get the vendors you deserve.” Coombs agreed, saying, “Feature creep and managing expectations is a fiercely difficult job, and open source developers and support staff are different people.”

Joan Frye Williams, information technology consultant, listed:
• New menu of end-user focused technologies. Williams said she worked in libraries when the typewriter was replaced by an OCLC machine, and libraries are still not using technology strategically. “Technology is not a checklist,” Williams chided, saying that the 23 Things movement of teaching new skills to library staff was insufficient.
• Ability for libraries to assume development responsibility in concert with end-users.
• Have to make things more convenient, adopting artificial intelligence (AI) principles of self-organizing systems. Williams said, “If computers can learn from their mistakes, why can’t we?”

Someone asked why libraries are still using the ILS. Coombs said it’s a financial issue, and Breeding responded sharply, saying, “How can we not automate our libraries?” Walt Crawford agreed, saying, “Are we going to return to index cards?”
When the panel was asked if library home pages would disappear, Crawford and Blyberg both said they would be surprised. Williams said “the product of the [library] website is the user experience.” She said Yorba Linda Public Library (CA) is enhancing its site with a live book feed: “as books are checked in, a feed scrolls on the site.”

And another audience member asked why the panel didn’t cover toys and protocols. Crawford said “outcomes matter,” and Coombs agreed, saying “I’m a toy geek but it’s the user that matters.” Many participants talked about their use of Twitter, and Coombs said portable applications on a USB drive have the potential to change public computing in libraries. Tennant recommended viewing the Photosynth demo, first shown at the TED conference.
Finally, when asked how to keep up with trends, especially for new systems librarians, Coombs said, “It depends what kind of library you’re working in. Find a network—ask questions on the code4lib [IRC] channel.”

Blyberg recommended constructing a “well-rounded blogroll” that includes sites from the humanities, sciences, and library and information science, which he said will help you be a well-rounded feed reader. Tennant recommended a “gasp, dead tree magazine, Business 2.0,” Coombs said the Gartner website has good information about technology adoptions, and Williams recommended trendwatch.com.

Links to other trends:
Karen Coombs’ Top Technology Trends
Meredith Farkas’ Top Technology Trends
3 Trends and a Baby (Jeremy Frumkin)
Some Trends from the LiB (Sarah Houghton-Jan)
“Sum” Top Tech Trends for the Summer of 2007 (Eric Lease Morgan)

And other writeups and podcast:
Rob Styles
Ellen Ward
Chad Haefele

Open source metasearch

Now there’s a new kid on the (meta)search block. LibraryFind, an open-source project funded by the State Library of Oregon, is currently live at Oregon State University. The library has just packaged up a release for anyone to download and install.

Jeremy Frumkin, Gray Chair for Innovative Library Services at OSU, said the goals were to contribute to the support of scholarly workflow, remove barriers between the library and Web information, and establish the digital library as a platform.

Lead developers Dan Chudnov, soon to join the Library of Congress’s Office of Strategic Initiatives, and Terry Reese, catalog librarian and developer of the popular application MarcEdit, worked with the following guiding principles: two clicks (one to find, and one to get); a goal of getting results in four seconds; and known and adjustable results ranking.
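Two of those principles, the four-second goal and adjustable ranking, can be sketched in a few lines. This is not LibraryFind’s code (the project is built on Ruby on Rails); it is a minimal Python illustration under stated assumptions: the three source functions, their scores, and the per-source weights are all hypothetical stand-ins for real search targets. The demo uses a much shorter budget than four seconds so it finishes quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

TIME_BUDGET_SECONDS = 4.0  # the stated goal for returning results


# Hypothetical search targets; real ones would call remote services.
def search_catalog(query):
    return [{"title": f"{query}: a handbook", "source": "catalog", "score": 0.6}]


def search_articles(query):
    return [{"title": f"Notes on {query}", "source": "articles", "score": 0.9}]


def search_slow_vendor(query):
    time.sleep(0.5)  # simulates a target that misses the deadline
    return [{"title": f"Late hit for {query}", "source": "vendor", "score": 1.0}]


def metasearch(query, sources, weights, budget=TIME_BUDGET_SECONDS):
    """Query every source in parallel and keep only what arrives in time."""
    pool = ThreadPoolExecutor(max_workers=len(sources))
    futures = [pool.submit(source, query) for source in sources]
    done, _late = wait(futures, timeout=budget)
    pool.shutdown(wait=False)  # do not block on sources that ran over budget
    hits = []
    for future in done:
        if future.exception() is None:  # skip sources that failed
            hits.extend(future.result())
    # "Known and adjustable" ranking: raw score times a visible source weight.
    return sorted(
        hits,
        key=lambda hit: hit["score"] * weights.get(hit["source"], 1.0),
        reverse=True,
    )


sources = [search_catalog, search_articles, search_slow_vendor]
results = metasearch("rfid", sources, weights={"catalog": 2.0}, budget=0.1)
for hit in results:
    print(hit["source"], hit["title"])
```

Keeping the weights in a plain dictionary is one way to make the ranking both inspectable and tunable: raising the catalog weight moves local holdings above article hits without touching the search code.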

Other OSU project members included Tami Herlocker, point person for interface development, and Ryan Ordway, system administrator. Frumkin said, “The Ruby on Rails platform provided easy, quick user interface development. It gives a variety of UI possibilities, and offers new interfaces for different user groups.”

The application includes collaborations on the OpenURL module from Ross Singer, library applications developer at the Georgia Tech library, and Ed Summers, Library of Congress developer. Journal coverage can be imported from a Serials Solutions export, and more import facilities are planned in upcoming releases.

OSU is working on a contract with OCLC’s WorldCat to download data, and is looking to build greater trust relationships with vendors. “The upside for vendors is they can see how their data is used when developing new services,” Frumkin said.

Future enhancements include an information dashboard and a personal digital library. Developers are also staffing a support chatroom for technical support, help, and development discussion of LibraryFind.

Dreaming in Code (review)

Dreaming in Code: Two Dozen Programmers, Three Years, 4,732 Bugs, and One Quest for Transcendent Software. Scott Rosenberg; Crown, 2007.

Salon’s Scott Rosenberg has written an elegant bird’s-eye view of modern software development by observing the development of Chandler, an open source calendaring project. It was originally publicized as a way to kill the Exchange server hegemony in much the same way that Apache has dominated Microsoft’s IIS.

Yet as the subtitle says, “two dozen programmers, three years, 4,732 bugs, and one quest for transcendent software” haven’t yet resulted in a product ready for general consumption.

The detours have been interesting. We witness the birth of PyLucene, as developers seek a full-text indexing solution that works with their unified repository. And perhaps CalDAV, soon to ship with OS X’s Leopard, will be the project’s legacy.

It’s a compelling vision: a type-agnostic program to manage email, calendar events, and contacts. Yet Google chose dis-integration with its calendar and Gmail. And Apple has made backend data integration possible, but has kept the individual applications separate.

As the project enters its third year, Rosenberg takes a detour into the history of software development. After surveying the hilltop, he makes a modest recommendation. Computer science programs should be more like MFA programs, which require students to study great works, share work, and revise constantly.

During this chapter, 37signals’ Getting Real methodology is held up, along with the Joel Test for software development, as a possible signpost on the road ahead. Since Ruby on Rails came from a simple tasks list, perhaps there is some life in Getting Real for complicated projects, too.

In fact, the scenery is often as enjoyable as the narrative. I was happy to learn that CivicSpace, a Drupal module/modification, came from Chandler’s benevolent dictator-for-life, Mitch Kapor. An excerpt from the book that delves into the history of Hungarian notation is up at Technology Review.

As the Chandler project continues to take shape, one ponders the irony that if the developers had been using a completed program that fulfilled the dream, their project might be done already. The hardest software to finish may be that which measures time. Perhaps we need the next Proust to reinvent computer science. Until then, Dreaming in Code will have to suffice.

NetConnect Winter 2007 podcast episode 2

This is the second episode of the Open Libraries podcast, and I was pleased to have the opportunity to talk to some of the authors of the Winter netConnect supplement, entitled Digitize This!

The issue covers how libraries can start to digitize their unique collections. K. Matthew Dames and Jill Hurst-Wahl wrote an article about copyright and practical considerations in getting started. They join me, along with Lotfi Belkhir, CEO of Kirtas Technologies, to discuss the important issue of digitization quality.

One of the issues that has surfaced recently is exactly what libraries are receiving from the Google Book Search project. As the project grows beyond the initial five libraries into more university and Spanish libraries, many of the implications have become more visible.

The print issue of NetConnect is bundled with the January 15th issue of Library Journal, or you can read the articles online.

Recommended Books:
Kevin
Knowledge Diplomacy

Jill
Business as Unusual

Lotfi
Free Culture
Negotiating China
The Fabric of the Cosmos

Software
SuperDuper
Google Documents
Arabic OCR

0:00 Music and Intro
1:59 Kevin Dames on his weblog Copycense
2:48 Jill Hurst-Wahl on Digitization 101
4:16 Jill and Kevin on their article
4:34 SLA Digitization Workshop
5:24 Western NY Project
6:45 Digitization Expo
7:43 Lotfi Belkhir
9:00 Books to Bytes
9:26 Cornell and Microsoft Digitization
11:00 Scanning vs Digitization
11:48 Google Scanning
15:22 Michael Keller’s OCLC presentation
16:14 Google and the Public Domain
17:52 Authors Guild sues Google
21:13 Quality Issues
24:10 MBooks
26:56 Public Library digitization
27:14 Incorporating Google Books into the catalog
28:49 CDL contract
30:22 Microsoft Book Search
31:15 Double Fold
39:20 Print on Demand and Digitization
39:25 Books@Google
43:14 History on a Postcard
45:33 iPRES conference
45:46 LOCKSS
46:45 OAIS

Evergreen now has two support options

Evergreen, the open source ILS in use by the Georgia PINES libraries, now has a couple of support options. The developers behind the system have launched Equinox Software, modestly billed as “The Future of Library Automation.”

The company consists of members of the Evergreen development team as well as the Georgia Assistant State Librarian, Julie Walker. Libraries are being offered custom development, hosting, migration, and support.

This is an interesting development, and brings to mind some automation history, from NOTIS originating out of Northwestern to the original Innovative software coming from UC Berkeley.

Tag, you’re it

Another interesting tagging project from the art world is steve.museum, billed as “the first experiment in social tagging of museum collections,” which has recently been funded by IMLS for two years.

At a 17 November New York Technical Services Librarians meeting, Susan Chun of the Metropolitan Museum of Art said steve solves the problem of “additional access points, multilingual information, and things that aren’t often included in art catalog records, like color.” Though the audience was somewhat skeptical, Chun said “steve won’t replace anything, and tags must exist alongside traditional cataloging.”

Though tags like “you will die” may have nebulous value, the Met found that 92% of tags added new information that wasn’t present in traditional sources.

Active since 2005, the tag collection is being studied by social scientists at Princeton and the University of Michigan. Research questions include “What produces good tags?” as well as what types of tags and tag clusters emerge, examined using deduping and stemming analysis.
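To give a rough sense of what deduping and stemming mean for a tag collection, here is a minimal Python sketch. It is not the steve.museum researchers’ method: the suffix list is a deliberately crude, hypothetical stand-in for a real stemmer, and the sample tags are invented. Deduping collapses differences in case and spacing; stemming then groups tags that share a root.

```python
from collections import defaultdict

# Hypothetical, deliberately crude suffix list; a real study would
# use an established stemming algorithm instead.
SUFFIXES = ("ings", "ing", "s")


def normalize(tag):
    """Dedupe step: lowercase and collapse stray whitespace."""
    return " ".join(tag.lower().split())


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def cluster_tags(tags):
    """Group normalized tags under a shared stemmed key."""
    clusters = defaultdict(set)
    for tag in tags:
        cleaned = normalize(tag)
        key = " ".join(crude_stem(word) for word in cleaned.split())
        clusters[key].add(cleaned)
    return dict(clusters)


# Invented sample tags, as different visitors might enter them.
tags = ["Horse", "horses", " horse ", "Painting", "paintings", "blue"]
clusters = cluster_tags(tags)
for key in sorted(clusters):
    print(key, sorted(clusters[key]))
```

Six raw tags reduce to three clusters here, which is the kind of consolidation that makes it possible to ask which concepts, rather than which spellings, visitors are adding.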

The project includes an open API and open source download. Installation is quite simple (requiring PHP and MySQL), but the upload of images requires a custom XML schema for description.

LibLime powers Koha and now Evergreen

Making the link from a statewide project to something more:

LibLime is one option for libraries outside Georgia interested in Evergreen. Joshua Ferraro, LibLime president, said they provide “hosting, data migration, installation, training, and technical support” to libraries looking to switch to Evergreen. LibLime was retained by the PINES project during development to provide Quality Assurance and National Circulation Interface Protocol (NCIP) support.

Ferraro said, “[PINES] librarians are happy because both circulation and holds are up. Plus, suggestions are sometimes implemented within hours.” One benefit of open-source projects is the visibility of the bug list, where enhancements and problems can be openly tracked.

LibLime currently has four employees and partners with ten subcontractors, and is hiring another four employees for development and support with expectations of hiring four more in the near future. The company has 50 clients, including public libraries in Ohio and libraries in France, New Zealand, Switzerland, and Canada, and interest from large public library systems.

The company’s origins are in the Koha project, an open-source ILS developed for New Zealand libraries. Ferraro became familiar with Koha when he implemented it with Stephen Hedges, then-director at the Nelsonville Public Library, OH, and he was chosen as release manager for Koha version 3.

LITA National Forum 2006

“Shift Happens”
Preservation, entertainment in the library, and integrating Library 2.0 into a Web 2.0 world dominated the Library and Information Technology Association (LITA) National Forum in Nashville, TN, October 26-29, 2006.
With 378 registered attendees from 43 states and several countries, including Sweden and Trinidad, attendance held steady with previous years, though the Internet Librarian conference, held in the same week, attracted over 1300 librarians.

Free wireless has still not made it into technology conferences, though laptops were clearly visible, and the LITA blog faithfully kept up with sessions for librarians who were not able to attend.

Keynotes
The forum opened with a fascinating talk from librarians at the Country Music Hall of Fame entitled “Saving America’s Treasures.” Using Bridge Media Solutions in Nashville as a technology partner, the museum has migrated unique content from the Grand Ole Opry, including the first known radio session from October 14, 1939, as well as uncovering demos on acetate and glass from Hank Williams. The migration project uses open source software and will generate MARC records that will be submitted to OCLC.

Thom Gillespie of Indiana University described his shift from being a professor in the Library and Information Science program to launching a new program from the Telecommunications department. The MIME program for art, music, and new media has propelled students into positions at LucasArts, Microsoft, and other gaming companies. Gillespie said the program has practical value: “Eye candy was good but it’s about usability.” Saying that peering in is the first step but authoring citizen media is the future, he posed a provocative question: “What would happen if your library had a discussion of the game of the month?”

Buzz
Integration into user environments was a big topic of discussion. Peter Webster of St. Mary’s University, Halifax, Canada, spoke about how embedded toolbars are enabling libraries to enter where users search.

Annette Bailey, digital services librarian at Virginia Tech, announced that the LibX project has received funding for two years from IMLS to expand their research toolbar into Internet Explorer as well as Firefox, and will let librarians build their own test editions of toolbars online.

Presenters from the Los Alamos National Laboratory described their work with MPEG-21, a new standard from the Moving Picture Experts Group. The standard reduces some of the ambiguities of METS, and allows for unique identifiers in locally-loaded content. Material from Biosis, Thomson’s Web of Science, APS, the Institute of Physics, Elsevier, and Wiley is being integrated into cataloging operations and existing local Open Archives Initiative (OAI) repositories.

Tags and Maps
The University of Rochester has received funding for an open source catalog, which they are calling the eXtensible Catalog (xC). Using an export of 3 million records from their Voyager catalog, David Lindahl and Jeff Susczynski described how their team used User Centered Design to conduct field interviews with their users, sometimes in their dorm rooms. They have prototyped four different versions of the catalog, and CUPID 4 includes integration of several APIs, including Google, Amazon, Technorati, and OCLC’s xISBN. They are actively looking for partners for the next phase, and plan to work on issues with diacritics, incremental updates, and integrating holdings records, potentially using the NCIP protocol.

Challenge
Stephen Abram, of Sirsi/Dynix and incoming SLA president, delivered the closing keynote, “Web 2.0 and Library 2.0 in our Future.” Abram and Sirsi/Dynix have conducted research on 15,000 users, which highlighted the need for community, learning, and interaction. He asked the audience, “Are you working in your comfort zone or my end user’s comfort zone?” In a somewhat controversial set of statements, Abram compared open source software to being “free like kittens” and challenged librarians about the “My OPAC sucks” meme that’s been popular this year. “Do your users want an OPAC, or do they want information?” Stating that libraries need to compete in an era when education is moving towards the distance learning model, Abram asked, “How much are we doing to serve the user when 60-80% of users are virtual?” Saying that librarians help people improve the quality of their questions, Abram said that major upcoming challenges include 50 million digitized books coming online in the next five years. “What is at risk is not the book. It’s us: librarians.”

Open Source

When Linux and Apache were familiar mainly to the sysadmin community, Dan Chudnov had the foresight to develop a site for librarians to swap tips about open source software that would be useful in libraries.

Now, with libraries contemplating the advantages of home-grown open source library catalogs anew, open source expertise has become even more valuable.

The State of Georgia has made a commitment to developing an open-source catalog, so things are starting to get interesting.