Highlights from the Linked Data for Libraries all users group meeting

On Nov. 7, in our second 2018 Users Group virtual gathering, MCLS hosted an hour-long webinar featuring two speakers. Presiding over the conversation was Andrea Kappler, Committee Chair, MCLS Linked Data for Libraries Users Group Steering Committee (Evansville Vanderburgh Public Library).

We first heard from Andrew Weidner, Digital Operations Coordinator for the University of Houston Libraries, in a presentation entitled Planting Cedar: Local Authority Control in a Linked Data Ecosystem. He described the development of a suite of preservation, archival, and metadata software tools which, when combined with a complementary workflow, yield a platform from which existing linked data may be used and new linked data may be generated.

The workflow, in turn, relies on an automated vocabulary manager and thesaurus (named Cedar), through which new metadata is validated. The metadata application (called Brays in the U-H system) provides an image viewer, a metadata display, and the validation path. Existing linked data is drawn from familiar sources such as the Library of Congress (LC) authorities and the Virtual International Authority File (VIAF), while new authority URIs are created through a persistent identifier (Archival Resource Key) generator.
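For readers who want a concrete picture of the validate-then-mint pattern described above, here is a minimal Python sketch. It is not the actual Cedar or Brays code: the function names, the in-memory vocabulary, the example.org resolver, and the placeholder NAAN 99999 are all assumptions made for illustration.

```python
import uuid

RESOLVER = "https://vocab.example.org"  # hypothetical resolver address
NAAN = "99999"  # placeholder Name Assigning Authority Number (assumption)


def mint_ark():
    """Return a new ARK-style URI with a random name part."""
    return f"{RESOLVER}/ark:/{NAAN}/{uuid.uuid4().hex[:10]}"


def validate(terms, vocabulary):
    """Split terms into those found in the vocabulary and those missing."""
    known = {t: vocabulary[t] for t in terms if t in vocabulary}
    unknown = [t for t in terms if t not in vocabulary]
    return known, unknown


def add_term(label, vocabulary):
    """Register a new local authority under a freshly minted ARK."""
    uri = mint_ark()
    vocabulary[label] = uri
    return uri


# Usage: one term is already controlled, one needs a new authority URI.
vocabulary = {"Houston (Tex.)": mint_ark()}
known, unknown = validate(["Houston (Tex.)", "Buffalo Bayou (Tex.)"], vocabulary)
for label in unknown:
    add_term(label, vocabulary)
```

In a production setting the vocabulary would live in a dedicated service rather than a dictionary, and a new term would normally be reviewed by a cataloger before an identifier is minted.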

This modular ecosystem approach has the potential to be adapted to other software that may already be in place in other (particularly higher education) contexts. Weidner closed with a look to the future, where the system could ultimately be scaled up to include an interface with existing triplestores; where linked data fragments become searchable; and where the developed resource might become more widely accessible through a Hyrax repository.

Pertinent links to related presentations:

The second presenter was Andrew Pace, OCLC’s Executive Director of Technical Research. His presentation, entitled Linked Data: From Promise to Progress, described a prototype pilot project engaging some 16 institutional partners, and involving OCLC’s Research, Global Technology, and Global Product Management units.

After establishing that OCLC’s ongoing commitment to linked data must also include indefinite support for MARC, he described the project, which ran from December 2017 to September 2018. Building upon the disambiguation capabilities already evident in Wikipedia and Wikidata, the project used MediaWiki’s existing search, autosuggest, APIs, multilingual user interface, and other features, then combined them with Wikibase’s capacities for linked data, SPARQL, and structured data editing.
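To illustrate the two kinds of lookup mentioned above, the following Python sketch builds an autosuggest request for Wikibase's `wbsearchentities` API action and a simple SPARQL query that matches an entity by label. It only constructs the request strings and does not contact any server. The endpoint address and the function names are hypothetical, and nothing here is taken from the OCLC prototype itself.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://wikibase.example.org/w/api.php"  # hypothetical instance


def autosuggest_url(prefix, language="en", limit=5):
    """Build a request URL for the wbsearchentities API action."""
    params = {
        "action": "wbsearchentities",
        "search": prefix,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def label_query(label, language="en"):
    """Build a SPARQL query for entities whose rdfs:label matches exactly."""
    # Escape backslashes and double quotes so the label is a valid literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?item WHERE {\n"
        f'  ?item rdfs:label "{escaped}"@{language} .\n'
        "}\n"
        "LIMIT 10"
    )


# Usage: a partial name for autosuggest, a full label for the SPARQL lookup.
suggest = autosuggest_url("Octavia But")
query = label_query("Octavia E. Butler")
```

The autosuggest call suits interactive entry, where a cataloger types a few characters and picks from candidates, while the SPARQL query suits exact matching and more elaborate graph patterns.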

The resulting resource facilitated normalization of disparate data sources, efficiencies through autosuggest, and a greater number of querying options. Further workflow enhancement came through the use of OpenRefine in conjunction with new APIs, and another layer of batch processing through Pywikibot.

Pace spoke of the pilot project’s discoveries, which included the desirability of partnerships; the importance of providing access to more obscure linked data, only available in less visible places, and only identifiable through community input; and the ways that linked data differs from traditional library authority work.

Forthcoming from OCLC are more publications and presentations, involving development of more prototypes. The guiding principles of this ongoing research include a need for scalability for multiple contexts; ease of access, input, and output; and an absolute focus on service and Web visibility.

The two presentations were followed by an informal question-and-answer session. Information and recordings from this meeting and previous Linked Data for Libraries Users Group meetings can be found on the Group’s webpage.

We on the Steering Committee continue to look for new presenters for future Users Group gatherings, and are always on the lookout for intriguing approaches to the use of linked data in libraries and beyond (send your ideas to me at!). As a committee, we connect monthly, and host two virtual All Users gatherings annually. Although we know that development seems slow, these presentations reminded us that a lot of resources are going into research and development at particular institutions, and it is just a matter of time before this work affects those of us in less-well-equipped contexts. The IT energy going into these efforts will ultimately increase the accessibility, procurement, and distribution of all manner of resources, and we look forward to that day when this shared data will enhance our collective life together.

Aaron Smith
Assistant Manager, Support Services
The Genealogy Center
Allen County Public Library (IN)