Interoperable and open bibliographic data
Opening up bibliographic data to new audiences can increase visibility of collections and improve discoverability.
This week the Library at Harvard University released 12 million bibliographic records into the public domain, affirming its commitment to open access and collaboration. The web has fostered an environment in which users expect content to be openly accessible (very often at no cost to them), and many institutions are doing their best to oblige.
In 2007 a group of people interested in metadata models came together at The British Library. What became known as 'the London meeting' is now seen as a critical point in the role libraries play in linked data and the semantic web.
Five years on, a group of interested parties met at the British Library (BL) to reflect on the progress made and to discuss sustainability and next steps.
The web environment meant that new ways of delivering bibliographic information needed to be explored. New collaborations of interested parties, including IFLA, RDA and publishers, as well as the semantic web community, became involved in developing principles and standards that moved beyond traditional cataloguing rules towards well-formed metadata sets. By 2005, RDA (Resource Description and Access) had emerged as a new content standard.
RDA focuses on entities and descriptions of them. It helps build linked clusters, making relationships explicit and enabling bibliographic data to be integrated with the wider web environment. Data can then be mined and displayed in different ways, depending on user needs, opening up new pathways for users.
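As a rough illustration (not the actual RDA vocabulary or any library's real data), entity-centric description can be sketched as subject-predicate-object statements; once relationships are explicit, the same data can be queried and regrouped to suit different user needs. All identifiers and predicate names below are hypothetical:

```python
# A minimal sketch of entity-relationship style bibliographic data,
# expressed as plain subject-predicate-object triples.
# Every identifier and predicate name here is a made-up illustration.
triples = [
    ("work:hamlet", "title", "Hamlet"),
    ("work:hamlet", "createdBy", "person:shakespeare"),
    ("person:shakespeare", "name", "William Shakespeare"),
    ("item:hamlet-1987", "exemplifies", "work:hamlet"),
    ("item:hamlet-1987", "publishedIn", "place:london"),
]

def query(triples, predicate, obj):
    """Return every subject linked to `obj` via `predicate`."""
    return [s for s, p, o in triples if p == predicate and o == obj]

# The same data answers different questions, depending on user needs:
print(query(triples, "createdBy", "person:shakespeare"))  # works by an author
print(query(triples, "exemplifies", "work:hamlet"))       # items of a work
```

Because nothing about the display is baked into the records themselves, a catalogue interface, a researcher's script and a linked data aggregator can each traverse the same statements differently.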
Meeting the challenge of open data
The British Library's Metadata Services team developed a project to increase the visibility of its holdings and make a critical mass of data freely available. The aim was to satisfy the varied needs of different markets (researchers, linked data developers and libraries) and to remove barriers to innovation without negatively affecting the BL's revenue streams.
The decision was taken to release British National Bibliography (BNB) data. The data had been held in a uniform format for many years, and the content could be enriched by linking it to external sources such as GeoNames.
The dataset, comprising over 2.6 million records, was released in July 2011. Traffic to the data spiked, and the project had the desired effect of increasing resource visibility without affecting income. Key lessons:
- If the data is interesting, it will be used
- We should all be focusing on data modelling and sustainability
- Try not to reinvent the wheel
- Everyone is learning!
- Offer sample data, welcome feedback and continually improve
The National Library of Canada is working on a project that pools data and efforts from five partners to bring together content about the First World War.
The libraries of the University of Cambridge undertook a JISC-funded project (COMET) in 2011 which exposed almost the entire library catalogue not just to 'traditional' audiences but also to non-library developer communities.
Europeana will be making its metadata freely available from July 2012. The Europeana Data Model (EDM) distinguishes between the cultural object and its digital representations, thus allowing multiple records to exist for the same object. A number of standards are being used in the project, including Dublin Core and SKOS. Europeana also aims to bring research libraries into the fold.
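The EDM distinction can be sketched in miniature: the cultural object is one resource, and each digital representation is a separate resource pointing back at it, so several providers can each contribute a record about the same object. The identifiers, field names and providers below are hypothetical, not actual EDM classes or Europeana data.

```python
# Sketch of the EDM idea: the cultural object is modelled separately
# from its digital representations, so multiple records from different
# providers can describe the same object.
# All identifiers, field names and providers are hypothetical.
cultural_object = {"id": "object:example-painting", "title": "An example painting"}

digital_representations = [
    {"id": "web:museum-scan", "represents": "object:example-painting", "provider": "Museum A"},
    {"id": "web:archive-photo", "represents": "object:example-painting", "provider": "Archive B"},
]

def representations_of(object_id, representations):
    """All digital records that describe one cultural object."""
    return [r for r in representations if r["represents"] == object_id]

print(len(representations_of("object:example-painting", digital_representations)))  # 2
```

Collapsing object and representation into a single record, by contrast, would force a choice between duplicating the object's description and discarding one provider's contribution.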
Libraries should be enabling flexible, integrated discovery and delivery frameworks. A number of recommendations and calls for action emerged from the meeting, including:
- Open up as much data as possible
- Plan for RDA
- Dare to experiment - perhaps collaborating with third parties
- Focus on aggregation
- Open reusable metadata
Much progress has been made in the five years since the first London meeting. Ongoing development and collaboration can only bring new, innovative services and products into the wider community.
Photo courtesy of tribalicious via Flickr.