Grey literature and research data

The theme of the 20th International Conference on Grey Literature was 'Research Data Fuels and Sustains Grey Literature'.

The twentieth International Conference on Grey Literature, organised by TextRelease and held this year on the campus of Loyola University in New Orleans, December 3-4, 2018, had the theme of 'Research Data Fuels and Sustains Grey Literature'.

Unlike larger conferences, this one is more of a boutique event. Only 65 people attended, 15 of them from outside the United States, and almost every attendee was involved with the conference, presenting either a paper or a poster. Although this one is billed as the 20th, the conference series is actually more than 20 years old. TextRelease’s first grey literature conference was held in December 1993 in Amsterdam and the second in Washington DC in 1995. It has alternated between the U.S. and Europe ever since.

In his welcoming address, Brian Hitson, Director of the Office of Scientific and Technical Information, OSTI-DOE, made the connection between open science and grey literature. Although open science evolved from open access, it is more than the text of a paper and should include code, visualisations, and data. Grey research objects should be treated as part of grey literature, alongside patents and technical reports. He recommended that the research community obtain DOIs for data and for software citations, suggesting DOECODE. Without the DOIs, Google won’t surface the research, particularly now that it’s introduced a dataset search engine. Hitson wondered out loud if artificial intelligence will replace DOIs and speculated that AI, particularly machine learning (ML), will predict where science is going.

Vivian Hutchison, Acting USGS Library Director, U.S. Geological Survey, in her keynote address, discussed some of the myriad types of earth and biological sciences data collected by the agency. It was particularly fascinating to watch time lapse photography of the Kilauea volcano eruption. USGS is firmly on board with open data, but creating a public access plan requires new workflow routines for scientists, a seismic culture shift. The agency maintains a data management website and uses the ScienceBase metadata wizard. Hutchison pointed also to the Science Data Catalog and urged that scientists be given credit for data, not just publication.

Can grey be too grey?

Dobrica Savic, from the Nuclear Information Section, International Atomic Energy Agency in Austria, asked whether it was possible for grey to be too grey. The European Union supports open government data, Russia defines open data to the extent of specifying data formats, and China is opening up its science data. Unstructured data from emails, PowerPoints, survey responses, and audiovisual files is becoming more prevalent, while adding metadata tags gives them a semi-structure. Unmanaged data can be risky, leading to security breaches and privacy invasions. Like Hitson, Savic considered the future role of AI. Learning from real data to create datasets for training purposes is important, as is augmenting ML algorithms when real data is too expensive to collect, inaccessible, or incomplete.

This year’s grey literature award winner, University of Wisconsin’s Tomas Lipinski, and his LAC colleague Kathleen Henderson, expanded on the legal issues surrounding the collection and use of, and access to, grey data. One issue is the possibility of a mix of copyrightable and non-copyrightable materials in a dataset. Daniel C. Mack, University of Maryland, presented a model for fulfilling legal and policy requirements to ensure compliant research data.

In 2011, reported Dominic Farace and Jerry Frantzen of GreyNet International, The Netherlands, and Joachim Schöpfel of University of Lille, France, GreyNet began an Enhanced Publication Project (EPP). Since then, EPP has been embedded into the workflow of GreyNet. A 2018 survey shows an increased willingness of authors to share their research data. Most recently, its Data Papers Project formed the basis for a training module for making the data paper a trusted tool in research and data sharing.

A panel discussion, moderated by Hitson, featured short presentations from Dobrica Savic, Lorrie Johnson (OSTI), Abe Lederman (Deep Web Technologies, Inc.), and Justin Fessler (IBM). Savic explained the basics of and how this global alliance supports open science, while Johnson concentrated on what makes the alliance unique. Both stressed its multilingual federated search capabilities, which Lederman expanded on in his presentation. Fessler added to the discussion by describing how IBM’s Watson’s cognitive search works with unstructured content, identifying context and applying predictive analytics. At OSTI, Watson is being used to index audio files.

Of the 15 posters on display, several amplified talks given during the conference. Lederman showed off GreyHub, which he describes as a “discovery service for grey literature” and Johnson’s was a visual representation of Other posters explored new grounds. Margret Plank, TIB Hannover, Germany, outlined the many services of the German National Library of Science and Technology, including supplying information, research data management, and supporting OA publishing. The poster showed TIB’s open knowledge research graph. Using data to tell a story was the topic for Julia Gelfand, University of California, Irvine, and Anthony Lin, Irvine Valley College. For those who see science as the only topic being discussed as grey literature, Robert E. Noel, Indiana University Libraries, had another take. His poster considered documentary film, television series, and investigative journalism as grey literature.

The combination of formal papers and poster sessions contributed to a lively, albeit small, conference. Video recordings for some of the sessions are now online. GL21 will be held at the German National Library of Science and Technology, Hannover, Germany, October 22-23, 2019.

Marydee Ojala is the editor-in-chief of Online Searcher and co-chair of the Internet Librarian International conference.