AI Tools for Digital Libraries: Enhancing User Experience and Trust

The Czech Digital Library has been testing AI-powered features in its interface since 2023 and is now the national aggregator of 15 of the largest Czech digital libraries.


As generative AI (GenAI) and semantic technologies rapidly enter the public sphere, digital libraries are increasingly expected to offer intuitive LLM-based functions and interfaces alongside traditional tools. Large language models (LLMs), multimodal AI and semantic search can transform user-facing services in digital libraries. Crucial design decisions determine their usefulness, reliability and trustworthiness.

Czech Digital Library

In the Czech Republic, most digital libraries are based on the open-source Kramerius digital library system. A wide array of digital libraries exists, established by large libraries directly under the Ministry of Culture, such as the National Library and the Moravian Library, as well as specialized libraries, university libraries, regional libraries, and even some smaller institutions. As there are almost 50 installations of Kramerius, the landscape is quite fragmented.

The Czech Digital Library was conceived as a common index and user-centric front-end with the aim of providing users with a single point of entry. Currently, the Czech Digital Library provides access to 350,000 documents comprising 150 million pages (82% are copyright protected). Since 2019, it has also served as the official national aggregator for modern library documents, forwarding data to the Europeana Digital Library.

Using LLMs in Czech Digital Library

For over a year, the Czech Digital Library has enhanced its public domain digital content by integrating external AI services for translation, page summaries and text-to-speech. These AI features are available to users for a single page or a selected portion of it. They work for publicly accessible documents, both scanned documents available in JPG/JPG2000 formats and born-digital documents accessible as PDFs. To use the AI services, users must be logged in as a registered user of one of the Czech libraries or at least with a Gmail account.

The services allow the use of the OCR text layer to translate documents into more than 10 languages, enable text-to-speech functionality and summarize page content into a few quickly digestible points. Although these features are relatively limited, they provide tangible benefits by enabling users to work with documents in languages they do not understand, quickly analyse document content, or absorb information by listening, particularly benefiting users with special needs.

We are currently testing an LLM interface for querying either a page or a document. The testing phase includes enhanced features that allow users not only to summarize a page but also to query the AI with open-ended questions about the content of the displayed page. Additionally, the testing involves summarizing entire documents, such as monographs, articles, or newspaper issues, and querying these entire documents. Although this option is not available in the production environment, for testing purposes it is possible to easily switch between different external AI services and their models from the user interface to observe varying responses.

For querying, we have been testing a range of models from OpenAI, Anthropic and Google to get a sense of how different (and differently priced) models respond to user queries. For translations, Google Translate and DeepL were tested first, but because DeepL does not support Latin we decided to use only Google Translate, even though it has some disadvantages as well. Early in testing, we discovered that creating summaries in a language different from the original is more effective when the document is first translated. This approach prevents models from inadvertently switching back to the original language partway through the summary, which keeps the summary consistent. For text-to-speech, we have tested services from Google, OpenAI and ElevenLabs. As each of these services has its own advantages and disadvantages, we allow the user to pick a model and a voice for each target language.
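As an illustration only, the translate-first ordering described above could be sketched as follows. The function names and the stand-in services are hypothetical; they are not the library's actual code and call no real translation or LLM API.

```python
from typing import Callable


def summarize_in_language(
    text: str,
    target_lang: str,
    translate: Callable[[str, str], str],
    summarize: Callable[[str], str],
) -> str:
    """Translate first, then summarize, so the summarizer only sees target-language text."""
    translated = translate(text, target_lang)
    return summarize(translated)


# Hypothetical stand-ins for external services, used only to make the sketch runnable.
def fake_translate(text: str, target_lang: str) -> str:
    return f"[{target_lang}] {text}"


def fake_summarize(text: str) -> str:
    # Truncation stands in for a real summarization call.
    return text[:40]


summary = summarize_in_language(
    "Dlouhý český text o knihovnách.", "en", fake_translate, fake_summarize
)
```

Passing the services in as callables keeps the ordering independent of any particular provider, which would fit a setup where several services are being compared.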

Currently, all these services are implemented solely in the digital library front-end to accelerate user testing and interface improvements. They have not yet been integrated into backend systems where some of these features might ultimately belong. Since all significant online AI services require payment for extensive use, we require user authentication and route all AI service requests through a common proxy, monitoring token usage and setting usage limits. This gives us valuable data on the real use of the AI services and protects us from the numerous LLM crawlers that ignore the robots.txt settings.
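The usage limits mentioned above might, for example, be enforced with a per-user token counter inside the proxy. The sketch below is a minimal in-memory version under that assumption; the class name, method names and the limit value are all hypothetical, and a real proxy would persist the counters and reset them periodically.

```python
from collections import defaultdict


class TokenBudget:
    """Tracks tokens consumed per user and refuses requests that exceed a limit."""

    def __init__(self, limit_per_user: int) -> None:
        self.limit_per_user = limit_per_user
        self.used = defaultdict(int)  # user id -> tokens consumed so far

    def try_consume(self, user_id: str, tokens: int) -> bool:
        """Record the usage and return True only if the request fits in the budget."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used[user_id] + tokens > self.limit_per_user:
            return False
        self.used[user_id] += tokens
        return True

    def remaining(self, user_id: str) -> int:
        """Tokens the user may still spend."""
        return self.limit_per_user - self.used[user_id]


budget = TokenBudget(limit_per_user=1000)
first = budget.try_consume("reader-1", 600)   # fits in the budget
second = budget.try_consume("reader-1", 600)  # would exceed the limit, so it is refused
```

Because requests are attributed to an authenticated user, the same counters double as the usage data the text describes.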

Newspaper Memory Project

For a newspaper memory project, we indexed 25 newspaper titles dating from 1880 to 1914, totalling approximately 500,000 pages. Due to the absence of precise page segmentation data, we divided the text into approximately 10 million chunks using segmentation heuristics. We generated vector representations for each chunk. Users could then ask questions using natural language. We implemented RAG (Retrieval-Augmented Generation) to ensure that the LLM gives consistent answers based on the most relevant texts, including clickable references to the original scanned pages.
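The chunk, embed, retrieve and prompt steps described above can be sketched in a few lines. This is a toy illustration under loud assumptions: fixed-size word windows replace the project's segmentation heuristics, a bag-of-words count vector replaces a learned embedding model, and all function names and sample pages are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Split text into fixed-size word windows (a stand-in for real segmentation)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a learned vector model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[tuple[str, str]], k: int = 2) -> list[tuple[str, str]]:
    """Return the k (page_id, text) chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c[1])), reverse=True)[:k]


def build_prompt(question: str, hits: list[tuple[str, str]]) -> str:
    """Assemble the LLM prompt; page ids let the answer link back to the scans."""
    context = "\n".join(f"[{page_id}] {text}" for page_id, text in hits)
    return f"Answer using only these sources and cite their ids.\n{context}\n\nQuestion: {question}"


chunks = [
    ("page-1", "The city council opened a new railway station in Brno."),
    ("page-2", "Wheat prices fell sharply at the autumn market."),
    ("page-3", "A flood damaged the bridge over the river."),
]
question = "When was the railway station opened?"
hits = retrieve(question, chunks, k=2)
prompt = build_prompt(question, hits)
```

Keeping the page identifier alongside each chunk is what makes the clickable references possible: the model is asked to cite ids, and each id maps back to a scanned page.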

Discover a Book Using Semantic Metadata Search

Another experiment involved bibliographic data aggregated by the Moravian Library for its Central Portal for Libraries. We enriched MARC records of monographs with publisher and additional annotations, testing the effectiveness of natural language searches within library catalogues. Early testing suggests the need to create relevant AI summaries for books that have already been digitized but lack any annotation.

A poster describing the Digital library is here

Hybrid interface

Our experiments, as well as other considerations, led us to the decision to develop a new version of the Czech Digital Library front-end. Our aim is to balance the precision of keyword search with the flexibility of semantic understanding to create a single hybrid interface. This interface will also enable image search by textual description as well as image similarity search. To make the digital library content more accessible, we intend to identify named entities and pre-generate summaries for all documents. The biggest challenge will probably be to apply retrieval-augmented search properly. Our current experience suggests that RAG-style querying returns the best results when applied to a consistent group of relevant documents (be it newspapers, research articles, fiction, etc.), not the whole digital library content. Also, including a conversation-style interface will likely improve the user experience.
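One common way to merge a keyword result list with a semantic one is reciprocal rank fusion (RRF). The sketch below shows that technique purely as an example of how such a hybrid could work; the text does not say which fusion method the new front-end uses, and the document ids and the constant k = 60 are assumptions.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each document scores sum(1 / (k + rank)) over the lists it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = ["doc-a", "doc-b", "doc-c"]   # hypothetical keyword search results
semantic_hits = ["doc-b", "doc-d", "doc-a"]  # hypothetical vector search results
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

RRF uses only rank positions, so it needs no calibration between keyword scores and vector similarities, which are otherwise not directly comparable.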

During development, we had to take into account legal limitations on copyrighted materials. Beyond restricting direct access, these limitations affect whether documents may be translated into other languages, how long quotations may be, and whether we can use external language models or only locally running ones.

Our next steps will include detailed testing of the newly developed user interface, which is now almost finished, as well as the integration of advanced AI functionalities. Simultaneously, we aim to gather usage data to support the effective integration of these AI features into digital libraries and catalogues.

Martin Lhotak (lhotak@knav.cz) is Deputy Director, Library of the Academy of Sciences of the Czech Republic; Petr Zabicka is Associate Director, Moravian Library; Filip Kersch is Manager, Digital Library of CAS; Fillip Jebway is Manager, OSDD, Moravian Library; Jan Rychtar is CEO, Trinera s.r.o.