Brave New World: the impact of Artificial Intelligence

Gary Horrocks reports back from UKeiG's special event focused on AI in the information sector.


Searching for meaning in text

David’s presentation segued perfectly into an informative and thought-provoking crash course in the fundamentals of natural language processing (NLP) from Dr. Tony Russell-Rose, Director of UXLabs and Founder of 2dSearch. He covered the terminology, techniques and applications, and how NLP interfaces with AI. There was, he argued, a significant overlap between the two, but NLP was a subfield of AI and closely aligned to computer science and programming.

The primary objective of NLP is to disambiguate language and search for meaning in text. It is a major growth area and a multi-faceted field of research that includes basic text processing, text mining, the human-computer interface, language modelling and lexical semantics.

NLP research faces monumentally complex obstacles when confronted with linguistic phenomena. Language is ambiguous, and the key tenet of NLP is resolving that ambiguity. Tony illustrated the point with some amusing examples of newspaper headlines:

“Prostitutes appeal to Pope.”

“Drunk gets nine years in violin case.”

“Miners refuse to work after death.”

Tony articulated some of the linguistic dilemmas that make NLP so problematic:

  • Polysemy, where a word maps to many different concepts – e.g.: Bat (sports), Bat (small animal with wings), BAT (British American Tobacco)
  • Synonymy, where one concept maps to many different words – e.g.: Hardworking: diligent, determined, industrious, enterprising
  • Word order – e.g.: Venetian blind versus blind Venetian
  • Stop word removal – e.g.: The Who, Take That, “To be or not to be”
  • Stemming – e.g.: fish, fisher, fishing
  • Parsing (analysing a string of text into logical syntactic components) – e.g.: “I saw the man on the hill with a telescope”
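
Three of the dilemmas above (stop word removal, stemming and word order) can be sketched in a few lines of Python. This is a minimal illustration and not something shown at the event: the stop word list, the suffix list and all function names are hypothetical, and the stemmer is a naive suffix stripper rather than an established algorithm such as Porter's.

```python
import re

# Hypothetical, deliberately tiny stop word list; real toolkits ship much longer ones.
STOP_WORDS = {"the", "to", "be", "or", "not", "that", "a", "of"}

# Naive suffix list for illustration only; this is not the Porter algorithm.
SUFFIXES = ("ing", "er", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop every token that appears in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def bag_of_words(text):
    """Order-insensitive representation: the sorted tokens of the text."""
    return sorted(tokenize(text))


# Stop word removal damages names and quotations made of common words.
print(remove_stop_words(tokenize("The Who")))             # ['who']
print(remove_stop_words(tokenize("Take That")))           # ['take']
print(remove_stop_words(tokenize("To be or not to be")))  # []

# Stemming conflates related word forms into one stem.
print([naive_stem(t) for t in ["fish", "fisher", "fishing"]])  # ['fish', 'fish', 'fish']

# Ignoring word order makes two different phrases indistinguishable.
print(bag_of_words("Venetian blind") == bag_of_words("blind Venetian"))  # True
```

The point of the sketch is that each simplification loses information: the Shakespeare quotation vanishes entirely, a "fisher" becomes indistinguishable from a "fish", and the window covering cannot be told apart from the person.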

Tony argued that language is constantly changing. “I want to buy mobile” would have been meaningless twenty years ago, and would still be meaningless today in the United States, where “cell phone” is the popular parlance. How would you go about analysing sarcasm, irony, jargon and slang? Idiomatic language poses similar problems. (He was a “dark horse.” She “threw in the towel.”) In a rapidly changing world neologisms are also prevalent. Social media alone has generated many: “unfollow” and “retweet”, for example.

Tony went on to list some of the disciplines that are researching solutions to these problems. Computer science underpins all of this research:

  • Text analytics – linguistic, analytical and predictive techniques to extract structure and meaning from unstructured documents
  • Computational linguistics – the use of computational techniques to study linguistic phenomena
  • Cognitive science – research into human information processing
  • Information science – the analysis, classification, retrieval, manipulation and dissemination of information

Tony concluded by providing numerous examples of NLP toolkits and applications, including spaCy, TextBlob and Apache OpenNLP.

Challenges and opportunities

AI is very much a reality. It is such an all-embracing term that it includes a multiplicity of technologies and applications at various stages of development. Some innovations and technologies may take years to come to fruition; others are already having an impact on resources and services here and now. Voice recognition, virtual assistants and chatbot services are typical examples.

There are huge challenges and opportunities for the knowledge, information management and library sector. The benefits are obvious, and this year’s UKeiG Members’ Day captured just a few examples of the huge potential that AI offers in transforming digital publishing and information retrieval with the development of analytical tools that identify, extract and analyse text. Information science is a key tenet of AI and the profession is well placed to lead in developments and research in this emerging discipline.

The spectre of disintermediation, the impending fear of redundancy, has always haunted the profession. Online in the 1970s, CD-ROM in the 1980s and the growth of the Internet in the 1990s all seemingly threatened to displace the library and information professional, but the sector has always risen to the challenge. A key concern about AI is the lack of trust: the potential for bias based on flawed algorithms. Cynicism about Google’s search results provides a typical example of the pitfalls that lie ahead. The information professional is well placed to question those algorithms, to identify and check bias and assure quality. The LIS workforce will be required to refine, review and evaluate AI applications and build business cases for them. AI is also an iterative technology, so will always require human intervention and “training.”

An exciting road lies ahead.

This is an edited version of an article from UKeiG's online journal eLucidate. 

UKeiG is a special interest group of CILIP and is free to join for all CILIP members.
