Big Data and Records Management

John Davies discusses the role of records managers in making sense of Big Data - and explores five Big Data myths.


What is Big Data?

Big Data is a recently hatched consultancy theme that refers to our ability to bring together vast collections of data, to analyse them quickly, and to draw potentially novel conclusions from them. Working with big data sets is not new: some sectors, such as pharmaceuticals and finance, have experience gained over decades. However, the language used to describe the opportunities offered by the new Big Data is challenging and aspirational: Big Data, we are told, is transformational and revolutionary, will deliver epiphanies, will render policy makers redundant, and will change the way we think about all aspects of our daily life. In the week when the Edward Snowden story has been breaking across the media, it is important to remember that with such unfettered power comes a threat to privacy, identity, and freedom of action. Big Data requires Big Judgment.

Myth 1: This is something new

What is new is not the Big or the Data but the analytical tools, which have become more sophisticated and accessible in recent years. As for the rest, the term 'information explosion' first occurs in 1941, according to the OED, and it is no coincidence that in the same year Emmett Leahy and colleagues in the US Navy began work on the first Records Management manuals for use in administration across the Pacific theatres of war. Classic records management is, after all, a set of administrative processes designed to manage large quantities of documentation.

The term 'big data' appears in use in 1997 in the proceedings of the 8th IEEE Conference on Visualization. Michael Cox and David Ellsworth open their article with a still useful and, indeed, prophetic definition:

Visualization provides an interesting challenge for computer systems: data sets are generally quite large, taxing the capacities of main memory, local disk, and even remote disk. We call this the problem of big data. When data sets do not fit in main memory (in core), or when they do not fit even on local disk, the most common solution is to acquire more resources. 

Myth 2: Big means big

Big, meanwhile, has acquired some popularity and notoriety as the adjective of choice among politicians. The Big Bang (1986) marked the liberalisation of rules in the finance sector and prompted the growth of London as the world’s financial hub; the Big Conversation (2003) was Tony Blair’s attempt to set out a fairness agenda; and David Cameron’s Big Society (2010) is still trying to get citizens to pay for things instead of the Welfare State.

In this world, Big is vaguely described and implies something that is better than what came before (when the other party was in power).
