Data 'born digital'
Without doubt this is the information age, but it is not the preservation age.
According to Yaniv Levi of Ex Libris, over 90% of today's data is 'born digital' and most likely has no analogue equivalent. So over 90% of our knowledge, culture, government, business and personal records are created and stored in digital form. This may include emails, datasets, blogs, texts, spreadsheets, databases, websites, Word or PDF documents, photographs, Tweets, online videos and so forth.
I recently volunteered to speak at a LIKE event about the 'Future of History', where I discussed the challenge of preserving knowledge in the digital age. Adrian Brown, Assistant Clerk of the Records at the Parliamentary Archives, was my guest speaker. Adrian discussed his own digital preservation experience at the House of Commons and some of the approaches used by the Parliamentary Archives to curate born digital assets.
Preserving 21st century data?
For centuries 'memory institutions' have safeguarded our national and cultural heritage, as a legal and moral duty. However, the UK does not have an official strategy for preserving digital assets and so responsibility for preserving much 21st century data is undefined.
Neil Beagrie and Maggie Jones define digital preservation as "the series of managed activities necessary to ensure continued access to digital materials for as long as necessary."
Essentially, the goal of digital preservation is to ensure continuous access to data that may have enduring value for the benefit of present and future generations. But digital preservation is no easy task.
Will files created today be accessible and readable 500 or even 50 years from now? If the technology becomes obsolete, or if the software or hardware (think floppy discs, punch cards, WordStar files) is outdated, your data may be inaccessible, unreadable or quite simply lost.
Ultimately, there can be no doubt that some digital assets need to survive in perpetuity, whereas the technology the data is stored on unfortunately has a short shelf life.
Digital dark age?
Digital culture is a fragile culture. There are well documented instances (Friendster, GeoCities) where internet services have shut down. A recent case is Google Video. Launched in 2005, it could not compete with YouTube, and so on 13th May 2011 the service was closed completely. Unless users downloaded their content or transferred it to an alternative service, their data would be deleted.
Such shutdowns should remind us of the vulnerable nature of digital material. What would happen to all the content on Facebook if it shut down tomorrow? Will email exist in 2045? Will Google, Gmail and Google Docs exist 50 years from now? If the services cease to exist, does the data cease to exist?
There are also social, cultural and legal challenges to address.
Should everything in the digital realm be preserved? If not, what should be preserved, and just as importantly what should not? Archival selection will determine what and whose stories get told, and whose do not.
While there may be a lot of 'junk' in the digital space, there is also undoubtedly a lot of historically and culturally significant material in digital form. Yet in an era of information overload we cannot save everything. It is therefore vital that every creator of digital data adopts a clear selection policy and an equally clear deletion policy. Digital preservation is also about what NOT to save.