“Digital preservation is the set of management processes that ensure the long-term accessibility of digital information.”
Radical and rapid changes in the way information is created characterize present-day society. For example, politicians, ministers and presidents use Twitter, Facebook and other social media to answer questions, make statements or debate current affairs. This is relevant information worth preserving.
But how do we preserve this transient information? Or, even more crucially: how can we distinguish between important and useless information? Is the information trustworthy? Each year 10 billion tweets are posted, which brings us to the next issue: huge amounts of information have to be processed selectively. And finally, an appropriate way of storing the data has to be found. Memory institutions such as archives, museums and libraries face new challenges.
Arcomem, an EU-funded research project, concentrates on these questions. It is a consortium of twelve members from seven European countries, including Yahoo, the University of Sheffield, Athens Technology Center, Telecom ParisTech and the Internet Memory Foundation. Arcomem aims at more selective processing and a better-organized, collective transformation of information for web archiving. To this end, Arcomem is developing tools for intelligent content acquisition, so that archive creation for web content becomes more effective and more valuable. To see how Arcomem works in pictures, have a look at the descriptive video above.
But Arcomem is not the only initiative engaged with this challenging topic.
The Open Archival Information System (OAIS) describes a framework for a complete archival system that preserves information and makes it available to a designated community. Interestingly, OAIS was originally developed for data obtained from observations in space environments and was only later applied to preserving information on Earth.
The Library of Congress (LOC) established a pilot project to collect and preserve websites, in which only open-source software is used. DigiBoard is a tool that lets archivists select the websites they wish to archive. To acquire the target information, a web crawler such as Heritrix is used. A web crawler is an application that browses the internet in a methodical and ordered way. To access the gathered information and provide it to end users, LOC uses the Wayback tool.
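To make "methodical and ordered" a little more concrete, here is a minimal sketch of a breadth-first crawl in Python. It is not how Heritrix itself is implemented. The names `crawl`, `LinkExtractor`, `fetch` and the tiny in-memory `SITE` are assumptions made for illustration, and a real archival crawler would also fetch pages over HTTP, respect robots.txt, pause between requests and store what it finds in an archive format.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Visit pages breadth-first from `seed`, staying on the seed's host.

    `fetch` is a caller-supplied (hypothetical) function that maps a URL
    to its HTML text, or returns None if the page cannot be retrieved.
    Returns the URLs in the order they were visited.
    """
    host = urlparse(seed).netloc
    queue = deque([seed])   # pages waiting to be visited, oldest first
    seen = {seed}           # every URL ever queued, to avoid repeats
    visited = []

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)

        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            target, _ = urldefrag(urljoin(url, href))
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)

    return visited


# A tiny in-memory "website" stands in for real HTTP requests.
SITE = {
    "http://example.org/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.org/a": '<a href="/b">B</a> <a href="http://other.example/">out</a>',
    "http://example.org/b": '<a href="/">home</a> <a href="/a#top">A</a>',
}

order = crawl("http://example.org/", SITE.get)
print(order)
```

The queue is what makes the crawl ordered: pages are visited in the order they were discovered. The `seen` set keeps the crawler from visiting the same page twice, and the host check keeps it inside the website that was selected for archiving.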
In 2003, LOC and ten other libraries from Europe and Canada formed the International Internet Preservation Consortium (IIPC). Its members work with similar tools and pursue the same goals as the LOC’s project.
Lastly, not only memory institutions but also individuals should take steps to preserve their digital memories. Think of your favorite blog, your music playlists (especially if you use Last.fm, Spotify and similar music services) or your media library on Instagram or Facebook, with its loads of beloved pictures.
Digital preservation, with its main focus on web archiving, is of increasing importance. The trend towards cloud storage (such as Google Drive, Apple’s iCloud, Dropbox and Scribd) and other file hosting services means that most of our documents will no longer be saved on our own hard disks but online. We therefore need, more than ever, to ensure that a ‘real’ copy of our documents exists.