The "CASPAR" Project
Digital data – here today, gone tomorrow
© European Union
CASPAR project: An inter-sectorial project sponsored by the European Commission (EC) under the leadership of the Natural Sciences Sector for the preservation of Cultural, Scientific and Artistic digital data.
(This article is based on an article by the European Space Agency, posted on ESA’s website on 14 August 2006.)
The amount of digital data being produced across various disciplines is increasing at an exponential rate. But this information may not be around for future generations, because the data is often incompatible with rapidly changing technologies and becomes unreadable. To address this risk, ESA is assisting a European Union-backed project for the preservation of fragile digital information.
Thanks to the excellent cooperation between ESA and UNESCO, ESA invited UNESCO to participate in a new EC-sponsored project.
The large-scale project called CASPAR (Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval) will build a pioneering framework to support the end-to-end preservation lifecycle for digital information based on existing and emerging standards.
Project Co-ordinator Dr David Giaretta explains: "It is widely recognized that the digital information on which we all rely is actually remarkably fragile. Society needs to ensure that digitally encoded information can still be understood and used in the future when the software, systems and everyday knowledge will have changed. Things we take for granted now would otherwise be completely unfamiliar, something to be guessed at, even if we preserve the bits and bytes."
"Moreover, in many currently planned and future experiments, more data will be generated than has been collected in the whole of human history."
Of particular importance is the huge breadth of users and types of digital information against which CASPAR will be tested: science (using ESA satellite data and a variety of science data from the Council for the Central Laboratory of the Research Councils (CCLRC)), cultural heritage (using data from UNESCO - the United Nations Educational, Scientific and Cultural Organization) and performing arts (including data from the Institut National Audiovisuel, Groupe de Recherches Musicales (INA-GRM) and the Institut de Recherche et de Coordination Acoustique-Musique (IRCAM) - French institutions that fostered the development of electronic music).
Cultural and artistic data
Mayan city in the Yucatan Peninsula
Data to be provided by UNESCO will be centered on UNESCO-inscribed sites (World Heritage sites and Biosphere reserves). The data will include legal texts, site descriptions, historic documents, books, paper photos, slides, satellite images, maps, etc.
Nemrut dag - Turkey
Partners of the UNESCO-ESA ‘Open Initiative on the use of space technologies to support the World Heritage Convention’ will also provide sample data. For example, information will be provided on the two Buddhas carved into the cliffs of Bamiyan in Afghanistan around the third century A.D., which were destroyed in 2001. The site was inscribed as a World Heritage site in 2003, after the destruction of the Buddhas. The cultural site of the Via Appia (Rome) will also be a main sample, with laser scanner measurements, satellite images used to model the associated cultural landscapes, virtual tours and virtual reconstructions.
Kromeriz gardens, Czech Republic
Artistic data from INA-GRM, which holds the archives of French public radio and TV, and from IRCAM will focus on electronic music. The aim is to preserve components of scores, pieces of computer code, instructions and documents indicating the author’s motivations, so as to retain intelligibility, or at least the minimal understanding necessary to perform the work again.
Envisat sensing the Earth
Protecting data acquired by satellites for future generations is of utmost importance because it allows for the continuity of datasets. For instance, scientists accessing today’s climate change data in 50 years will be able to better understand and detect trends in global warming and apply this knowledge to ongoing natural phenomena.
The volume of data generated in environmental science is projected to increase radically over the next few years. ESA satellites, such as Envisat, ERS-2 and Meteosat Second Generation, are currently generating around 1 000 gigabytes of data per day. With the upcoming launch of the new MetOp satellites, the daily data volume generated by ESA will increase at an even faster rate. ESA’s mandate is to maintain archives of data gathered from satellites for 10 years after the end of the mission. Currently, ESA is using funds from various ongoing programs to maintain these historical bit streams in accessible archives.
Sustainable preservation of this information in the long term will require the logical integration of many more pieces of data and objects, such as the conditions under which the instruments were operated, the system and software environment used to gather the signal and the algorithms used for manipulating the acquisition bit stream. All this information is required systematically for all instruments and missions, in a dedicated programmatic vision.
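As a rough illustration of what such logical integration might look like, the minimal sketch below bundles a raw bit stream with the three kinds of context mentioned above and records a checksum so that later corruption can be detected. The class name, field names and sample values are hypothetical and are not taken from CASPAR or from any ESA system; the use of SHA-256 is likewise an assumption made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PreservationPackage:
    """Hypothetical bundle: a raw bit stream plus the context needed to interpret it."""
    mission: str
    instrument: str
    bit_stream: bytes
    # Context named in the article; the structure of each entry is an assumption.
    operating_conditions: dict = field(default_factory=dict)
    software_environment: dict = field(default_factory=dict)
    processing_algorithms: list = field(default_factory=list)
    checksum: str = ""

    def __post_init__(self):
        # Record a SHA-256 digest at creation time if none was supplied.
        if not self.checksum:
            self.checksum = hashlib.sha256(self.bit_stream).hexdigest()

    def is_intact(self) -> bool:
        """Return True if the bit stream still matches the recorded digest."""
        return hashlib.sha256(self.bit_stream).hexdigest() == self.checksum

    def missing_context(self) -> list:
        """List the context fields that are still empty."""
        required = {
            "operating_conditions": self.operating_conditions,
            "software_environment": self.software_environment,
            "processing_algorithms": self.processing_algorithms,
        }
        return [name for name, value in required.items() if not value]


# Placeholder values only; they do not describe real GOME data.
package = PreservationPackage(
    mission="ERS-2",
    instrument="GOME",
    bit_stream=b"\x00\x01\x02",
    operating_conditions={"note": "placeholder"},
)
print(package.is_intact())
print(package.missing_context())
```

In this sketch, a package with missing context still stores its bits safely, but `missing_context()` makes visible which of the pieces needed for future understanding have not yet been captured.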
Within the CASPAR project, selected ESA satellite data streams will be the first objects to demonstrate how the proposed preservation platform architecture can be applied to handle complex digital objects. ESA will not only provide the necessary satellite data and associated information but also the operational experience and demonstration infrastructure.
The Global Ozone Monitoring Experiment (GOME), launched onboard ERS-2 in April 1995, is set to be the first candidate. Since 1996, ESA has been delivering GOME global observations of total ozone, nitrogen dioxide and related cloud information to users via CD-ROM and the Internet.
Development and implementation
CASPAR’s work will include the development of key components and a framework providing characterization (including Representation Information and Preservation Description Information), virtual storage (using advanced storage technologies) and access services (including intuitive query and browsing mechanisms), while exploiting the potential of the semantic web throughout.
Throughout the project, attention will be paid to related issues such as standardization, authentication, accreditation and digital rights management, seen as critical for the operational implementation of digital preservation services. The achievements of the CASPAR project will be disseminated and promoted in worldwide communities interested in digital archiving and preservation.