Book project
Using the OAIS model for the medium and long-term preservation of oral/linguistic resources
This book is an outcome of team interaction, documentary research and design work associated with the creation of frameworks for the preservation and sharing of oral/linguistic resources based on the OAIS (Open Archival Information System) model. In France, a pilot project was initiated in 2008 by TGE-Adonis. It created a partnership between two submission sites (CRDO-Aix and CRDO-Paris) and two major computing centres: Centre informatique de l'enseignement supérieur (CINES) and Centre de calcul de l'Institut national de physique nucléaire et de physique des particules (IN2P3), operating in the legal framework of an agreement between the French Centre national de la recherche scientifique (CNRS), CINES and the French National Archive (SIAF). The full corpus of working papers and reports produced during the pilot project is available in a public archive (http://sldr.org/ark:/87895/1.4-187408).
Since the completion of this experimental phase (August 2011), CRDO-Aix (renamed Speech & Language Data Repository, SLDR) has been collaborating with the Centre National de Ressources Textuelles et Lexicales (CNRTL, www.cnrtl.fr) to create nodes in the CLARIN infrastructure (www.clarin.eu), combining their experience with oral and written resources.
This book will contain all the technical specifications and details of the developments undertaken during the pilot project phase. Options will be discussed with no claim to being exhaustive. Further, we hope to cover a wider range of experience thanks to contributions from research scholars and engineers who have worked on similar projects.
The following issues have been identified so far:
- A multi-tier design for generic submission sites in the Open Archival Information System (OAIS) framework.
- Documented example: the implementation of a submission site in the context of a pilot project for the preservation and sharing of oral/linguistic resources (2008-2010).
- Guidelines for supporting research projects working with medium-term and long-term preservation.
- Adding value to resources: enhanced descriptive metadata for better interoperability between resource sites.
- Legal aspects of medium-term and long-term preservation: setting up access rights management, non-commercial licences and informed consent documents in compliance with current legal obligations.
- Open questions: (a) how to promote a ‘culture’ of resource pooling in the international scientific community; (b) how to collaborate with institutions offering commercial distribution of the same resources; (c) how to protect an archival service against the hazards of funding scarcity and blind bureaucracy.
Keywords: OAIS, archive, long-term preservation, medium-term preservation, multi-tier architecture, resource pooling, access rights, heritage code, intellectual property, copyright, metadata, interoperability, linguistic resources
