Archives New Zealand’s tools for digital preservation come from a variety of sources and, where possible, are open-source and/or community-driven, proven, and best-of-breed. The mix of tools includes:
The Archway archival management and catalogue system, proven in-house for paper records
The Rosetta digital preservation system (acquired jointly with the National Library, and used globally in numerous libraries, archives, and museums)
Open-source digital preservation tools, such as DROID, JHOVE, and NLNZ Metadata Extractor.
For now, this combination of tools satisfies the minimum requirements for undertaking digital transfers. The current mix may not cater for all eventualities and will evolve over time, and Archives New Zealand hopes to contribute what it learns back to the global digital preservation community.
By describing the tools that we use, we hope to help organisations understand more about the context around digital transfer, and about what might work for them.
Archway
Archway is an archival management system developed in 2008 to enable Archives New Zealand to document government records in the context of their creation and use. Archway contains descriptions of records that have been transferred from government organisations to our four offices in Auckland, Christchurch, Dunedin and Wellington. There is also a wealth of information in Archway about the government of New Zealand from 1840 to the current day. Detailed histories of government departments, the functions they performed and the types of records they created provide an essential background for locating records, and understanding their content and purpose.
Archway is an implementation of the Australian Series System for describing archives in context. It separately documents seven core entities and the key relationships (of control and succession) that exist between them. These core entities are: record, series, authority, agency, organisation, jurisdiction, and function.
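As a rough illustration of this model, the entities and relationships could be represented as below. This is a minimal sketch only, not Archway's actual schema: the class names, the `related` helper and the example data are all hypothetical, and only the seven entity types and two relationship kinds come from the description above.

```python
from dataclasses import dataclass

# The seven entity types named in the text.
ENTITY_TYPES = {
    "record", "series", "authority", "agency",
    "organisation", "jurisdiction", "function",
}
# The two relationship kinds named in the text.
RELATIONSHIP_KINDS = {"control", "succession"}


@dataclass(frozen=True)
class Entity:
    entity_type: str
    name: str

    def __post_init__(self):
        if self.entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.entity_type}")


@dataclass(frozen=True)
class Relationship:
    kind: str
    source: Entity
    target: Entity

    def __post_init__(self):
        if self.kind not in RELATIONSHIP_KINDS:
            raise ValueError(f"unknown relationship kind: {self.kind}")


def related(relationships, entity, kind):
    """Return the targets linked from `entity` by relationships of `kind`."""
    return [r.target for r in relationships
            if r.source == entity and r.kind == kind]


# Hypothetical example data, invented for illustration only.
old_agency = Entity("agency", "Example Department (old)")
new_agency = Entity("agency", "Example Department (new)")
series = Entity("series", "Example correspondence files")

links = [
    Relationship("succession", old_agency, new_agency),
    Relationship("control", new_agency, series),
]
```

Because each entity is documented separately from its relationships, a change such as one agency succeeding another can be recorded as a new link without rewriting the descriptions of the records themselves.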
Archway is used by Archives New Zealand to provide access to the accessioned born-digital records. Whether archives are physical or digital, Archway is the gateway to public archives.
Rosetta
Rosetta is the long-term preservation system used by both Archives New Zealand (for the Government Digital Archive) and the National Library of New Zealand (for the National Digital Heritage Archive). It is designed to enable effective preservation of, and access to, digital archives and heritage collections.
With the Rosetta system, large amounts of digital data, including audio, video, and textual information, can be stored and managed, ensuring the preservation of donated or transferred digital content. Rosetta provides various tools for management of digital collections and digital content. It is based on the OAIS (Open Archival Information System) model.
The main purpose of the system is to provide long-term preservation for digital objects. Rosetta is not a content management system, nor a digital library, nor a digital asset management system. Instead, its purpose is to enable active preservation (i.e. not only bit-stream preservation, but also preservation of the intellectual content of files independently of their format); metadata extraction and storage; risk management; and the support of preservation planning and preservation actions.
One of the reasons for choosing an off-the-shelf, well-established solution was to benefit from an existing development roadmap and an existing community of users. We are able to influence the evolution of the system, and the development of solutions, through the various Rosetta user groups.
DROID, JHOVE, & NLNZ Metadata Extractor
The other tools used at Archives New Zealand are DROID, JHOVE, and the NLNZ Metadata Extractor. These are used independently of Rosetta as part of the digital transfer process, and then again inside Rosetta, where they form a gateway into the system and enable the management of digital records.
DROID is a free file format identification tool created by The National Archives (TNA) of the United Kingdom. DROID is used to tackle the first stage of the transfer, helping Archives New Zealand to understand what a public office holds.
DROID relies on PRONOM[1], a technical registry of file formats that is updated regularly. Signature files are generated from PRONOM and used by DROID for file format identification.
If a record doesn’t have an identifiable file format inside the PRONOM database, then it is a clue we might need to research it further – the agency may not know what it is, or be able to access it themselves! The result of this research will either be a more nuanced sentencing decision between Archives New Zealand and the public office, or the generation of a new record in the PRONOM database which will help us, and worldwide organisations, to identify similar records in future.
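The general idea of signature-based identification, and of flagging unidentified files for research, can be sketched as follows. This is a simplified illustration and not DROID's algorithm or the PRONOM signature file format: the three-entry signature table and the function names `identify` and `unidentified` are assumptions made for the example, and real signatures are considerably richer than a fixed sequence of leading bytes.

```python
from typing import Optional

# Illustrative signature table: (leading bytes, format label).
# A small hand-written sample, not the PRONOM signature file.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify(data: bytes) -> Optional[str]:
    """Return the label of the first signature that matches the start of data."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return None  # unidentified: a candidate for further research


def unidentified(files: dict) -> list:
    """Return the sorted names of files whose format could not be identified."""
    return sorted(name for name, data in files.items() if identify(data) is None)
```

In this sketch, anything returned by `unidentified` corresponds to the case described above: a file with no matching signature, which may need research before a sentencing decision or a new registry entry.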
JHOVE, a widely used open-source format validation tool, is used in conjunction with DROID at Archives New Zealand. JHOVE checks whether a file conforms to the specification for its format. If it does not, then there is a chance that the file can no longer be read or is broken in some way (malformed, for example). A further consequence is that metadata extraction may not complete, and the file will then require further analysis inside Rosetta.
“JHOVE only reports full conformance to a profile, that is, it focuses on the semantics of a file rather than its content: a file which is well-formed but not valid has errors.”[2]
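The distinction between well-formed and valid can be illustrated with a toy example. The sketch below is an analogy only and does not use JHOVE: it treats "parses as JSON" as well-formedness and "has the required fields" as validity, and the `REQUIRED_FIELDS` rule set and the `check` function are hypothetical.

```python
import json

# Hypothetical "specification" for a toy record format: required fields and types.
REQUIRED_FIELDS = {"title": str, "year": int}


def check(text: str) -> str:
    """Classify text as not well-formed, well-formed but not valid, or valid."""
    try:
        record = json.loads(text)  # syntax check: is it well-formed JSON?
    except json.JSONDecodeError:
        return "not well-formed"
    if not isinstance(record, dict):
        return "well-formed but not valid"
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(record.get(field), expected_type):
            return "well-formed but not valid"
    return "well-formed and valid"
```

For example, `check('{"title": "Minutes"}')` parses cleanly but lacks a `year`, so it is reported as well-formed but not valid, while truncated input such as `'{"title": '` is not well-formed at all.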
NLNZ Metadata Extractor was developed by the National Library of New Zealand. It allows the Rosetta digital preservation system to extract metadata from various file formats. An example is the extraction of Author and Description fields from word-processed documents. The tool can also extract information such as Artist and Song Name from audio files, and various metadata from web archives.
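To give a feel for what this kind of extraction involves, the sketch below reads the author and description from a .docx file using only the Python standard library. It is not the NLNZ Metadata Extractor and makes no claim about how that tool works internally: the function name `extract_core_metadata` is hypothetical, and the sketch assumes an Office Open XML package that stores its core properties in `docProps/core.xml`.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def extract_core_metadata(docx_bytes: bytes) -> dict:
    """Return the author and description stored in docProps/core.xml, if present."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    return {
        "author": root.findtext("dc:creator", default=None, namespaces=NS),
        "description": root.findtext("dc:description", default=None, namespaces=NS),
    }
```

If the bytes are not a readable ZIP package, or the core-properties part is missing, this sketch raises an exception rather than returning metadata, which loosely mirrors the situation where a damaged file fails extraction and needs further analysis.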
If a file is not well-formed or not valid, but this has not been picked up by JHOVE, then there is a chance that it will fail the NLNZ Metadata Extractor. Should this occur, the file will also require further analysis inside the Rosetta system.
The use of these tools together is quite common in the digital preservation community to identify and validate file formats for long-term preservation. While JHOVE provides more robust and more detailed metadata output than DROID, it handles a much smaller range of standards-based file formats. NLNZ Metadata Extractor handles some trickier formats, like those from Microsoft which do not have open specifications, so using each tool to assess records that are candidates for transfer offers a good comparison mechanism.
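One way to picture this comparison is as a per-file summary of what each tool reported, from which the files needing further analysis can be picked out. The sketch below is an assumption about how such a summary might look, not a description of the Rosetta workflow: the `ToolResults` fields and the `issues` function are hypothetical, and the values would have to be filled in from each tool's own reports.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToolResults:
    """Hypothetical summary of what each tool reported for one file."""
    name: str
    identified_format: Optional[str]  # from format identification, None if unknown
    well_formed: bool                 # from format validation
    valid: bool                       # from format validation
    metadata_extracted: bool          # from metadata extraction


def issues(result: ToolResults) -> List[str]:
    """List the reasons, if any, that a file needs further analysis."""
    found = []
    if result.identified_format is None:
        found.append("format not identified")
    if not result.well_formed:
        found.append("not well-formed")
    elif not result.valid:
        found.append("well-formed but not valid")
    if not result.metadata_extracted:
        found.append("metadata extraction failed")
    return found
```

A file with an empty list has passed all three checks in this sketch; any other result names the stage at which a closer look is warranted.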
While it is not impossible that a file that fails any of these tests will end up in the Government Digital Archive, a failure does create a point in time where both Archives New Zealand and the public office can reflect on how we create, and maintain, the public records for which we are responsible. It also presents an opportunity to improve the community’s knowledge in this area.
For more information, please have a look at this blog post from the Open Preservation Foundation, which looks at this very subject from a global perspective: http://openpreservation.org/blog/2016/03/13/what-is-the-point-the-motivation-for-adopting-different-tools-inside-the-digitalpreservation-workflow/
If you have any questions or comments please get in touch through rkadvice@dia.govt.nz
[1] PRONOM is an on-line information system about data file formats and their supporting software products. Originally developed to support the accession and long-term preservation of electronic records held by the National Archives, PRONOM is now being made available as a resource for anyone requiring access to this type of information.