Session 6: Tools and Demos Showcase
Ethan Gates, Yale University Library; Sally DeBauche, Stanford University; Gregory Wiedeman, University at Albany, SUNY; Brian Dietz, NC State University Libraries | BitCurator Consortium
What Would You QEMU?
Ethan Gates, Yale University Library
QEMU (Quick EMUlator) is a powerful open-source program for emulation and virtual machine management. In this lightning talk, I will discuss its potential applications in digital archive and curation workflows, including running BitCurator in a virtual machine; running Windows and/or macOS virtual machines inside BitCurator; running legacy software systems and applications for use in file assessment and migration; and serving as a delivery vehicle for access to born-digital materials on “incompatible” systems. I will illustrate this utility with a brief case study, in which I worked with another Yale colleague to assess the contents of a Windows 98-era file using a stand-alone QEMU-based virtual machine that could be shared and run on any Windows 10 machine, with no administrative access or installation necessary. Ultimately, I hope to encourage more familiarity with QEMU in the BitCurator Consortium and to draw attention to the need for more, and more accessible, documentation and for a community of practice around this tool’s use in digital preservation workflows.
Designing and Launching a Multi-Institutional Email Metadata Discovery Platform
Sally DeBauche, Stanford University
ePADD is open source software developed by Stanford Libraries, Harvard Library, and the University of Manchester Libraries to support the appraisal, processing, delivery, and discovery of email collections of historical value. This lightning talk will discuss the motivation behind and process for developing a multi-institutional version of ePADD’s Discovery Module, as well as give information about how ePADD users can contribute their collection metadata to the site.
ePADD’s new Shared Discovery Module allows institutions to publish a redacted view of their processed collections to a central discovery platform. This redacted version displays automatically extracted and user-generated metadata about collections, including collection summaries, extracted entities, correspondents, and redacted email header information. The Shared ePADD Discovery Module also allows researchers to browse and search across the metadata of email collections held at different repositories. We believe that this functionality will allow users to uncover connections between collections.
Adapting the former ePADD Discovery Module to display email collection metadata from other institutions was one of the primary goals of our Andrew W. Mellon Foundation-funded grant project, which ran from 2020 to 2021.
Using a Common Specification to Preserve Email
Gregory Wiedeman, University at Albany, SUNY
The Mailbag Project is developing a specification for packaging email in multiple formats. This lightning talk will discuss the benefits of functional specifications for digital curation and preservation and the development process for the Mailbag Specification. Much of archival work is governed by systems of varying complexity, which both empower and constrain archivists. Functional specifications give archivists and other information professionals the opportunity to have a greater voice in how these systems work. I’ll discuss the process for writing the Mailbag Specification, including what worked well and what should be improved. The discussion will focus on how voices from the project’s advisory board and the wider community informed the major decision points and helped promote simplicity in the specification, which should enable a broader variety of systems and workflows to use it at different resource levels.
Lightweight Distributions of Tools
Brian Dietz, NC State University Libraries
The BitCurator Consortium distributes a customized variant of Ubuntu that pre-packages tools useful to digital archivists for capturing and analyzing born-digital materials. This approach has proven extremely successful, and the BitCurator environment is now an established toolset for digital archivists. However, there are limitations to relying on the pre-built VM or using the image for a full or partial boot. In this lightning talk, I will discuss Homebrew (Mac and Linux) and Docker (any OS) as two possible alternative, lighter-weight methods for distributing digital archival tools, addressing the pros and cons of each approach. Ultimately, rather than competing with the BitCurator environment, these methods could be additional distributions that the BCC manages.
Ethan Gates, Yale University Library; Sally DeBauche, Stanford University; Gregory Wiedeman, University at Albany, SUNY; Brian Dietz, NC State University Libraries. (November 17, 2021). Session 6: Tools and Demos Showcase. BitCurator Consortium.