The Other BCC: Appraising and Processing Email
Cal Lee, Kam Woods | BitCurator Consortium
Once libraries, archives and museums (LAMs) have established general processes for born-digital materials, they are often confronted with challenges associated with specific file types. The BitCurator environment has long included tools for handling specific data types, including readpst for email stored in PST format. However, its facilities for processing email have been limited. We propose a four-hour interactive session on using open-source software to appraise and process email, focusing on two sets of software products from the Review, Appraisal and Triage of Mail (RATOM) project. The first set efficiently extracts data and metadata from email (mbox or OST/PST), identifies named entities and writes the results to simple database structures for future processing, including the application of machine learning. The second set supports querying, browsing and tagging email based on a variety of criteria, including record status and potential sensitivities. Participants will run all tools in a web browser, so no software installation will be required. We’ll discuss how these tasks can fit into workflows for digital materials in combination with other software, including the BitCurator environment, as well as the implications for participants’ own institutions.
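As a rough illustration of the first kind of task described above (pulling metadata out of an mbox file and writing it to a simple database structure), here is a minimal sketch that uses only Python's standard library. It is not the RATOM software and does not reflect its API: the function name `index_mbox`, the `messages` table and its columns are assumptions made for this example, and named-entity recognition and OST/PST input are left out.

```python
import mailbox
import sqlite3
from email.utils import parsedate_to_datetime


def _text(value):
    """Return a header value as a plain string, or None if it is missing."""
    return str(value) if value is not None else None


def index_mbox(mbox_path, db_path=":memory:"):
    """Copy basic header metadata from an mbox file into a SQLite table.

    Hypothetical helper for illustration only; the schema is an assumption.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY, sender TEXT, recipients TEXT, "
        "subject TEXT, sent TEXT)"
    )
    for msg in mailbox.mbox(mbox_path):
        # Normalize the Date header to ISO 8601 when it can be parsed.
        sent = None
        if msg["Date"]:
            try:
                sent = parsedate_to_datetime(str(msg["Date"])).isoformat()
            except (TypeError, ValueError):
                sent = None
        conn.execute(
            "INSERT INTO messages (sender, recipients, subject, sent) "
            "VALUES (?, ?, ?, ?)",
            (_text(msg["From"]), _text(msg["To"]), _text(msg["Subject"]), sent),
        )
    conn.commit()
    return conn
```

Once the metadata is in a table, later steps such as querying by sender or date range, or tagging messages for review, can work from the database rather than re-parsing the original mail file.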
By the end of this workshop, participants should:
– Understand the challenges and opportunities of appraising and processing email.
– Know where to find and how to install the RATOM tools.
– Be able to perform the primary tasks supported by the RATOM tool set.
– Understand how processing of email fits into digital curation workflows, in combination with other tools, including the BitCurator environment.
Cal Lee, Kam Woods. (October 16, 2020). The Other BCC: Appraising and Processing Email. BitCurator Consortium.