Rabbit's Bits Run


(Reprinted with permission from author Matt Kirschenbaum of the BitCurator Project)

"On a drizzly Cambridge morning last May, Porter Olsen and I from the BitCurator team found ourselves in a basement room with personnel from Harvard’s Houghton Library, including digital archivist Melanie Wisner and Leslie Morris, one of the manuscripts curators. The star of the show, though, was Mr. John Updike, whose born-digital remains arrived unceremoniously in a big brown box.

Updike, as much as anyone, has earned the right to the title of dean of American letters in the second half of the twentieth century. He began using a Wang word processor in 1983, the very year his biographer Adam Begley identifies as the height of his literary career. After a decade or so, he moved over to IBM PCs and the word processing software associated with them, first Lotus Ami Pro and later Microsoft Word. Updike never got rid of his typewriters, or indeed his pen, though; instead, the computers took their place as part of his workflow, instruments of composition for letters, short stories, and essays, while he continued to draft novels and poems longhand.

The Houghton acquired the Updike papers in 2009… there are also about fifty 3.5” high-density IBM-compatible diskettes, of which 38 appear to have content on them; a half dozen 5.25” floppies which are installation disks for Lotus Ami Pro; and a dozen or so CD-ROMs. There are no hard drives or complete computers, nor do any of the diskettes from the Wang era appear to have survived.

But the born-digital materials that do survive as part of the collection are part of the author’s manuscript record, and until there is a sustained scholarly investigation of them we cannot know what, if anything, they might contain in the way of drafts and other materials that could shed light on some aspect of Updike’s literary career. For me this was an opportunity to get a firsthand look at the digital life of a writer I was also researching for my book on the literary history of word processing; for the BitCurator team, it was an opportunity to work with cultural heritage materials of paramount importance.

The Houghton had prepared for us a Mac Mini with 16 GB of RAM, with VirtualBox and the BitCurator virtual machine preinstalled, as well as a USB 3.5” drive suitable for the high-density IBM-formatted diskettes we knew we would be working with. Over the course of several hours (which included discussion and instruction as well) we imaged a dozen of the disks without incident; one initially manifested bad sectors but corrected itself after a repeat of the imaging process. For each image we did a quick, initial inspection using bulk_extractor and the BitCurator reporting tools.

There was no smoking gun “LostNovel.doc”, nor had I really expected to find one. But there was a palpable sense of accomplishment in the room, as the librarians present realized that this is doable. As Porter and Melanie worked hands-on with the diskettes, conversations sparked around issues like file naming conventions, directory structures, and what to represent in a finding aid, as well as, of course, strategies for researcher access. All of this was very gratifying to see. For my part, I did a preliminary sort and arrangement based on the disks’ labels, and then manually write-protected each disk using its plastic slider mechanism. Sitting in a basement room of the Houghton and manually write-protecting John Updike’s computer disks was not something I ever anticipated doing in a scholarly career!

My prior experience with the papers taught me that Updike was frugal, and reused all manner of material, typing or writing on the backs of drafts, or even envelopes and receipts and other people’s correspondence. Certainly his digital working habits appear consistent. His practice was apparently to store multiple versions of a file on the disk, overwriting previous ones with new ones and notating the date on the disk’s label after crossing out what was written previously. For some novels, like Villages, Terrorist, and The Widows of Eastwick (sequel to the more famous Witches, which was done before his word processing days) there are multiple diskettes with relevant material; others contain several dozen shorter pieces such as stories or reviews. One is marked C:\FAMILY\JOHN and would seem to contain personal material. Of course we do not know what may or may not have happened on his hard drives. Moreover, he was in the habit of producing multiple hard copy typescripts in the course of working on a book, and then annotating and revising these by hand. For some lucky researcher, there will be an interesting challenge in triangulating between a.) the hard copy typescripts or print-outs in the physical papers, which are usually dated, b.) the dates on the labels of the diskettes, and c.) the modified, accessed and changed/created (MAC) times on the diskettes and the contents of their digital files, including deleted TMP files. Whether major insights into Updike’s creative life are thus revealed or not, this is paradigmatic of what literary textual scholarship is going to look like in the coming years.

The Houghton staff indicated that they need to think through the file management and policy issues, but they are prepared to move ahead rapidly with the imaging and wish to be responsive to future requests from researchers."
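
As a rough illustration of the triangulation described above, the sketch below reads the modified, accessed and changed/created (MAC) times for files and sorts them into a simple timeline that could be set beside dated typescripts and disk labels. It is only a minimal sketch under stated assumptions: it assumes the files are reachable as an ordinary directory (for example, a read-only mount of a disk image), the function names `mac_times` and `timeline` are hypothetical, and it is not the BitCurator workflow, which relies on forensic tools that read timestamps from the image itself. Note that copying files out of an image can alter their timestamps, and that deleted files would not be visible this way.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def mac_times(path):
    """Return modified, accessed, and changed/created times for one file (UTC)."""
    st = os.stat(path)

    def to_dt(ts):
        return datetime.fromtimestamp(ts, tz=timezone.utc)

    return {
        "modified": to_dt(st.st_mtime),
        "accessed": to_dt(st.st_atime),
        # st_ctime is the metadata-change time on Unix and the
        # creation time on Windows, hence "changed/created".
        "changed_or_created": to_dt(st.st_ctime),
    }


def timeline(root):
    """List (modified time, relative path) for every file under root, oldest first."""
    root = Path(root)
    rows = [
        (mac_times(p)["modified"], str(p.relative_to(root)))
        for p in root.rglob("*")
        if p.is_file()
    ]
    return sorted(rows)
```

Calling `timeline("mounted_image")` (a hypothetical directory name) would return the files ordered by last-modified time, which a researcher could then compare against the dates written on a diskette's label.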