Posts Tagged ‘electronic records’

22 Dec 11

Musings: SAA, DAS, and “Managing Electronic Records in Archives & Special Collections”

This afternoon I successfully completed the electronic exam for “Managing Electronic Records in Archives & Special Collections,” a workshop presented as part of SAA’s Digital Archives Specialist program. With my new certificate of continuing education in hand, I wonder how much I should/could participate in the DAS program. I have been watching the development of the program with great interest, particularly the cost, expected completion timeline, and who the experts would be. I signed up for the course and ventured up to Pasadena for a two-day workshop with Seth Shaw and Nancy Deromedi.

Erica Boudreau has a good summary of the workshop as taught by Tim Pyatt and Michael Shallcross on her blog, so I will try not to repeat too much here. Of interest to those looking to learn more about e-recs is the Bibliography and the pre-readings, which consisted of several pieces from the SAA Campus Case Studies website. We were asked to read Case 2, “Defining and Formalizing a Procedure for Archiving the Digital Version of the Schedule of Classes at the University of Michigan” by Nancy Deromedi, and Case 13, “On the Development of the University of Michigan Web Archives: Archival Principles and Strategies” by Michael Shallcross, as well as “Guarding the Guards: Archiving the Electronic Records of Hypertext Author Michael Joyce” by Catherine Stollar.

On the first day, the instructors discussed electronic “recordness,” authenticity/trust, the OAIS and PREMIS models, advocacy, and challenges, and reserved time for the participants to break into groups to discuss the three case studies. On the second day, we dove into the more practical application of e-records programs, in particular a range of workflows. One of the takeaway messages was simply to focus on doing something rather than waiting for some comprehensive solution that can handle every variety of e-record. Seth displayed a Venn diagram he revealed at SAA this year, which separates “fast,” “good,” and “cheap” into three bubbles: any two can overlap, but never all three. That is, your workflow can be cheap and good but not fast, good and fast but not cheap, et cetera.

Seth and Nancy illustrated a multi-step workflow using a checksum creator (the example used was MD5sums), Duke’s DataAccessioner for migration and checksums, along with its plugins for JHOVE and DROID, WinDirStat for visual analysis of file contents, and FTK Imager for forensics. They also discussed Archivematica for ingest and description, which still seems buggy, and web archiving using tools such as Archive-It, the CDL’s Web Archiving Service, and HTTrack. Perhaps the most significant thing I learned was about the use of digital forensics programs like FTK Imager, as well as the concept of a forensic write blocker, which essentially prevents files on a disk or USB drive from being changed during transfer. Digital forensics helps us see hidden and deleted files, which lets us provide a service to records creators (recovering what was thought lost) and create a disk image to emulate the original disk environment. Also shared: Peter Chan at Stanford put up a great demo on YouTube of how to process born-digital materials using AccessData FTK. It was helpful to see the tools I have been reading about actually demonstrated.
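The fixity-checking idea behind these tools can be sketched in a few lines of standard-library Python. This is a toy illustration of the concept, not how MD5sums or DataAccessioner actually work internally; the `verify_transfer` helper is a hypothetical name:

```python
import hashlib
from pathlib import Path

def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 checksum of a file, reading in 1 MB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_transfer(source_dir, dest_dir):
    """Compare every file in source_dir against its copy in dest_dir.

    Returns the relative paths of files that are missing from the
    destination or whose checksums differ (i.e., possible corruption
    or alteration during transfer).
    """
    mismatches = []
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    for src in source_dir.rglob("*"):
        if src.is_file():
            rel = src.relative_to(source_dir)
            dst = dest_dir / rel
            if not dst.is_file() or md5sum(src) != md5sum(dst):
                mismatches.append(rel)
    return mismatches
```

Running checksums before and after migration, and again on a schedule, is what turns “we copied the files” into “we can demonstrate the files are unchanged.”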

Our cohort briefly discussed UC Irvine’s “virtual reading room,” which is essentially a way for researchers to access born-digital content in a reading-room environment using DSpace, through a combination of an application process and a limited user-access period. Our rules of use are also posted. I have a lot of thoughts about how this may change or improve over time as we continue to receive and process born-digital papers and records: when we are doing less arrangement and better summarization/contextualization/description, how can we create a space for researchers to access material with undetermined copyright status? What will the “reading room” look like in the future?

Our digital projects specialist and I attended the workshop, and I think we found some potential services and programs that could help us with our born-digital records workflow. Above all, it was helpful to see and hear about the tools being developed and to get experienced perspectives on what has been working at Duke and Michigan. I enjoyed the review of familiar concepts as well as the demonstrations of unfamiliar tools, and could see myself enrolling in future DAS courses. The certificate program includes an option to test out of the four Foundational courses, at $35 a pop. If I choose to complete the program, it must be done within 2 years, with a comprehensive exam ($100) to be taken within 5 months of finishing the required courses. Some people are cherry-picking from the curriculum, choosing only the courses most relevant to their work. I think a DAS certification could help train and employ future digital archivists (or, in my mind, archivists in general, since we’ll all be doing this type of work) and may create a “rising tide lifts all ships” type of situation in our profession. While there is a risk of a certification craze meant for the financial gain of the organization, I was grateful to learn from experienced archivists in a structured setting. There’s something to be said for standards in education in our profession. I hope that DAS will raise the standard for (digital) archivists.

31 Aug 11

SAA Day 1: Collecting Repositories and E-Records Workshop

On Tuesday, I arrived in rainy Chicago and headed straight for the Hotel Palomar for the AIMS Project (“Born-Digital Collections: An Inter-Institutional Model for Stewardship”) workshop regarding born-digital archival material in collecting repositories. The free workshop, called “CREW: Collecting Repositories and E-Records Workshop,” brought together archivists and technologists from around the world to discuss issues related to collection development, accessioning, appraisal, arrangement and description, and discovery and access of born-digital archival materials.

The workshop program started with Glynn Edwards of Stanford and Gretchen Gueguen of UVa, who discussed collection development of born-digital records. The speakers suggested that both collection development policies and donor agreements should have clear language about born-digital material, including asking donors to contribute metadata for the electronic records in their collections. The challenge, they noted, is in collaboratively developing sound guidelines and policies to help archivists/curators make decisions about what to acquire. A group discussion followed about talking to donors about their personal digital lives and about creating a “digital will,” both of which help provide important information about an individual’s work, communication, and history of using technologies.

Kevin Glick and Mark Matienzo from Yale and Seth Shaw from Duke discussed accessioning, the process through which a repository gains control over records and gathers information that informs other functions in the archival workflow. While many of the procedures for accessioning born-digital material are the same as for analog material, the speakers distinguished accessioning the records from accessioning the media themselves (i.e., the Word document versus the floppy disk on which it is saved). Mark described his process of “re-accessioning” material through a forensic (or bit-level) disk imaging process, whereby he write-protected accessioned files to protect data from manipulation. He used FTK Imager to create a media log with unique identifiers and physical/logical characteristics of the media, followed by BagIt to create packages with high-level information about accessions. Seth discussed Duke’s DataAccessioner program, which he created as an easy way for archivists to migrate and identify data from disks. A group discussion asked: what level of control is necessary for collections containing electronic records at your institution? And what are the most common barriers to accessioning electronic records, and how do they show up? Our table agreed that barriers include staffing (skills and time); being able to read media; software AND hardware; storage limits; and a greater need for students/interns.
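A BagIt bag is, at its core, just a directory with payload files under `data/`, a `bagit.txt` declaration, and a `manifest-md5.txt` listing a checksum for every payload file. A minimal sketch of bag creation follows; this is a hand-rolled illustration of the structure, not the Library of Congress `bagit` library, and `make_bag` is a hypothetical helper name:

```python
import hashlib
import shutil
from pathlib import Path

def make_bag(source_dir, bag_dir):
    """Copy source_dir into bag_dir/data and write minimal BagIt tag files.

    Writes bagit.txt (the required bag declaration) and manifest-md5.txt
    (one 'checksum  data/path' line per payload file).
    """
    bag_dir = Path(bag_dir)
    data_dir = bag_dir / "data"
    shutil.copytree(source_dir, data_dir)

    # Required bag declaration
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n"
    )

    # Payload manifest: one checksum line per file under data/
    lines = []
    for path in sorted(data_dir.rglob("*")):
        if path.is_file():
            digest = hashlib.md5(path.read_bytes()).hexdigest()
            lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}")
    (bag_dir / "manifest-md5.txt").write_text("\n".join(lines) + "\n")
```

The appeal for accessioning is that the bag carries its own fixity information, so any later transfer or audit can re-verify the payload against the manifest.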

Simon Wilson from Hull, Peter Chan from Stanford, and Gabriela Redwine from the Harry Ransom Center at UT Austin discussed arrangement and description. They questioned whether archivists can appraise digital material without knowing the content therein, which conflicts with the high-level, minimal processing emphasized in our field in the past few years. Another major issue is volume: storage is cheap, but does that mean archivists shouldn’t appraise? It isn’t practical to describe every item, but then how will archivists know what is sensitive or restricted? Hypatia provides an easy-to-use interface that allows drag-and-drop intellectual organization of e-records, as well as the ability to add rights and permissions information. Peter Chan described a complex method using AccessData FTK in combination with TransitSolution and Oxygen to compare checksums, find duplicate records, and do a “pattern search” for sensitive terms and numbers (such as social security numbers). Gabi Redwine explored her work with a hybrid collection (analog and digital records), where she learned that applying descriptive standards should be a learning process for staff, not students or volunteers. Her finding aids for the collection included hyperlinks to electronic content, and she advocated for disk imaging. The group discussion following this session was intense! The hotbed topic was: are the professional skills of appraisal, arrangement, and description still relevant for born-digital materials? Our group agreed that appraisal and description remain important; however, we were strongly divided about whether archivists will need to contribute to arrangement of e-records. I believe that arrangement becomes less important as things become more searchable, as argued in David Weinberger’s Everything is Miscellaneous. Arrangement emerged before the digital realm as a way for archivists and librarians to contextualize and organize material based on topics/subjects; with better description, however, users can create their own ways of organizing e-records!
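The duplicate-detection and pattern-search steps can be approximated with standard-library Python. This is a rough sketch of the concepts, not the FTK/TransitSolution/Oxygen workflow itself; the function names are hypothetical, and the SSN regex is deliberately naive compared to real screening tools:

```python
import hashlib
import re
from collections import defaultdict
from pathlib import Path

# Naive pattern for U.S. Social Security numbers (NNN-NN-NNNN);
# production screening tools use far more careful heuristics.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def find_duplicates(root):
    """Group files under root by MD5 checksum; return groups with >1 file."""
    by_checksum = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            digest = hashlib.md5(path.read_bytes()).hexdigest()
            by_checksum[digest].append(path)
    return {d: ps for d, ps in by_checksum.items() if len(ps) > 1}

def flag_sensitive(root):
    """Return files under root whose text contains an SSN-like pattern."""
    flagged = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            text = path.read_text(errors="ignore")
            if SSN_PATTERN.search(text):
                flagged.append(path)
    return flagged
```

Even a crude pass like this shows why appraisal doesn’t disappear just because storage is cheap: identical bitstreams and sensitive content are machine-findable, but deciding what to do with them is still archival judgment.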

Finally, Gretchen Gueguen (UVa) and Erin O’Meara of UNC Chapel Hill discussed discovery and access. Our goals as archivists include preserving original format and order as much as possible and applying restrictions as necessary, while balancing this with our mission to make things accessible and available. Gretchen suggested Google Books’ “snippet” idea as a way to provide access without compromising privacy or restrictions on sensitive material. Her models for access to digital material include: in-person versus not; authenticated versus not; physical versus online access; and dynamic versus static. Erin described her use of Curator’s Workbench with FOXML and Solr to control access permissions and assign restrictions and roles to e-records. Another group discussion included chewy scenarios for dealing with born-digital materials; my table had to consider: “you are at a large public academic research library; director brings several CDROMs, Zip disks and floppy disks of famous (secretive) professor from campus; they are backup files created over the years; office has more paper files; professor and his laptop are missing; no one can give further details on files; write 1 page plan for preserving/describing files; working institutional repository exists.” With no donor agreement and an understanding that the faculty member was very private, we couldn’t go ahead with full access to the material.

At the end of the day, I left with a much better grasp of how I see myself as an archivist dealing with born-digital material (primarily those on optical and disk media). It seems that item-level description works best for born-digital while aggregate description works best for analog materials. Digital records are dealt with best through collaboratively-created policies and procedures for acquiring, processing, and describing them. Great stuff!

(Cross posted to ZSR Professional Development blog.)

*Update: all of the workshop presentations have been posted to the born digital archives blog.

11 Oct 09

NCLA Part 1: Politician papers and the new North Carolina Gazetteer

I am back in Winston-Salem, pleasantly surprised by my first experience with a state library conference: NCLA. I was warned that registrations were lower than ever, and while attendance was indeed low, I found that some sessions were more seminars than panels (which is always a better learning environment for me).

I attended the Government Resources Section’s session on politician papers in libraries, with Betty Carter from UNCG and Tim West of UNC Chapel Hill.

UNCG was given permission to acquire the papers of Senator Kay Hagan, and also has the papers of Congressman Howard Coble. While their collection’s strengths lie primarily with performing arts and early 20th century authors, UNCG’s University Archives and Manuscripts department also has political papers. Betty Carter mentioned two important things to consider when acquiring political papers: size and research potential. She also mentioned the usefulness of SAA’s publication entitled Managing Congressional Collections.

Tim West from the Southern Historical Collection represents a large special collections repository. He mentioned the importance of obtaining special funding for a processing archivist, which the SHC has done successfully by asking for funding from donors. Research value (through archival appraisal) for historians, journalists, community activists, undergraduates, relatives, and constituents is of utmost importance to the SHC. Mr. West mentioned the importance of collecting from individuals and groups of “exceptional impact” such as officeholders who have been influential outside of political activity, people involved in politics who did not hold public office, political journalists, and more.

During the ensuing discussion, the panelists agreed that there is a need for a statewide documentation strategy for political papers. I am concerned with the role of academic special collections departments in making political papers available to the public. Academic libraries focus on students and faculty. What role do public libraries play in this? We recently de-accessioned the papers of a local state representative and donated them to the State Archives because we felt they would be researched more frequently there. I had not thought that academic libraries with ties to political figures might also collect these types of work — what about the State Archives as a repository for government documents? Perhaps election materials and personal papers do not fall within their collection development policy? Also, what about electronic records? Neither has, so far, begun collecting born-digital resources.

Another issue that was highlighted during the panel: the majority of those participating were government documents librarians, most of whom had never dealt with manuscripts. It was interesting to watch librarians and archivists discuss archival concepts — and it made me realize how much further we have to go to understand each other and our methods in dealing with “records.”

Later that afternoon, I helped introduce Michael Hill, supervisor of the Research Division of the NC Office of Archives & History and also coordinator of the North Carolina State Highway Historical Marker Program. His presentation on editing William Powell’s North Carolina Gazetteer was engaging and amusing, exploring some of the origins of unique place names in the state (e.g., Asey Hole, Pig Basket Creek, Whynot). I am really looking forward to the book, which should come out sometime next year and will undoubtedly become another reference must-have.