Archive for the 'Digital Libraries' Category

22 Dec 11

Musings: SAA, DAS, and “Managing Electronic Records in Archives & Special Collections”

This afternoon I successfully completed the electronic exam for “Managing Electronic Records in Archives & Special Collections,” a workshop presented as part of SAA’s Digital Archives Specialist program. With my new certificate of continuing education in hand, I wonder how much I should/could participate in the DAS program. I have been watching the development of the program with great interest, particularly the cost, expected completion timeline, and who the experts would be. I signed up for the course and ventured up to Pasadena for a two-day workshop with Seth Shaw and Nancy Deromedi.

Erica Boudreau has a good summary of the workshop as taught by Tim Pyatt and Michael Shallcross on her blog, so I will try not to repeat too much here. Of interest to those looking to learn more about e-recs is the Bibliography and the pre-readings, which consisted of several pieces from the SAA Campus Case Studies website. We were asked to read Case 2, “Defining and Formalizing a Procedure for Archiving the Digital Version of the Schedule of Classes at the University of Michigan” by Nancy Deromedi, and Case 13, “On the Development of the University of Michigan Web Archives: Archival Principles and Strategies” by Michael Shallcross, as well as “Guarding the Guards: Archiving the Electronic Records of Hypertext Author Michael Joyce” by Catherine Stollar.

On the first day, the instructors discussed electronic “recordness,” authenticity/trust, the OAIS and PREMIS models, advocacy, and challenges, and reserved time for the participants to break into groups to discuss the three case studies. On the second day, we dove into more practical application of e-records programs, in particular a range of workflows. One of the takeaway messages was simply to focus on doing something, rather than waiting for some comprehensive solution that can handle every variety of e-record. Seth displayed a Venn diagram he revealed at SAA this year, which separates “fast,” “good,” and “cheap” into three bubbles; each can overlap with one other focus area, but not both. For example, your workflow can be cheap and good but not fast, or good and fast but not cheap, et cetera.

Seth and Nancy illustrated a multi-step workflow using a checksum creator (the example used was MD5sums); Duke DataAccessioner for migration and checksums, along with its plugins for JHOVE and DROID; WinDirStat for visual analysis of file contents; and FTK Imager for forensics. They also discussed Archivematica for ingest and description, which still seems buggy, and web archiving using tools such as Archive-It, the CDL’s Web Archiving Service, and HTTrack. Perhaps the most significant thing I learned was about the use of digital forensics programs like FTK Imager, as well as the concept of a forensic write blocker, which essentially prevents files on a disk/USB from being changed during transfer. Digital forensics helps us to see hidden and deleted files, which can help us provide a service to records creators (recovering what was thought lost), and to create a disk image to emulate the original disk environment. Also shared: Peter Chan at Stanford put up a great demo on YouTube of how to process born-digital materials using AccessData FTK. It was helpful to see these tools I have been reading about actually demonstrated.
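To make the checksum step in that workflow concrete, here is a minimal sketch in Python. It is not how MD5sums or DataAccessioner work internally; the function names (file_md5, build_manifest, changed_files) are my own, and the idea is simply to record a checksum for every file before a transfer and compare against a second pass afterwards.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_md5(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before, after):
    """List paths that were added, removed, or altered between two manifests."""
    return sorted(
        path
        for path in set(before) | set(after)
        if before.get(path) != after.get(path)
    )
```

Running build_manifest on the source media and again on the copied files, then passing both results to changed_files, should return an empty list if the transfer left everything intact. MD5 is used here only because it was the example in the workshop; a stronger hash such as SHA-256 drops in the same way.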

Our cohort briefly discussed UC Irvine’s “virtual reading room,” which is essentially a way for researchers to access born-digital content in a reading room environment using DSpace, through a combination of an application process and limited user access period. Our rules of use are also posted. I have a lot of thoughts in my mind about how this may change or improve over time as we continue to receive and process born-digital papers and records — when we are doing less arrangement and better summarization/contextualization/description, how can we create a space for researchers to access material with undetermined copyright status? What will the “reading room” look like in the future?

Our digital projects specialist and I attended the workshop and I think we found some potential services and programs that could help us with our born-digital records workflow. Above all, it was helpful to see and hear about the tools being developed and get experienced perspectives on what has been working at Duke and Michigan. I enjoyed the review of familiar concepts as well as demonstrations of unfamiliar tools, and could see myself enrolling in future DAS courses. The certificate program includes an option to test out of the four Foundational courses, at $35 a pop. If I choose to complete the program, it must be done within 2 years, with a comprehensive exam ($100) that must be completed within 5 months after completing the required courses. Some people are cherry-picking from the curriculum, choosing only courses that are the most relevant to their work. I think a DAS certification could help train and employ future digital archivists (or, in my mind, archivists in general — since we’ll all be doing this type of work) and may create a “rising tide lifts all ships” type of situation in our profession. While there is a risk of a certification craze meant for financial gain of the organization, I was grateful to learn from experienced archivists in a structured setting. There’s something to be said for standards in education in our profession. I hope that DAS will raise the standard for (digital) archivists.

31 Aug 11

SAA Days 4 & 5: e-records, metrics, collaboration

Friday in Chicago started with coffee with Christian Dupont from Atlas Systems, followed by Session 302: “Practical Approaches to Born-Digital Records: What Works Today.” The session was packed…standing-room only (some archivists quipped that we must have broken fire codes with the number of people sitting on the floor)! Chris Prom from U Illinois, Urbana-Champaign, moderated the excellent panel on practical solutions to dealing with born-digital archival collections. Suzanne Belovari of Tufts referred to the AIMS project (which sponsored the workshop I attended on Tuesday) and the Personal Archives in Digital Media (paradigm) project, which offers an excellent “Workbook on digital private papers” and “Guidelines for creators of personal archives.” She also referenced the research of Catherine Marshall of the Center for the Study of Digital Libraries at Texas A&M, who has posted her research and papers regarding personal digital archives on her website. All of the speakers referred to Chris Prom’s Practical E-Records blog, which includes lots of guidelines and tools for archivists to deal with born digital material.

Ben Goldman of U Wyoming, who wrote an excellent piece in RB&M entitled “Bridging the Gap: Taking Practical Steps Toward Managing Born-Digital Collections in Manuscript Repositories,” talked about basic steps for dealing with electronic records, including network storage, virus checking, format information, generating checksums, and capturing descriptive metadata. He uses Enterprise Checker for virus checking, Duke DataAccessioner to generate checksums, and a Word doc or spreadsheet to track actions taken for individual files. Melissa Salrin of U Illinois, Urbana-Champaign spoke about her use of a program called Firefly to detect social security numbers in files, TreeSize Pro to identify file types, and a process through which she ensures that the files are read-only when moved. She urged the audience to remember to document every step of the transfer process, and that “people use and create files electronically as inefficiently as analog.” Laura Carroll, formerly of Emory, talked about the famous Salman Rushdie digital archives, noting that donor restrictions are what helped shape their workflow for dealing with Rushdie’s born digital material. The material is now available on a secure Fedora repository. Seth Shaw from Duke spoke about DataAccessioner (see previous posts) but mostly spoke eloquently in what promises to be an historic speech about the need to “do something, even if it isn’t perfect.”
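I do not know how Firefly detects social security numbers internally, but the general idea of scanning files for that pattern can be sketched in a few lines of Python. Everything below (SSN_PATTERN, find_ssn_candidates, scan_tree) is a hypothetical illustration, and it only catches the common 123-45-6789 layout in files that can be read as text.

```python
import re
from pathlib import Path

# Three digits, two digits, four digits, separated by hyphens,
# and not embedded in a longer run of digits.
SSN_PATTERN = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")


def find_ssn_candidates(text):
    """Return every substring of text that looks like an SSN."""
    return SSN_PATTERN.findall(text)


def scan_tree(root):
    """Map each file under root to its number of SSN-like matches (files with none are omitted)."""
    hits = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # Undecodable bytes are skipped, so binary formats are only partly covered.
            text = path.read_text(encoding="utf-8", errors="ignore")
            matches = find_ssn_candidates(text)
            if matches:
                hits[path.as_posix()] = len(matches)
    return hits
```

A match is only a candidate for human review: the pattern will also flag other numbers that happen to share the format, and it will miss SSNs written without hyphens.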

After lunch, I attended Session 410: “The Archivists’ Toolkit: Innovative Uses and Collaborations.” The session highlighted interesting collaborations and experiments with AT, and the most interesting was by Adrianna Del Collo of the Met, who found a way to convert folder-level inventories into XML for import into AT. Following the session, I was invited last-minute to a meeting of the “Processing Metrics Collaborative,” led by Emily Novak Gustainis of Harvard. The small group included two brief presentations by Emily Walters of NC State and Adrienne Pruitt of the Free Library of Philadelphia, both of whom have experimented with Gustainis’ Processing Metrics Database, which is an exciting tool to help archivists track statistical information about archival processing timing and costs. Walters also mentioned NC State’s new tool called Steady, which allows archivists to take container list spreadsheets and easily convert them into XML stub documents for easy import into AT. Walters used the PMD for tracking supply cost and time tracking, while Pruitt used the database to help with grant applications. Everyone noted that metrics should be used to compare collections, processing levels, and collection needs, taking special care to note that metrics should NOT be used to compare people. The average processing rate at NC State for their architectural material was 4 linear feet per hour, while it was 2 linear feet per hour for folder lists at Princeton (as noted by meeting participant Christie Petersen).
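The spreadsheet-to-XML idea behind tools like Steady can be sketched roughly as follows. This is not Steady's code or Adrianna Del Collo's method; the column names (box, folder, title, date) and the function name are assumptions of mine, and the output is only a bare EAD-style dsc fragment, not a complete finding aid ready for AT.

```python
import csv
import io
import xml.etree.ElementTree as ET


def container_list_to_stub(csv_text):
    """Turn rows of box, folder, title, date into a <dsc> of file-level components."""
    dsc = ET.Element("dsc")
    for row in csv.DictReader(io.StringIO(csv_text)):
        component = ET.SubElement(dsc, "c01", level="file")
        did = ET.SubElement(component, "did")
        ET.SubElement(did, "container", type="box").text = row["box"].strip()
        ET.SubElement(did, "container", type="folder").text = row["folder"].strip()
        ET.SubElement(did, "unittitle").text = row["title"].strip()
        # The date column is optional; skip the element when the cell is empty.
        date = (row.get("date") or "").strip()
        if date:
            ET.SubElement(did, "unitdate").text = date
    return ET.tostring(dsc, encoding="unicode")


# Hypothetical two-row container list exported from a spreadsheet.
SAMPLE = "box,folder,title,date\n1,1,Correspondence,1950-1955\n1,2,Site plans,\n"
stub_xml = container_list_to_stub(SAMPLE)
```

Each spreadsheet row becomes one c01 component with its box and folder containers and a title, which is the repetitive part of encoding that nobody wants to type by hand.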

On Saturday morning I woke up early to prepare for my session, Session 503: “Exposing Hidden Collections Through Consortia and Collaboration.” I was honored and proud to chair the session with distinguished speakers Holly Mengel of the Philadelphia Area Consortium of Special Collections Libraries, Nick Graham of the North Carolina Digital Heritage Center, and Sherri Berger of the California Digital Library. The panelists defined and explored the exposure of hidden collections, from local/practical projects to regional/service-based projects. Each spoke about levels of “hidden-ness,” and the decisionmaking process of choosing partners and service recipients. It was a joy to listen to and facilitate presentations by archivists with such inspirational projects.

After my session, I attended Session 605: “Acquiring Organizational Records in a Social Media World: Documentation Strategies in the Facebook Era.” The focus on documenting student groups is very appealing, since documenting student life is one of the greatest challenges for university archivists. Most of the speakers recommended web archiving for Twitter and Facebook, which was not a new idea to me. However, Jackie Esposito of Penn State suggested a new strategy for documenting student organizations, which focuses on capture/recapture of social media sites and direct conversations with student groups, including the requirement that every group have a student archivist or historian. Jackie taught an “Archives 101” class to these students during the week after 7 pm early in the fall, and made sure to follow up with student groups before graduation.

After lunch, I went to Session 702: “Return on Investment: Metadata, Metrics, and Management.” All I can say about the session is…wow. Joyce Chapman of TRLN (formerly an NC State Library Fellow) spoke about her research into ROI (return on investment) for manual metadata enhancement and a project to understand researcher expectations of finding aids. The first project addressed the challenge of measuring value in a nonprofit (which cannot measure value via sales like for-profit organizations) through A/B testing of enhancements made to photographic metadata by cataloging staff. Her testing found that page views for enhanced metadata records were quadruple those of unenhanced records, a staggering statistic. Web analytics found that 28% of search strings for their photographs included names, which were only added to enhanced records. In terms of cataloger time, their goal was 5 minutes per image but the average was 7 minutes of metadata work per image. Her project documentation is available online. Her other study examined discovery success within finding aids by academic researchers using behavior, perception, and rank information. In order from most to least useful for researchers, the elements were: collection inventory, abstract, subjects, scope and contents, and biography/history. The abstract was looked at first in 60% of user tests. Users did not know the difference between abstract and scope and contents notes; in fact, 64% of users did not even read the scope at all after reading the abstract! Researchers explained that their reason for ignoring the biography/history note was a lack of trust in the information, since biographies/histories do not tend to include footnotes and the notes are impossible to cite.

Emily Novak Gustainis from Harvard talked about her processing metrics database, as mentioned in the paragraph about the “Processing Metrics Collaborative” session. Her reasoning behind metrics was simple: it is hard to change something until you know what you are doing. Her database tracks 38 aspects of archival processing, including timing and processing levels. She repeated that you cannot compare people, only collections; however, an employee report showed that a permanent processing archivist was spending only 20% of his time processing, so her team was able to use this information to rebalance staff responsibilities.

Adrian Turner from the California Digital Library talked about the Uncovering California Environmental Collections (UCEC) project, a CLIR-funded grant project to help process environmental collections across the state. While metrics were not built into the project, the group thought that they would have been beneficial. In another project, the UC Next Generation Technical Services initiative found 71,000 feet in backlogs, and developed tactics for collection-level records in EAD and Archivists’ Toolkit using minimal processing techniques. Through info gathering in a Google doc spreadsheet, they found no discernible difference between date ranges, personal papers, and record groups processed through their project. They found processing rates of 1 linear foot per hour for series level arrangement and description and 4-6 linear feet per hour for folder level arrangement and description. He recommended formally incorporating metrics into project plans and creating a shared methodology for processing levels.
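To get a feel for what rates like these mean in practice, here is a small back-of-the-envelope sketch. Applying the reported rates to the entire 71,000-foot backlog is my own illustration, not something presented in the session, and the function name is hypothetical.

```python
def hours_to_process(linear_feet, feet_per_hour):
    """Estimate staff hours needed to process a backlog at a steady rate."""
    if feet_per_hour <= 0:
        raise ValueError("feet_per_hour must be positive")
    return linear_feet / feet_per_hour


BACKLOG_FEET = 71_000  # backlog figure reported in the session

# Rates reported in the session, in linear feet per hour.
REPORTED_RATES = (1, 4, 6)

estimates = {
    rate: hours_to_process(BACKLOG_FEET, rate) for rate in REPORTED_RATES
}

for rate, hours in estimates.items():
    print(f"{rate} linear ft/hour: about {hours:,.0f} staff hours")
```

Even a rough estimate like this shows why a shared methodology for processing levels matters: the gap between the slowest and fastest reported rates changes the total by a factor of six.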

I had to head out for Midway before Q&A started to get on the train in time for my return flight, which thankfully wasn’t canceled because of Hurricane Irene. As the train passed through Chicago, I found myself thinking about the energizing and inspiring projects, tools, and theory that come from attending SAA…and how much I look forward to SAA 2012.

(Cross posted to ZSR Professional Development blog.)

31 Aug 11

SAA Days 2 & 3: assessment, copyright, conversation

I started Wednesday with a birthday breakfast with a friend from college, then lunch with a former mentor, followed by roundtable meetings. I focused on the Archivists’ Toolkit / Archon Roundtable meeting, which is always a big draw for archivists interested in new developments with the software programs. Perhaps the biggest news came from Merrilee Proffitt of OCLC, who announced that the ArchiveGrid discovery interface for finding aids has been updated and will be freely available (no longer subscription based) for users seeking archival collections online. A demo of the updated interface, to be released soon, was available in the Exhibit Hall. In addition, Jennifer Waxman and Nathan Stevens described their digital object workflow plug-in for Archivists’ Toolkit to help archivists avoid cut-and-paste of digital object information. Their plug-in is available online and allows archivists to map persistent identifiers to files in digital repositories, auto-create digital object handles, create tab-delimited work orders, and create a workflow from the rapid entry dropdown in AT.

On Thursday, I attended Session 109: “Engaged! Innovative Engagement and Outreach and Its Assessment.” The session was based on responses to the 2010 ARL survey on special collections (SPEC Kit 317), which found that 90% of special collections librarians are doing ongoing events, instruction sessions, and exhibits. The speakers were interested in how to assess the success of these efforts. Genya O’Meara from NC State cited Michelle McCoy’s article entitled “The Manuscript as Question: Teaching Primary Sources in the Archives — The China Missions Project,” published in C&RL in 2010, suggesting that we have a need for standard metrics for assessment of our outreach work as archivists. Steve MacLeod of UC Irvine explored his work with the Humanities Core Course program, which teaches writing skills in 3 quarters, and how he helped design course sessions with faculty to smoothly incorporate archives instruction into humanities instruction. Basic learning outcomes included the ability to answer two questions: what is a primary source? and what is the difference between a first and primary source? He also created a LibGuide for the course and helped subject specialist reference/instruction librarians add primary source resources into their LibGuides. There were over 45 sections, whereby he and his colleagues taught over 1000 students. He suggested that the learning outcomes can help us know when our students “get it.” Florence Turcotte from UF discussed an archives internship program where students got course credit at UF for writing biographical notes and doing basic archival processing. I stepped out of the session in time to catch the riveting tail-end of Session 105: “Pay It Forward: Interns, Volunteers, and the Development of New Archivists and the Archives Profession,” just as Lance Stuchell from the Henry Ford started speaking about the ethics of unpaid intern work. He suggested that paid work is a moral and dignity issue and that unpaid work is not equal to professional work without pay.

After lunch, I headed over to Session 204: “Rights, Risk, and Reality: Beyond ‘Undue Diligence’ in Rights Analysis for Digitization.” I took away a few important points, including “be respectful, not afraid,” and that archivists should form communities of practice where we persuade lawyers through peer practice, such as the TRLN guidelines and the freshly endorsed SAA standard, the “Well-intentioned practice” document. The speakers called for risk assessment over strict compliance, as well as encouraging the fair use defense and maintaining a liberal take-down policy for any challenges to unpublished material placed online. Perhaps most importantly, Merrilee Proffitt reminded us that no special collections library has been successfully sued for copyright infringement for posting unpublished archival material online for educational use. After looking around the Exhibit Hall, I met a former mentor for dinner and went to the UCLA MLIS alumni party, where I was inspired by colleagues and faculty to list some presentation ideas on a napkin. Ideas for next year (theme: crossing boundaries/borders) included US/Mexico archivist relations; water rights such as the Hoover Dam, Rio Grande, Mulholland, etc.; community based archives (my area of interest); and repatriation of Native American material. Lots of great ideas floated around…

(Cross posted at ZSR Professional Development blog.)

31 Aug 11

SAA Day 1: Collecting Repositories and E-Records Workshop

On Tuesday, I arrived in rainy Chicago and headed straight for the Hotel Palomar for the AIMS Project (“Born-Digital Collections: An Inter-Institutional Model for Stewardship”) workshop regarding born-digital archival material in collecting repositories. The free workshop, called “CREW: Collecting Repositories and E-Records Workshop,” included archivists and technologists from around the world to discuss issues related to collection development, accessioning, appraisal, arrangement and description, and discovery and access of born-digital archival materials.

The workshop program started with Glynn Edwards of Stanford and Gretchen Gueguen of UVa, who discussed collection development of born-digital records. The speakers suggested that both collection development policies and donor agreements should have clear language about born-digital material, including asking donors to contribute metadata for the electronic records in their collections. The challenge, they noted, is in collaboratively developing sound guidelines and policies to help archivists/curators make decisions about what to acquire. A group discussion covered talking to donors about their personal digital lives and creating a “digital will,” both of which help provide important information about an individual’s work, communication, and history of using technologies.

Kevin Glick and Mark Matienzo from Yale and Seth Shaw from Duke discussed accessioning, the process through which a repository gains control over records and gathers information that informs other functions in the archival workflow. While many of the procedures for accessioning born-digital material are the same as for analog material, the speakers distinguished accessioning the records from accessioning the media themselves (i.e., the Word document versus the floppy disk on which it is saved). Mark described his process of “re-accessioning” material through a forensic (or bit-level) disk imaging process, whereby he write-protected accessioned files to protect data from manipulation. He used FTK Imager to create a media log with unique identifiers and physical/logical characteristics of the media, followed by BagIt to create packages with high level info about accessions. Seth discussed Duke’s DataAccessioner program, which he created as an easy way for archivists to migrate and identify data from disks. A group discussion asked: what level of control is necessary for collections containing electronic records at your institution? and, what are the most common barriers to accessioning electronic records, and how would they show up? Our table agreed that barriers include staffing (skills and time); being able to read media; software AND hardware; storage limits; and greater need for students/interns.
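For anyone who has not seen a BagIt package, the basic layout is simple enough to sketch by hand in Python: a bagit.txt declaration, the payload under data/, and a manifest of checksums. This is only a minimal illustration with a function name of my own (make_minimal_bag), not Mark's workflow; it leaves out the optional bag-info.txt and tag manifests, and for real work a maintained BagIt library would be the safer choice.

```python
import hashlib
import shutil
from pathlib import Path


def make_minimal_bag(source_dir, bag_dir):
    """Copy source_dir into bag_dir/data and write bagit.txt plus a SHA-256 manifest.

    bag_dir must not already exist and must not sit inside source_dir.
    """
    source_dir, bag_dir = Path(source_dir), Path(bag_dir)
    data_dir = bag_dir / "data"
    shutil.copytree(source_dir, data_dir)  # creates bag_dir and data/ as needed

    lines = []
    for path in sorted(data_dir.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            # Manifest entries pair a checksum with a path relative to the bag root.
            lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}")

    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(lines) + "\n", encoding="utf-8"
    )
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    return bag_dir
```

The appeal for accessioning is that the package carries its own fixity information, so whoever receives the bag can recompute the checksums and confirm nothing changed in transit.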

Simon Wilson from Hull, Peter Chan from Stanford, and Gabriela Redwine from the Harry Ransom Center at UT Austin discussed arrangement and description. They questioned whether archivists can appraise digital material without knowing the content therein, which conflicts with the high-level, minimal processing emphasized in our field in the past few years. Another major issue is with volume: space is cheap, but does that mean archivists shouldn’t appraise? It isn’t practical to describe every item, but how will archivists know what is sensitive or restricted? Hypatia provides an easy-to-use interface that allows drag-and-drop for easy intellectual organization of e-records, as well as the ability to add rights and permissions information. Peter Chan described a complex method for using AccessData FTK in combination with TransitSolution and Oxygen to compare checksums, find duplicate records, and do a “pattern search” for sensitive terms and numbers (such as social security numbers). Gabi Redwine explored her work with a hybrid collection (analog and digital records) where she learned that descriptive standards should be a learning process for staff, not students or volunteers. Her finding aids for the collection included hyperlinks to electronic content and she advocated for disk imaging. The group discussion following this session was intense! The hot-button topic was: are professional skills of appraisal, arrangement, description still relevant for born digital materials? Our group agreed that appraisal and description remain important; however, we were strongly divided about whether archivists will need to contribute to arrangement of e-records. I believe that arrangement becomes less important as things become more searchable, as argued in David Weinberger’s Everything is Miscellaneous. Arrangement emerged before the digital realm as a way for archivists and librarians to contextualize and organize material based on topics/subjects; however, with better description, users can create their own ways of organizing e-records!
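The duplicate-finding step Peter Chan described relies on a simple principle that is easy to show outside of FTK: files with identical content produce identical checksums. The sketch below is my own illustration in Python (the function name find_duplicates is hypothetical) and says nothing about how FTK itself is implemented.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group files under root by SHA-256 digest and return groups with more than one file."""
    root = Path(root)
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Reads each file fully into memory; fine for a sketch, not for huge disk images.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path.relative_to(root).as_posix())
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Each returned group is a set of byte-for-byte identical files, regardless of file name or folder, which is exactly the kind of redundancy that piles up in years of backup disks.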

Finally, Gretchen Gueguen (UVa) and Erin O’Meara of UNC Chapel Hill discussed discovery and access. Our goals as archivists include preserving original format and order as much as possible and applying restrictions as necessary, while balancing this with our mission to make things accessible and available. Gretchen suggested Google Books’ “snippet” view as a way to provide access without compromising privacy or restrictions on sensitive material. Her models for access for digital material include: in-person versus not; authenticated versus not; physical versus online access; and dynamic versus static. Erin described her use of Curator’s Workbench within FOXML and Solr to control access permissions and assign restrictions and roles to e-records. Another group discussion included chewy scenarios for dealing with born-digital materials; my table had to consider: “you are at a large public academic research library; director brings several CDROMs, Zip disks and floppy disks of famous (secretive) professor from campus; they are backup files created over the years; office has more paper files; professor and his laptop are missing; no one can give further details on files; write 1 page plan for preserving/describing files; working institutional repository exists.” With no donor agreement and an understanding that the faculty member was very private, we couldn’t go ahead with full access of the material.

At the end of the day, I left with a much better grasp of how I see myself as an archivist dealing with born-digital material (primarily those on optical and disk media). It seems that item-level description works best for born-digital while aggregate description works best for analog materials. Digital records are dealt with best through collaboratively-created policies and procedures for acquiring, processing, and describing them. Great stuff!

Here is the suggested reading list to help participants prepare for the course:

(Cross posted to ZSR Professional Development blog.)

*Update: all of the workshop presentations have been posted to the born digital archives blog.

15 Jun 11

Teaching digitization for C2C

Most of this post is duplicated on the Professional Development blog at my institution.
I recently volunteered to help teach a workshop entitled “Preparing for a Digitization Project” through NC Connecting to Collections (C2C), an LSTA-funded grant project administered by the North Carolina Department of Cultural Resources. This came about as part of an informal group of archivists, special collections librarians, and digital projects librarians interested in the future of NC ECHO and its efforts to educate staff and volunteers in the cultural heritage institutions across the state about digitization. The group is loosely connected through the now-defunct North Carolina Digital Collections Collaboratory.

Late last year, Nick Graham of the North Carolina Digital Heritage Center was contacted by LeRae Umfleet of NC C2C about teaching a few regional workshops about planning digitization projects. The workshops were created as a way to teach smaller archives, libraries, and museums about planning, implementing, and sustaining digitization efforts. I volunteered to help with the workshops, which were held in January 2011 in Hickory as well as this past Monday in Wilson.

The workshops were promoted through multiple listservs and were open to staff, board members, and volunteers across the state. Each workshop cost $10 and included lunch for participants. Many of the participants reminded me of the folks at the workshops for Preserving Forsyth’s Past. The crowd was enthusiastic and curious, asking lots of questions and taking notes. Nick Graham and Maggie Dickson covered project preparation, metadata, and the NC Digital Heritage Center (and how to get involved); I discussed the project process and digital production as well as free resources for digital publishing; and Lisa Gregory from the State Archives discussed metadata and digital preservation.

I must confess that the information was so helpful, I found myself taking notes! When Nick stepped up to describe the efforts of the Digital Heritage Center, which at this time is digitizing and hosting materials from across the state at no cost, I learned that they will be seeking nominations for North Carolina historical newspapers to digitize in the near future, and that they are also interested in accepting digitized video formats. Lisa also introduced the group to NC PMDO, Preservation Metadata for Digital Objects, which includes a free preservation metadata tool. It is always a joy to help educate repositories across the state in digitization standards and processes!

05 Apr 11

Society of NC Archivists meeting: Morehead City

Most of this post is duplicated on the Professional Development blog at my institution.

While many of my colleagues were in Philadelphia for ACRL, I traveled east to the coast of North Carolina for the joint conference of the Society of North Carolina Archivists and the South Carolina Archival Association in Morehead City.

After arriving on Wednesday around dinnertime with my carpooling partner Katie (Archivist and Special Collections Librarian at Elon), we met up with Gretchen (Digital Initiatives Librarian at ECU) for dinner at a seaside restaurant and discussion about digital projects and, of course, seafood.

On Thursday, the conference kicked off with an opening plenary from two unique scholars: David Moore of the NC Maritime Museum talked about artist renditions of Blackbeard, Stede Bonnet, and other pirates, as well as archival research that helped contextualize these works; Ralph Wilbanks of the National Underwater and Marine Agency detailed his team’s discovery of the H.L. Hunley submarine, including the Civil War-era men trapped inside.

Session 1 on Thursday, succinctly titled “Digital Initiatives,” highlighted important work being done at the Avery Center for African American Research at the College of Charleston, UNC Charlotte, and ECU. Amanda Ross and Jessica Farrell from the College of Charleston described the challenges and successes of digitization of material culture, namely slave artifacts and African artwork in their collections. Of primary importance was the maintenance of color and shape fidelity of 3-D objects, which they dealt with economically with 2 fluorescent lights with clamps, a Nikon D80 with an 18-200 mm lens by Quantaray (although they recommend a macro lens), a tripod, and a $50 roll of heavy white paper. Their makeshift lab and Dublin Core metadata project resulted in the Avery Artifact Collection within the Lowcountry Digital Library. Kristy Dixon and Katie McCormick from UNC Charlotte spoke carefully about the need for strategic thinking and collaboration at a broad level for special collections and archives today, in particular creating partnerships with systems staff and technical services staff. They noted that with the reorganization of their library, 6 technical services librarians/staff were added to their department of special collections!

Finally, Mark Custer and Jennifer Joyner from ECU explored the future of archival description with a discussion about ECU’s implementation of EAC-CPF, essentially authority records for creators of archival materials. Mark found inspiration from SNAC, the Social Networks and Archival Context Project (a project of UVa and the California Digital Library) to incorporate and create names for their archival collections. Mark used Google Refine’s cluster and edit feature to pull all their EAD files into one file, grabbed URLs through VIAF and WorldCat Identities, and hopes to share their authority records with SNAC. Mark clarified the project, saying:

Firstly, we are not partnered with anyone involved in the excellent SNAC project. Instead, we decided to undertake a smaller, SNAC-like project here at ECU (i.e., we mined our EAD data in order to create EAC records). To accomplish this, I wrote an XSLT stylesheet to extract and clean up our local data. Only after working through that step did we then import this data into Google Refine. With Refine, we did a number of things, but the two things discussed in our presentation were: 1) cluster and edit our names with the well-established, advanced algorithms provided in that product 2) grab more data from databases like WorldCat Identities and VIAF without doing any extra scripting work outside of Google Refine.

Secondly, we haven’t enhanced our finding aid interface at all at this point. In fact, we’ve only put in a few weeks’ worth of work into the project so far, so none of our work is represented online yet. The HTML views of the Frances Renfrow Doak EAC record that we demonstrated were created by an XSLT stylesheet authored by Brian Tingle at the California Digital Library. He has graciously provided some of the tools that the SNAC project is using online at: https://bitbucket.org/btingle/cpf2html/.

Lastly, these authority records have stayed with us; mostly because, at this point, they’re unfinished (e.g., we still need to finish that clustering step within Refine, which requires a bit of extra work). But the ultimate goal, of course, is to share this data as widely as possible. Toward that end, I tend to think that we also need to be curating this data as collaboratively as possible.
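For readers curious about what "cluster and edit" does with names, here is a minimal Python sketch of key-collision clustering in the spirit of Refine's fingerprint method. It is not ECU's workflow or Refine's own code: the function names are hypothetical, and apart from Frances Renfrow Doak (mentioned above) the name variants are invented for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Reduce a name to a normalized key so that variant forms collide."""
    # Fold accented characters to plain ASCII where possible.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase, trim, and drop punctuation.
    cleaned = re.sub(r"[^\w\s]", "", ascii_name.strip().lower())
    # Sort the unique tokens so word order no longer matters.
    return " ".join(sorted(set(cleaned.split())))


def cluster(names):
    """Group names that share a fingerprint; return only groups with variants."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return [group for group in groups.values() if len(set(group)) > 1]


if __name__ == "__main__":
    # Hypothetical variants of a single creator name, plus one unrelated name.
    sample = [
        "Doak, Frances Renfrow",
        "Frances Renfrow Doak",
        "doak frances renfrow.",
        "Smith, John",
    ]
    for group in cluster(sample):
        print(group)
```

A person still has to review each group before merging, since two different people can share the same set of name tokens.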

The final session of the day was the SNCA Business Meeting, where I gave my report as the Archives Week Chair. That evening, a reception was held to celebrate the award winners for SNCA and give conference attendees the opportunity to participate in a behind-the-scenes tour of the NC Maritime Museum. Lots of fun ensued during the pirate-themed tours and I almost had enough energy to go to karaoke with some other young archivists.

On Friday, I moderated the session entitled “Statewide Digital Library Projects,” with speakers Nick Graham from the NC Digital Heritage Center and Kate Boyd from the SC Digital Library. The session highlighted interesting parallels and differences between the two statewide initiatives. Kate Boyd explained that the SCDL is a multisite project nested in multiple universities with distributed “buckets” for description and digitization. Their project uses a multi-host version of CONTENTdm, with some projects hosted and branded specifically to certain regions and institutions. Users can browse by county, institution, and date, and the site includes teacher-created lesson plans. The “About” section includes scanning and metadata guidelines; Kate mentioned that the update to CONTENTdm 6 would help with zoom and expand/reduce views of their digital objects. Nick Graham gave a brief background on the formation of the NCDHC, including NC ECHO and its survey and digitization guidelines. He explained that the NCDHC has minimal selection criteria: materials simply need a title and must raise no copyright/privacy concerns. The NCDHC displays its digital objects through one instance of CONTENTdm. Both programs are supported by a mix of institutional and government funding/support, and both speakers emphasized the value of word-of-mouth marketing and shared branding for better collaborative efforts.

Later that morning, I attended a session regarding “Collaboration in Records Management.” Jennifer Neal of the Catholic Diocese of Charleston Archives gave an interesting presentation about the creation of a records management policy for her institution. Among the many reasons to begin an RM program, Jennifer noted that the legal reasons were likely the most important, both federal and state (and in her case, organizational rules). She recommended a pilot RM program with an enthusiastic department, as well as a friendly department liaison with organizational tendencies. Jennifer came up with “RM Fridays” as a pre-determined method for making time to sort, shred, organize, and inventory the materials for her pilot department. Her metrics were stunning: 135 record cartons were destroyed and 245 were organized and sent off site. Kelly Eubank from the NC State Archives explained how the state archives uses Archive-It to harvest social media sites and websites of government agencies and officials. She then explored, briefly, their use of BagIt to validate geospatial (GIS) files as part of their GeoMAPP project.
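As a rough illustration of what BagIt validation involves, a bag carries a manifest file (for example `manifest-md5.txt`) listing a checksum and relative path for each payload file, and validation means recomputing those checksums. The Python sketch below covers only that fixity check, with a hypothetical function name; it is not the State Archives' workflow or an official BagIt tool, and it skips other parts of the specification such as `bagit.txt` and tag manifests.

```python
import hashlib
from pathlib import Path


def verify_manifest(bag_dir, algorithm="md5"):
    """Return the payload paths whose checksums are missing or do not match."""
    bag = Path(bag_dir)
    manifest = bag / f"manifest-{algorithm}.txt"
    failures = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each manifest line is "<checksum> <relative path>".
        expected, rel_path = line.split(maxsplit=1)
        rel_path = rel_path.strip()
        payload = bag / rel_path
        if not payload.is_file():
            failures.append(rel_path)
            continue
        actual = hashlib.new(algorithm, payload.read_bytes()).hexdigest()
        if actual != expected.lower():
            failures.append(rel_path)
    return failures
```

An empty list means every listed file still matches its recorded checksum; any path returned points to a file that is missing or has changed since the bag was created.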

It was great to meet and network with archival professionals from both Carolinas and learn about some of the innovative and creative projects happening in their institutions. Right now I am thinking about EAC, collaboration with tech services, CONTENTdm, and records management.

09
Jun
10

The NC Digital Heritage Center is (Finally) Here: Reflections

This morning, Nick Graham sent out a message to the North Carolina Library Association announcing DigitalNC.org, the new digital repository for primary resources across the state digitized at UNC Chapel Hill.  Nick, formerly of NC Maps, is the newly-appointed coordinator for the North Carolina Digital Heritage Center, a development which I have followed closely here at Touchable Archives. The focus of the NC Digital Heritage Center and its matching website, according to the site:

“The North Carolina Digital Heritage Center is a statewide digitization and digital publishing program housed in the North Carolina Collection at the University of North Carolina at Chapel Hill. The Digital Heritage Center works with cultural heritage institutions across North Carolina to digitize and publish historic materials online. Through its free or low-cost digitization and online hosting services, the Digital Heritage Center provides libraries, archives, museums, historic sites, and other cultural heritage institutions with the opportunity to publicize and share their rare and unique collections online. The Center operates in conjunction with the State Library of North Carolina’s NC ECHO (North Carolina Exploring Cultural Heritage Online) project. It is supported by the State Library of North Carolina with funds from the Institute of Museum and Library Services under the provisions of the Library Services and Technology Act.”

Some of you who are familiar with North Carolina may wonder, “what happened to NC ECHO?” Based on discussions with colleagues across the state, it looks as though NC ECHO no longer exists in its original form*. (*Since I am relatively new to the state as a librarchivist, I am still unclear about the original purpose of the NC ECHO Project. Two of the largest deliverables from NC ECHO were its survey and institutional directory and its LSTA digitization grant funding program.) The preservation and emergency response focus of NC ECHO has become NC Connecting to Collections and NC SHRAB’s Traveling Archivist program, as well as possible regional emergency response networks like MACREN. The digitization planning and project funding aspect of NC ECHO appears to have joined with UNC Chapel Hill to form the NC Digital Heritage Center.

In previous posts, I have been excited about this Digital Heritage Center being North Carolina’s version of the California Digital Library’s Calisphere. I originally thought that the CDL was a statewide initiative of the California State Library, but recently realized that it is, like the NCDHC, an initiative of a university system: the University of California. This is what the digital collections portal of the California State Library looks like; this is what the State Library of North Carolina’s digital repository looks like. Why do the statewide library and archives systems for these states have such limited digital resources, while academic libraries in these states carry digital collections technology and access into the future? Wouldn’t it make more sense for the state library to be the digital repository, instead of providing funding for it?

The obvious answer is that the state library does not have the technological resources or expertise to make this happen. Academic libraries and archives are research-oriented, so they are able to do more experimentation and use the knowledge of systems librarians and programmers to create new and innovative resources. Perhaps most importantly, the state library supports academic libraries that make these resources accessible, which is possibly the only reason I am willing to overlook the potential conflict of interest of having UNC and the state library so closely intertwined.

The NC Digital Heritage Center arrives at an exciting moment in the history of digital libraries and digital collections. The team and advisory board exist to provide project management, digitization, and web hosting to smaller and less-funded institutions in order to create access to primary resources across the state. I hope that institutions both large and small can participate in this effort to create a statewide digital repository. In this way, resources from community-based institutions and repositories holding the history of underrepresented groups can be made available for research and review like never before. I will continue to follow the development of the Center closely.

25
May
10

Digitization policies: drafts

In a few weeks, I will have been in my position here for four months. If there is one project that I hope to complete before the end of my first year, it is to create a sustainable digitization process for our library!

With feedback from the digital/web librarian who attempted to create a digitization policy about two years ago and a lot of reading, I created four documents to get our digitization “task force” talking about our project process. These documents, in draft form, are as follows:

  • Digital Collection Development Policy: This document is modeled after the original policy document. It describes types of digitization projects, defines a “digitization advisory group” that decides what projects to do and who will be part of the projects, as well as project selection criteria.
  • Digital Project Life Cycle: This document describes the process of identifying and implementing a digital project. Team roles are described, as well as technical and metadata specs (still in development).
  • Digitization Project Proposal: This is a very short form that groups can fill out to propose a digital project to the “digitization advisory group.”
  • Project Proposal Checklist: This is the checklist that the “digitization advisory group” would use to help the group decide on and prioritize digitization projects. Adapted from Syracuse University Library’s “Digital Library Project Proposal Checklist.”

There are other forms and policies, such as a work order submission form and copyright research policy — I have some great guidance from the Society of Georgia Archivists’ Forms Forum, which has a lot of excellent examples. Some of the other resources I consulted and adapted include:

For me, the development policy and life cycle documents are the most important. Once our “task force” comes to agreement on these documents, they can serve as the backbone for our projects, as well as evidence that we all support a long-term, collaborative digitization effort. Feedback and suggestions are welcome. Thank you for reading!

As an unrelated note, Touchable Archives is the blog of the month for May 2010 at Simmons’ GSLIS!

24
Mar
10

Joining the NC Digital Collections Collaboratory

I’m the newest contributor to the NC Digital Collections Collaboratory! Check out my premiere post about the challenges of creating a digital collections program in my new job. Please leave comments or suggestions — thanks!

16
Jan
10

Beautiful finding aids

Recently, I was presented with a challenge by a tech librarian. He asked me if I could think of any examples of special collections websites with appealing, user-friendly finding aids in EAD. One comment he made: “Archives seem to be the only places still doing a long narrative, like a printed document, on the web.”

My first response was to mention the Online Archive of California, but after that, I realized that my knowledge of visually appealing finding aid design and special collections websites was very limited.

The OAC is one of the first archival initiatives of its kind, because it attempts to digitally collocate archival resources in the state of California. Finding aids here are not only easily discovered through each repository’s website, but also through Google, ArchiveGrid, and OCLC (including OAIster when appropriate). Of course, the appealing interface doesn’t hurt the possibility of user discovery. The finding aids (here’s an example) have more visual interest through use of color blocks and links on the right side, as well as a sans-serif font. Perhaps the best part about a statewide interface? Consistency in design and usability.

The purpose of the site, however, is clear: to search finding aids (also referred to as collection guides). Digital content is tied to relevant collections with a small eyeball icon. Users can browse from A-Z and view brief collection descriptions. Overall the site has a clean interface with a simple purpose. The OAC’s collections are tied to the UC system’s Calisphere, which is a public- and educator-focused search site for over 150,000 digital objects (it also includes teacher modules for K-12). Both of these projects are powered by the California Digital Library.

Because my colleague was interested in EAD finding aids, I decided to start with SAA’s EAD Roundtable website. The site includes a list of early adopters of EAD, so I took a look at how creative some institutions were with representing their finding aids online.

My favorites so far?

Emory’s Manuscripts and Rare Books Library has a great search and browse interface. From the main page, users are informed that they can browse, search, and also search the catalog for resources. The database includes unprocessed collections, which is a pleasant surprise in the era of “hidden collections”. The finding aids themselves are visually interesting, with linked content, as well as icons for the PDF and printable versions (see the James D. Waddell papers for example).

Columbia University’s Archival Collections Portal searches both finding aids and digital content. I think this type of searching is natural for users, making it easier for users to access resources. The finding aids appear to be in a variety of formats depending on the collection, including HTML and PDF, but each record in the portal includes a descriptive summary and subject terms.

Both of these go against the typical left-side menu browsing of many EAD finding aids. I started to realize with my preferences that EAD was less important than the overall visual appeal and ease of use of the finding aid itself. If we can do a full-text search of any text document, why are we doing complex EAD encoding? Why aren’t we just doing HTML? How about catablogs? The idea is that, like MARC, having standards can help researchers find similar resources.

I’m at the beginning of understanding the many reasons to use EAD, but already I find myself questioning it. Jeanne over at Spellbound Blog talked about the possibilities of simpler EAD finding aids in 2008, through the Utah State Historical Society’s next-generation version of the Susa papers. There’s the Jon Cohen AIDS Research Collection, which is a finding aid and digital collection. Then there is the famous Polar Bear Expedition collection of next-generation finding aids.

There seems to be a lot of overlap between finding aids and digital objects, which I’ve seen at Duke and East Carolina University, among others. Then there’s the movement to push our resources onto Flickr, Facebook, Twitter, etc. If repositories host their own finding aids and digital objects, they can repurpose and collocate them anywhere on the web, right?

I still don’t know if I have a good answer for my colleague. I know I have much to learn. I am curious to know…what is your favorite EAD finding aid site? The most beautiful finding aid site?



