Information Access Seminar

Update on the Social Networks and Archival Context (SNAC) Project

Friday, February 10, 2012
3:10 pm to 5:00 pm
Ray Larson and Brian Tingle

Archivists have a long history of describing the people who — acting individually, in families, or in formally organized groups — create and collect primary sources. They research and describe the artists, political leaders, scientists, government agencies, soldiers, universities, businesses, families, and others who create and are represented in the items that are now part of our shared cultural legacy. However, because archivists have traditionally described records and their creators together, this information is tied to specific resources and institutions.

The SNAC project is using digital technology to "unlock" descriptions of people from descriptions of their records and link them together in interesting new ways. We are aggregating and interrelating those descriptions using the Encoded Archival Context - Corporate Bodies, Persons, and Families (EAC-CPF) standard. Work on the SNAC project is being conducted by a consortium consisting of the Institute for Advanced Technology in the Humanities (University of Virginia), the UC Berkeley I School, and the California Digital Library.

To represent the network of relationships among corporate bodies, persons, and families, the merged EAC-CPF XML records have been loaded into a graph database, which powers an interactive network visualization and generates linked data published through a SPARQL endpoint. The talk will review the TinkerPop graph database stack and the Apache Jena linked data technologies used for this processing. The graph and RDF data sets and APIs published by the project will also be described, including an overview of key sections of the graph-processing source code.

In this presentation we will give an update on the SNAC project and demonstrate the public access interface for the SNAC database, including social network visualizations of SNAC persons, corporate bodies, and families. The SNAC project is currently funded by the National Endowment for the Humanities and by a grant from the Mellon Foundation. We will also discuss future plans for the project.

Brian Tingle is technical lead for digital special collections at the California Digital Library.

Ray Larson is a professor in the School of Information.

