The US Census and Social Media Archiving
Social media archiving at an institutional level (like the Library of Congress’s now-truncated effort to archive Twitter) is viewed as a grand-challenge technical problem. Yet the future use of, and modes of participation in, institutional social media archives have received less attention. Social media users are generally averse to such archiving efforts, even if only public accounts and data are collected. Just a small minority perceive any long-term value in such an enormous effort; far more feel that the risk is unjustified (and continue to feel this way in the wake of scandals like Cambridge Analytica).
What can we learn from the US Census and the questions it was slated to answer in 1940, when 120,000 enumerators went door-to-door and conducted more than 37 million in-person household interviews? What stories does the census tell (both inadvertently and intentionally) 80 years later, now that the source data has been released from embargo? I will discuss use, participation, and confidentiality issues entailed by social media archiving through the lens of recent study results, coupled with examples drawn from contemporaneous reactions and responses to the 1940 US Census.
Cathy Marshall is an adjunct professor at the Center for the Study of Digital Libraries at Texas A&M University. She was previously a principal researcher at Microsoft Research, Silicon Valley.