Reading for Bias: Computational Semantics and the Character of Racial Discourse
As machine learning becomes increasingly tasked with consequential real-world decisions, ever more concerns are raised about the kinds of social biases that it reinforces and perpetuates. Machine learning algorithms are not neutral observers of the world, but see what they are trained to see, which means seeing with all the human biases that data encodes. While industry and academic experts in the field have responded by considering how to make these algorithms more fair, cultural and other historians have begun using them to read for patterns of gender and racial bias in the archival record. This talk provides an example of what insights such readings might yield by using word embeddings to explore the semantics of racial discourse in a large corpus of Japanese periodicals and fiction written during the rise and fall of the Japanese empire (1890-1960). I show what explorations of bias at larger scales can tell us about the character of racial discourse as an interpretable pattern, whether interpreted by algorithm or by human.
For machine learning experts, the problem of how to enhance fairness has often focused on interpretability. To make machine learning systems more interpretable is to make their discriminatory tendencies transparent, and thus subject to correction. For cultural historians, interpretability — understood to be an always situated and unevenly shared set of interpretive practices — is inseparable from the issue of how discrimination comes to be recognized in the first place. Whether and how these different approaches to interpretation can speak to one another is at least part of the project of cultural analytics as an emerging field. This talk takes up this challenge first by looking at how racial discourse under unequal power relations has often been read and then by building on these qualitative accounts to develop a quantitative model. I show how an understanding of racial discourse benefits from computational methods that leverage the very repetitive patterns on which such discourse depends. I also consider how an awareness of these patterns, when situated in theories of literary character, can provide interpretive openings into the moments where they break down.
Hoyt Long is associate professor of Japanese literature at the University of Chicago. He is the author of On Uneven Ground: Miyazawa Kenji and the Making of Place in Modern Japan (2012), and publishes widely in the fields of media history and cultural analytics. He co-founded the Chicago Text Lab with Richard Jean So and currently co-directs the Textual Optics Lab, which focuses on the creation of large-scale, multi-lingual text collections and the development of tools to explore them. His recent publications include “Race, Writing, and Computation: Racial Difference and the US Novel, 1880-2000” (Journal of Cultural Analytics, 2019) and “Self-Repetition and East Asian Literary Modernity, 1900-1930” (Journal of Cultural Analytics, 2018). He is currently completing a book manuscript, Figures of Difference, which reframes the history of modern Japanese literature through quantitative methods and their capacity to reason about difference across multiple scales.