Special Lecture

Modeling Language as Social and Cultural Data

Wednesday, May 7, 2025
12:00 pm - 1:30 pm
Lucy Li

Language models (LMs) are powerful because they embed social identities and beliefs. Their increasing capabilities have expanded the disciplinary overlap between AI and other fields, including the social sciences and humanities. My talk will illustrate how I’ve built reciprocal relationships between natural language processing (NLP) and two other fields: sociolinguistics and education. I’ll discuss how a sociolinguistic lens can inform model development by surfacing the implicit social preferences in pretraining data curation practices. In return, LMs can answer sociolinguistic research questions, uncovering the social dynamics of language at billion-word scale. Within education, I will discuss how LMs can support content analyses of school curricula. Then, I’ll show how I leverage educators’ in-domain expertise to create challenging multimodal benchmarks. Altogether, my work, which emphasizes the social aspects of language, contributes to both human-centered model development and empirical studies of social and cultural media.


This event will be live streamed on Zoom. You are welcome to join us either in South Hall or via Zoom. If this is your first time using Zoom, please allow a few extra minutes to download and install the browser plugin or mobile app.

Join the Zoom live stream

Last updated: May 1, 2025