Gesamtkunstvektoren: Perceptual Embeddings for the Performing Arts
Co-sponsored by the Berkeley Institute for Data Science and the School of Information
Deep neural models capable of encoding semantic aspects of multiple perceptual modalities offer exciting new opportunities for computational analyses of cultural expressive forms in the performing arts, particularly those that derive much of their effectiveness from the layering of different forms of media.
This presentation will detail efforts to extend existing work on the analysis of recorded theater performances by applying new technologies such as Meta’s Perception Encoder Audiovisual (PE-AV) family of models, which can generate aligned semantic vector embeddings of video, audio, and language modalities singly or jointly. These models raise the possibility of computationally augmenting the consideration of works through the lens of the Wagnerian Gesamtkunstwerk, the notion that the worth of art can derive from the totality of its interwoven components (though first acknowledging the biases inherent both to the theory and to the AI models).
Of particular interest is how these models can complement and extend previous inquiries involving AI-based stylometry of pose and action in contemporary theater, as well as prior efforts to manually annotate the degrees of intermediality in the formal sections of specific recorded performances of Japanese Noh plays.
Speaker
Peter Broadwell
Peter Broadwell is the head of AI modeling and inference in Research Data Services at the Stanford University Libraries, where his team’s work applies AI and machine learning, web-based visualization, and other methods of digital analysis to complex cultural data.
He has a Ph.D. in musicology from UCLA and an M.S. in computer science from the University of California, Berkeley. Recently, he has contributed to projects involving automatic translation and indexing of folklore collections in multiple languages, deep learning-based analyses of theater choreography from video sources, and web-based parsing and playback of digitized player piano rolls.
