There are growing discussions within the research community about how to adapt study design given the widespread availability of Generative Artificial Intelligence (GenAI), including Large Language Models (LLMs). While much prior research has focused on LLM use from a researcher perspective (e.g., detecting and screening for LLM use), we present a complementary study from the perspective of…
AI systems and tools today can generate human-like expressions on behalf of people. This raises the crucial question of how to sustain human agency in AI-mediated communication. We investigated this question in the context of machine translation (MT) assisted conversations. Our participants included 45 dyads. Each dyad consisted of one new immigrant in the United States, who leveraged MT for…
Our study of 20 knowledge workers revealed a common challenge: the difficulty of synthesizing unstructured information scattered across multiple platforms to make informed decisions. Drawing on their vision of an ideal knowledge synthesis tool, we developed Yodeai, an AI-enabled system, to explore both the opportunities and limitations of AI in knowledge work. Through a user study with 16 product…
Eliciting youth perspectives on technology presents unique challenges, as traditional research methods often feel formal, abstract, or disconnected from teens’ lived experiences. Building on HCI research engaged with present-day sociotechnical experiences, our work examines teens’ impressions of their technological futures through design fiction. In this study, we conducted design fiction…
We introduce Being The Creek, a mobile augmented reality (MAR) experience that invites participants to take a “first-person” perspective of a historically significant creek by lying alongside her and getting attuned to her environment through embodied multisensory engagement. Individuals experience how the world might appear from the Creek’s perspective, from the pre-colonial respect she received…
Recent advancements in general-purpose AI have highlighted the urgent need to align AI systems with the goals, ethical principles, and values of individuals and society. Existing alignment research has been primarily approached as an AI-centered, static, and unidirectional process. However, this unidirectional perspective falls short of taking into account the dynamic and evolving interaction…
We describe DeepSpeak, a large-scale dataset of real and deepfake footage of people talking and gesturing in front of their webcams. The real videos in this dataset consist of a total of 50 hours of footage from 500 diverse individuals. Constituting more than 50 hours of footage, the fake videos consist of a range of different state-of-the-art avatar, face-swap, and lip-sync deepfakes with…
As generative artificial intelligence (AI) continues its ballistic trajectory, everything from text to audio, image, and video generation continues to improve at mimicking human-generated content. Through a series of perceptual studies, we report on the realism of AI-generated voices in terms of identity matching and naturalness. We find human participants cannot consistently identify recordings…
Using computational methods, we investigate a data set of 874,125 sentences from 30 U.S. history textbooks used in California and Texas schools to consider how they discuss Asians/Asian Americans. Only 1% of all sentences in our sample mention Asians. Most of these sentences focus on Chinese and Japanese, and when individuals are named, they are usually White. The most prevalent topics…
Background: Blockchain technology has capabilities that can transform how sensitive personal health data are safeguarded, shared, and accessed in digital health research. Women’s health data are considered especially sensitive, given the privacy and safety risks associated with their unauthorized disclosure. These risks may affect research participation. Using a privacy-by-design approach, we…
Developing end-to-end bioinformatics workflows is challenging, demanding deep expertise in both genomics and computational techniques. While large language models (LLMs) provide some assistance, they often lack the nuanced guidance required for complex bioinformatics tasks, and are resource-intensive. We thus propose a multi-agent system built on small language models, fine-tuned on…
Cyber-attacks on healthcare entities and leaks of personally identifiable information (PII) are a growing threat. However, it is now possible to learn sensitive characteristics of an individual without PII, by combining advances in artificial intelligence, analytics, and online repositories. We discuss privacy threats and privacy engineering solutions, emphasizing the selection of privacy enhancing…
Understanding how pediatric patient room design influences comfort and care perceptions is critical for improving healthcare environments. This study examines the visual engagement and subjective experiences of 23 children (8–17 years) and 21 parents using a Virtual Reality (VR) and eye-tracking approach with photographic stimuli of thirty-two pediatric acute care rooms. A mixed-methods framework…
From a simple text prompt, generative-AI image models can create stunningly realistic and creative images bounded, it seems, by only our imagination. These models have achieved this remarkable feat thanks, in part, to the ingestion of billions of images collected from nearly every corner of the internet. Many creators have understandably expressed concern over how their intellectual property has…
Diagnosis of soil-transmitted helminthiasis and schistosomiasis for surveillance relies on microscopic detection of ova in Kato–Katz (KK) prepared slides. Artificial intelligence (AI)-based platforms for parasitic eggs may be developed using a robust image set with defined labels by reference microscopists. This study aimed to determine interobserver variability among reference microscopists in…
Online interpersonal harm, such as harassment and discrimination, is prevalent on social media platforms. Most platforms adopt content moderation as the primary solution, relying on measures like bans and content removal. These measures follow principles of punitive justice, which holds that perpetrators of harm should receive punishment in proportion to the offense. However, these strategies…
With drastic changes to abortion policy, the months following the Dobbs leak and subsequent decision in 2022 were a uniquely uncertain and difficult time for abortion access in the United States. To understand experiences of challenges to abortion access during that time, we used a hybrid inductive and deductive thematic coding approach to analyse descriptions of barriers and their impacts shared…
Personal mobility data from mobile phones and other sensors are increasingly used to inform policymaking during pandemics, natural disasters, and other humanitarian crises. However, even aggregated mobility traces can reveal private information about individual movements to potentially malicious actors. This paper develops and tests an approach for releasing private mobility data, which provides…
Understanding how people perceive algorithmic decision-making remains critical, as these systems are increasingly integrated into areas such as education, healthcare, and criminal justice. These perceptions can shape trust in, compliance with, and the perceived legitimacy of automated systems. Focusing on San Francisco’s decade-long policy of algorithmic school assignments, we draw on…
The months following the Dobbs v. Jackson Women’s Health Organization leak and subsequent decision in 2022, which removed federal legal protections of abortion, presented a challenging period for abortion in the United States and a time when sources of and experiences with social support likely changed rapidly. We used thematic analysis of randomly selected relevant posts from an abortion…