Large language model (LLM) applications such as agents and domain-specific reasoning increasingly rely on context adaptation -- modifying inputs with instructions, strategies, or evidence, rather than weight updates. Prior approaches improve usability but often suffer from brevity bias, which drops domain insights for concise summaries, and from context collapse, where iterative rewriting erodes…
Nutrition security, an emerging concept distinct from and complementary to food security, has gained increasing interest as a focus of efforts to reduce hunger, improve nutrition, and prevent diet-related health conditions. Yet, unlike food security, which has well-established measures, nutrition security lacks standardized measures for assessment. This gap hinders the ability to evaluate…
Data visualization is a core part of statistical practice and is ubiquitous in many fields. Although there are numerous books on data visualization, instructors in statistics and data science may be unsure how to teach data visualization, because it is such a broad discipline. To give guidance on teaching data visualization from a statistical perspective, we make two contributions. First, we…
Digital loans, which provide short-term, high-interest credit via mobile phones, have exploded in popularity across low- and middle-income countries. This paper reports the results of a randomized evaluation of a digital loan product in Nigeria. Being randomly approved for a loan (among those who otherwise would have been denied) substantially increases subjective well-being after 3 months, but…
Expertise in cognitively and motorically demanding tasks, such as indoor climbing and bouldering, is often associated with enhanced planning abilities, yet the specific relationship between cognitive and motor planning in such tasks remains underexplored. This study investigates how expertise influences route planning in bouldering, with a focus on the impact of problem difficulty. We asked…
Language shows up everywhere. It's in the digital content we circulate online, and it's in our conversations with each other. It's also in the training data and generations of language models, which are increasingly integrated into our everyday lives. Language is powerful because it embeds social identities and beliefs: it expresses who we are, and shapes our understanding of each other. Thus,…
Background: Abortion access in the United States has been in a state of rapid change and increasing restriction since the Dobbs v Jackson Women’s Health Organization decision from the US Supreme Court in June 2022. With further constraints on access to abortion since Dobbs, the internet and online communities are playing an increasingly important role in people’s abortion trajectories. There is a…
Following the leak of the Dobbs decision in 2022, abortion access in the United States has faced heightened barriers, including legal restrictions, financial constraints, and logistical challenges. In response, individuals seeking abortion care can employ innovative behavioral strategies to overcome these barriers and reshape their abortion experiences (ie, “behavioral innovations”). This paper…
Large language models (LLMs) have been extensively evaluated on medical question answering tasks based on licensing exams. However, real-world evaluations often depend on costly human annotators, and existing benchmarks tend to focus on isolated tasks that rarely capture the clinical reasoning or full workflow underlying medical decisions. In this paper, we introduce ER-Reason, a benchmark…
Conventional bag-of-words approaches for topic modeling, like latent Dirichlet allocation (LDA), struggle with literary text. Literature challenges lexical methods because narrative language focuses on immersive sensory details instead of abstractive description or exposition: writers are advised to “show, don’t tell.” We propose Retell, a simple, accessible topic modeling approach for literature…
Passively collected "big" data sources are increasingly used to inform critical development policy decisions in low- and middle-income countries. While prior work highlights how such approaches may reveal sensitive information, enable surveillance, and centralize power, less is known about the corresponding privacy concerns, hopes, and fears of the people directly impacted by these policies ---…
A series of recent papers demonstrate that mobile phone metadata can, together with machine learning, estimate the wealth of individual subscribers and accurately target cash transfer programs. In the context of an emergency cash transfer program in Haiti, we combine surveys and mobile phone call detail records (CDR) to test whether such methods can be used to estimate the program’s impact on…
The technology industry has long sought to diversify its workforce. This study evaluates one avenue that works against these efforts: the interaction between recruiter work practices and algorithmic recruiting tools. Through interviews and cognitive walkthroughs with fifteen recruiters, we find that recruiters—often under deadlines and quotas—develop shortcuts (e.g., computer science degrees and…
Social movement organizations, such as mutual aid groups, rely on technology to increase their influence, meet immediate needs, and address systemic inequalities. In this paper, we examine the role of technology in moments of crisis and the tensions mutual aid groups face when relying on tools designed with values that may be antithetical to their own. Through a qualitative study with mutual aid…
There are growing discussions within the research community about how to adapt study design given the widespread availability of Generative Artificial Intelligence (GenAI), including Large Language Models (LLMs). While much prior research has focused on LLM use from a researcher perspective (e.g., detecting and screening for LLM use), we present a complementary study from the perspective of…
AI systems and tools today can generate human-like expressions on behalf of people. This raises the crucial question of how to sustain human agency in AI-mediated communication. We investigated this question in the context of machine translation (MT) assisted conversations. Our participants included 45 dyads. Each dyad consisted of one new immigrant in the United States, who leveraged MT for…
Our study of 20 knowledge workers revealed a common challenge: the difficulty of synthesizing unstructured information scattered across multiple platforms to make informed decisions. Drawing on their vision of an ideal knowledge synthesis tool, we developed Yodeai, an AI-enabled system, to explore both the opportunities and limitations of AI in knowledge work. Through a user study with 16 product…
Eliciting youth perspectives on technology presents unique challenges, as traditional research methods often feel formal, abstract, or disconnected from teens’ lived experiences. Building on HCI research engaged with present-day sociotechnical experiences, our work examines teens’ impressions of their technological futures through design fiction. In this study, we conducted design fiction…
We introduce Being The Creek, a mobile augmented reality (MAR) experience that invites participants to take a “first-person” perspective of a historically significant creek by lying alongside her and getting attuned to her environment through embodied multisensory engagement. Individuals experience how the world might appear from the Creek’s perspective, from the pre-colonial respect she received…
Recent advancements in general-purpose AI have highlighted the urgent need to align AI systems with the goals, ethical principles, and values of individuals and society. Existing alignment research has been primarily approached as an AI-centered, static, and unidirectional process. However, this unidirectional perspective falls short of taking into account the dynamic and evolving interaction…