Achieving any specific level of cybersecurity inevitably entails making compromises with regard to cost, function, and convenience, as well as trade‑offs between societal values, such as openness, privacy, freedom of expression, and innovation. In defining regulations and incentives, decisions have to be made about how to balance these trade‑offs while optimizing security outcomes. To further…
Models of learning in EDM and LAK are pushing the boundaries of what can be measured from large quantities of historical data. When controlled randomization is present in the learning platform, such as randomized ordering of problems within a problem set, natural quasi-randomized controlled studies can be conducted post hoc. Difficulty and learning gain attribution are among the factors of…
This article presents a material ecosystemic approach as a theoretical grounding for understanding digital technologies as potential catalysts of socioeconomic development. Through such an approach, talk of “technology” is replaced by talk of the “material.” Material is understood as inclusive of the human-made as well as the natural, of human relationships, human bodies, and words spoken. And…
Consumer protection regulators worldwide share a basic problem: the companies that regulators police are so powerful and rich that fines do not matter. Consider the French with their €150,000 fine against Google in 2014. Efficacious fines against dominant platforms would have to rise to nine-figure levels to cause change, but consumer protection agencies generally lack the authority and political…
FTC Privacy Law and Policy is a broad-ranging primer on the FTC’s consumer protection mission. It is the first hundred-year history of the agency’s consumer protection activities, and it links consumer cases to modern internet privacy efforts. The book offers practical tips for lawyers, strategy for advocates, and insight for policymakers on the challenge of addressing unfair…
In this paper, we address issues of transparency, modularity, and privacy with the introduction of an open source, web-based data repository and analysis tool tailored to the Massive Open Online Course community. The tool integrates data request/authorization and distribution workflow features as well as provides a simple analytics module upload format to enable reuse and replication of analytics…
This article considers the issue of opacity as a problem for socially consequential mechanisms of classification and ranking, such as spam filters, credit card fraud detection, search engines, news trends, market segmentation and advertising, insurance or loan qualification, and credit scoring. These mechanisms of classification all frequently rely on computational algorithms, and in many cases…
In Miyamoto et al. (2015, this issue) the authors looked to substantiate the presence of the spacing effect, referenced from the psychology literature, in several MOOCs. Their secondary analyses constituted a robust, empirical finding on the correspondence between session distribution and certification but with only a coarse, analogous relationship to the theory of distributed practice. This…
Native advertising is the new term for “advertorials,” advertisements disguised as editorial content. Modern native advertising started in the 1950s, but its first uses were clearly signaled to the consumer. This paper explains why consumers might be misled by advertorials—even when labeled as such—when advertising material has elements of editorial content.
This dissertation investigates the roles of automated software agents in two user-generated content platforms: Wikipedia and Twitter. I analyze “bots” as an emergent form of sociotechnical governance, raising many issues about how code intersects with community. My research took an ethnographic approach to understanding how participation and governance operate in these two sites, including…
An examination of corporate privacy management in the United States, Germany, Spain, France, and the United Kingdom, identifying international best practices and making policy recommendations.
Barely a week goes by without a new privacy revelation or scandal. Whether by hackers or spy agencies or social networks, violations of our personal information have shaken entire industries, corroded…
Researchers associated with the UC Berkeley School of Information and School of Law, the Berkeley Center for Law and Technology, and the International Computer Science Institute (ICSI) released a workshop report detailing legal barriers and other disincentives to cybersecurity research, and recommendations to address them. The workshop held at Berkeley in April, supported by the National Science…
To explain the uncanny holding power that some technologies seem to have, this paper presents a theory of charisma as attached to technology. It uses the One Laptop per Child project as a case study for exploring the features, benefits, and pitfalls of charisma. It then contextualizes OLPC's charismatic power in the historical arc of other charismatic technologies, highlighting the…
In 1778, Vicesimus Knox declared his time the “Age of Information,” suggesting, in a fashion recognizable today, that the period had severed connections with prior ages. This paper examines Knox’s claim by exploring changes in conceptions of information across the eighteenth century. It notes in particular shifts in the concept’s personal and political implications, reflected in the different…
This dissertation grapples with the questions: Does transparency lead to accountability? Is it possible to "democratize" surveillance, turning surveillance into an instrument of democratic control over state bureaucracy? Can a state bureaucracy combine visions of surveillance within the state and "openness" to citizens to help police itself? To address these questions, I studied an "open…
An emerging field of educational data mining (EDM) is building on and contributing to a wide variety of disciplines through analysis of data coming from various educational technologies. EDM researchers are addressing questions of cognition, metacognition, motivation, affect, language, social discourse, etc. using data from intelligent tutoring systems, massive open online courses, educational…
Since the early-to-mid 2000s, South Africa's Western Cape and Kenya's capital city Nairobi have been attracting flows of trade and investments in information technology-enabled services (ITES). The flows are small but significant and growing, with multinational companies like Amazon, Google, IBM, and others locating and developing market niches in these regions. Why have these regions managed to…