UC Berkeley School of Information

Faculty Research

The following faculty research projects are currently under way at the iSchool:

Bailando Projects

This is a suite of research projects related to user interfaces for search, text data mining and empirical computational linguistics, and automating web site evaluation.

  • FLAMENCO: We are exploring new ways to incorporate metadata into search interfaces.
  • BioText: Analyzing the biological scientific literature by creating new natural language processing algorithms, and designing new search tools to help biologists search for and synthesize information.
  • Massive Online Social Interaction: Studying online dating to better understand and improve computer-mediated communication.
  • Information Visualization: Creating innovative methods of visualizing abstract information effectively.

Center for Document Engineering Projects

  • XML and Modeling: This initiative is focused on closing the gap between the technical world of creating XML Schemas and the business-oriented world of modeling information flows, starting from better representations for XML Schema, through languages to extract information from XML Schema, up to the question of how schemas can be derived from business-level models and still be kept up-to-date.
  • Project M: Project M is an effort to unify complementary system analysis and development methodologies, in particular those of Document Engineering and User Centered Design. Each methodology is a workflow whose separate steps are the modeling activities that produce some normative model artifact (for example: use cases, business process models, document models, XML schemas, personas, wireframes). By harmonizing the semantics of the metamodels that describe these artifacts, we hope to facilitate their reuse and synthesis to construct a complete domain and application model.
  • California E-Services: Historically, government services have not been customer-centric. Due to the intricacies of the organizational structure, these services are often distributed: department-based and not coordinated across responsible parties. Better and more user-friendly service could be provided via composite applications, designed around customer needs rather than the structure of the government. This project will analyze the existing business registration system in the state of California and design recommendations for the eServices Office initiative to improve the service that the state provides to new business registrants.

Cheshire II and III

The Cheshire projects develop advanced probabilistic search technologies for research and production Information Retrieval.

Economics-Informed Network Design

  • p2pecon@berkeley: Economics-informed design of peer-to-peer, ad-hoc and overlay networks.
  • The 100x100 Project: Clean slate approach to network architecture design.
  • The Denali Project: Next generation scalable services for the global Internet.

How Much Information? 2003

This study is an attempt to measure how much information is produced in the world each year. We look at several media and estimate yearly production, accumulated stock, rates of growth, and other variables of interest. (See also the original "How Much Information?" study, released in 2000.)

Information Exchange and Development

This research explores how institutions and social structures shape local and global patterns of information exchange and learning—and their consequences for innovation and economic development.

  • From Brain Drain to “Brain Circulation”: This project examines how the shift from “brain drain” to “brain circulation” is transforming international patterns of technology development. It focuses on Silicon Valley’s skilled immigrants and their business and technical networks in China, India, Taiwan, and Israel.
  • Clusters in Transition: Toward global networks of innovation: Finland is typically seen as the model of a successful advanced economy that has established its innovative capacity through strong clusters in IT, forest products, and machinery. This project argues that Finland’s competitiveness is now threatened by the limits of this cluster-based governance structure and explores the possibilities for a transition to a model based on global collaboration networks. This case has important implications for other advanced economies.

Metadata Research Program

The Metadata Research Program explores information retrieval in a networked environment. We design, build, and experiment with front-end prototypes, strategic search commands, entry vocabulary modules, and multi-database navigation.

  • Unfamiliar Metadata: Searching is likely to be most efficient when the searcher is familiar with the classification, categorizing, and metadata vocabularies being searched. The increase in network-accessible databases and the widespread adoption of metadata vocabularies mean that searches will increasingly be in metadata vocabularies that are unfamiliar to the searcher. This project will develop Entry Vocabulary Modules that accept topical statements in the searcher's terms and respond with a ranked list of terms in the system's vocabulary.
  • Seamless Searching of Numeric and Textual Resources: A research project to demonstrate improved access to textual material and numerical data on the same topic when searching two very different kinds of databases: bibliographical (for books, articles, patents, etc.) and numerical data-sets (socio-economic databases).
  • Translingual Information Management: Investment in the creation of online bibliographies and digital libraries has resulted in a body of tens of millions of pre-categorized and pre-classified records in all languages. The goal is to show how these resources can be used to improve crosslingual searching, information management, and resources for language engineering.
  • Support for the Learner: What, Where, When, and Who: When setting out to learn about a new topic, a well-tested practice is to follow the traditional "5Ws and the H" (Who?, What?, When?, Where?, Why?, and How?) and to use the corresponding specialized library resources: biographical dictionaries, subject catalogs, chronologies, and gazetteers. The digital environment is still weak in providing an effective counterpart of the traditional reference library. The purpose of this project is to show how existing and emerging standards and protocols can be used or adapted to support learners with respect to What? Where? When? and Who? A client interface with links to existing specialized resources is being created for two purposes: helping teachers find additional resources to supplement textbooks, and contextualizing objects in library and museum collections by identifying the objects in other collections that are most closely related.
  • Bringing Lives to Light: Biography in Context: Cultural heritage, history, and social sciences are fundamentally about human activity. Everyone is interested in what other people do and have done. Life-stories are hard to beat as a basis for narrative and for engaging interest, and biographies are regularly among the best-sellers. Not only History, but also Geography and most other subjects can come alive in the travelogues, journeys of discovery, and the life-stories of those involved. Science can be explained through the work of scientists. Engineering is routinely explained through the heroic struggles of inventors. Even natural history is often taught through the unfolding drama of the activities of an individual animal during its life-cycle or through the seasons of the year. The objective of this project is to design, demonstrate, and evaluate techniques that would bring lives to light by revealing them in their contexts. In pursuit of this goal we are collaborating with scholars, archivists, and librarians to develop and enhance metadata structures that can be used to capture and record the events of individual lives and link them to the larger world of place, time, and topical context.
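
As a rough illustration of the Entry Vocabulary Module idea described under Unfamiliar Metadata above (a topical statement in the searcher's words goes in, a ranked list of terms from the system's vocabulary comes out), the following is a minimal sketch only. It assumes a simple word-to-term co-occurrence count learned from already-indexed records; the project's actual method is not described here, and the sample records, function names, and scoring rule are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical training data: free-text titles paired with the vocabulary
# terms an indexer assigned to them. Not taken from any real database.
RECORDS = [
    ("solar panels for home heating", ["Solar energy", "Heating"]),
    ("wind and solar power generation", ["Solar energy", "Wind power"]),
    ("home heating with wood stoves", ["Heating"]),
]


def build_associations(records):
    """Count how often each word co-occurs with each vocabulary term."""
    assoc = defaultdict(Counter)
    for text, terms in records:
        for word in set(text.lower().split()):
            for term in terms:
                assoc[word][term] += 1
    return assoc


def suggest_terms(query, assoc, limit=5):
    """Rank vocabulary terms by summed co-occurrence with the query words.

    Assumed scoring rule for illustration: plain counts, ties broken
    alphabetically so the output is deterministic.
    """
    scores = Counter()
    for word in set(query.lower().split()):
        scores.update(assoc.get(word, {}))
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:limit]


if __name__ == "__main__":
    associations = build_associations(RECORDS)
    for term, score in suggest_terms("solar power", associations):
        print(f"{score}  {term}")
```

A real module would presumably need far more training records and a better-founded weighting than raw counts; the sketch only shows the shape of the mapping from a searcher's words to a ranked list of system terms.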

NSF Digital Libraries

Several SIMS faculty and students are participating in the UC Berkeley Digital Library project. The goal of this project is to develop the technologies for intelligent access to massive, distributed collections of multi-media documents including photographs, satellite images, videos, full text documents, and "multivalent" documents comprised of multiple terabyte databases.

RISC: Re-thinking Industry Structure under Convergence

This is an on-going collection of projects related to market structure in what traditionally were separate industries—voice telephony, data communications, video delivery, and internet access. Specific assignments have included:

  • Unbundled Network Elements
  • Reports on Public Access to Broadband Networks
  • Statement to California Public Utilities Commission En Banc Hearing Regarding Assessing and Revising the Regulation of Telecommunications Utilities
  • Competition in Wired Video Delivery