The following faculty research projects are currently under way at the iSchool:
This is a suite of research projects related to user interfaces for search, text data mining and empirical computational linguistics, and automating web site evaluation.
- FLAMENCO: We are exploring new ways to incorporate metadata into search
interfaces.
- BioText: Analyzing
the biological scientific literature by creating new natural language processing
algorithms, and designing new search tools to help biologists search for and
synthesize information.
- Massive Online Social Interaction: Studying online dating to better understand and
improve computer-mediated communication.
- Information Visualization: Creating innovative methods of visualizing abstract
information effectively.
- XML and Modeling:
This initiative is focused on closing the gap between the technical world of creating XML
Schemas and the business-oriented world of modeling information flows, starting from better
representations for XML Schema, through languages to extract information from XML Schema,
up to the question of how schemas can be derived from business-level models and still be
kept up-to-date.
- Project M: Project M is an effort to unify complementary system analysis and development
methodologies, in particular those of Document Engineering and User Centered Design. Each
methodology is a workflow whose separate steps are the modeling activities that produce some
normative model artifact (for example: use cases, business process models, document models,
XML schemas, personas, wireframes). By harmonizing the semantics of the metamodels that describe
these artifacts, we hope to facilitate their reuse and synthesis to construct a complete domain
and application model.
- California E-Services: Historically, government services have not been customer-centric.
Due to the intricacies of the organizational structure, these services are often distributed:
department-based and not coordinated across responsible parties. Better and more user-friendly
service could be provided via composite applications, designed around customer needs rather
than the structure of the government. This project will analyze the existing business registration
system in the state of California and develop recommendations for the eServices Office initiative
to improve the service that the state provides to new business registrants.
The Cheshire projects develop advanced probabilistic search technologies for research
and production Information Retrieval.
Economics-Informed Network Design
This study is an attempt to measure how much information is produced in the world
each year. We look at several media and estimate yearly production, accumulated stock,
rates of growth, and other variables of interest. (See also the original "How Much Information?" study, released in 2000.)
Information Exchange and Development
This research explores how institutions and social structures shape local and global patterns of information exchange and learning—and their consequences for innovation and economic development.
- From Brain Drain to “Brain Circulation”: This project
examines how the shift from “brain drain” to “brain
circulation” is transforming international patterns of technology
development. It focuses on Silicon Valley’s skilled immigrants and their
business and technical networks in China, India, Taiwan, and Israel.
- Clusters in Transition: Toward global networks of innovation:
Finland is typically seen as the model of a successful advanced economy that has
established its innovative capacity through strong clusters in IT, forest
products, and machinery. This project argues that Finland’s competitiveness
is now threatened by the limits of this cluster-based governance structure and
explores the possibilities for a transition to a model based on global collaboration
networks. This case has important implications for other advanced economies.
The Metadata Research Program explores information retrieval in a networked environment. We design, build, and experiment with front-end prototypes, strategic search commands, entry vocabulary modules, and multi-database navigation.
- Unfamiliar Metadata: Searching is likely to be most efficient when the searcher
is familiar with the classification, categorizing, and metadata vocabularies being
searched. The increase in network-accessible databases and the widespread adoption
of metadata vocabularies mean that searches will increasingly be in metadata
vocabularies that are unfamiliar to the searcher. This project will develop Entry
Vocabulary Modules that accept topical statements in the searcher's terms and
respond with a ranked list of terms in the system's vocabulary.
- Seamless Searching of Numeric and Textual Resources: A research project to
demonstrate improved access to textual material and numerical data on the same
topic when searching two very different kinds of databases: bibliographical (for
books, articles, patents, etc.) and numerical data-sets (socio-economic databases).
- Translingual Information Management: Investment in the creation of online
bibliographies and digital libraries has resulted in a body of tens of millions
of pre-categorized and pre-classified records in all languages. The goal is to
show how these resources can be used to improve crosslingual searching,
information management, and resources for language engineering.
- Support for the Learner: What, Where, When, and Who: When setting out to learn about a new
topic, a well-tested practice is to follow the traditional "5Ws and the H": Who?,
What?, When?, Where?, Why?, and How? — and to use the corresponding specialized
library resources: biographical dictionaries, subject catalogs, chronologies, and
gazetteers. The digital environment is still weak in providing an effective counterpart
of the traditional reference library. The purpose of this project is to show how
existing and emerging standards and protocols can be used or adapted to support
learners with respect to What? Where? When? and Who? A client interface with links
to existing specialized resources is being created for two uses: helping teachers find additional
resources to supplement textbooks, and contextualizing objects in library and
museum collections by identifying the objects in other collections that are most
closely related.
- Bringing Lives to Light: Biography in Context: Cultural heritage,
history, and social sciences are fundamentally about human activity. Everyone is
interested in what other people do and have done. Life-stories are hard to beat as
a basis for narrative and for engaging interest, and biographies are regularly among
the best-sellers. Not only History, but also Geography and most other subjects can
come alive in the travelogues, journeys of discovery, and the life-stories of those
involved. Science can be explained through the work of scientists. Engineering is
routinely explained through the heroic struggles of inventors. Even natural history
is often taught through the unfolding drama of the activities of an individual animal
during its life-cycle or through the seasons of the year. The objective of this project
is to design, demonstrate and evaluate techniques that would bring lives to light by
revealing them in their contexts. In pursuit of this goal we are collaborating with
scholars, archivists, and librarians to develop and enhance metadata structures that
can be used to capture and record the events of individual lives and link them to the
larger world of place, time, and topical context.
Several SIMS faculty and students are participating in the UC Berkeley Digital Library project. The goal of this project is to develop the technologies for intelligent access to massive, distributed collections of multi-media documents including photographs, satellite images, videos, full text documents, and "multivalent" documents comprised of multiple terabyte databases.
RISC: Re-thinking Industry Structure under Convergence
This is an on-going collection of projects related to market structure in what traditionally were separate industries—voice telephony, data communications, video delivery, and internet access. Specific assignments have included: