Clustering and Classification

Friday, March 13, 1998

School of Information Management and Systems

UC Berkeley

This workshop will examine recent progress in clustering and categorization of textual information.

Agenda

8:20-8:30 Hal Varian: "Welcome and logistics"

8:30-9:30 Ray Larson: "Cheshire II and Automatic Categorization"
See the Cheshire II page.

9:30-10:30 Isaac Cheng and Robert Wilensky: "Automatic Web Classification"
Based on the paper, "An Experiment in Enhancing Information Access by Natural Language Processing"

10:30-11:00 Coffee Break

11:00-12:00 Susan Dumais and David Heckerman: "Inductive Learning Methods for Text Classification"
Background in "Models and Selection Criteria for Regression and Classification" by David Heckerman and Christopher Meek.

12:00-1:00 Lunch

1:00-2:00 Mehran Sahami: "SONIA: A Service for Organizing Networked Information Autonomously"
See the description on Mehran's homepage.

2:00-3:00 Marti A. Hearst and Jan O. Pedersen: "Reexamining the Cluster Hypothesis: Scatter/Gather on Retrieval Results"
Based on the paper of the same name.

3:00-3:15 Cookies and sodas
Adjust blood sugar level.

3:15-4:00 Eric Brewer: "Integrating Search and Directories"
We have developed techniques to scale automatic classification to more than 100M documents. This enables a merger of search engines and directories that is more useful than either alone.

4:00-4:30 David Gibson: "Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text"
Authors: S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, Prabhakar Raghavan, and S. Rajagopalan.

6:00 PM Dinner
Chinese Banquet at Hong Kong East Ocean, 3199 Powell Street, Emeryville, 655-388. Map and directions will be available.

Slides

PowerPoint presentations are available here.

Attendees

SIMS Affiliates