May 10, 2021

ScholarPhi: A Novel Interface for Reading Scientific Papers

To help scientists deal with the increasing volume of published scientific literature, a research team at the I School is designing ScholarPhi, an augmented reading interface that makes scientific papers more understandable and contextually rich.

The project is led by UC Berkeley School of Information Professor Marti Hearst and includes UC Berkeley postdoctoral fellows Andrew Head and Dongyeop Kang, along with collaborators Raymond Fok, Kyle Lo, Sam Skjonsberg, and Dan Weld from the Allen Institute for AI (AI2) and the University of Washington. It is funded in part by the Alfred P. Sloan Foundation and by AI2.

The ScholarPhi project aims to broaden access to scientific literature by developing a new document reader interface, together with natural language analysis algorithms that supply context-relevant explanations of technical terms and notation.

“The goal of this project is to help democratize understanding of complex scientific literature,” said Professor Hearst. “The AI literature is a case study; the papers are often technically dense. This is our take on explainable AI.”

The Challenge

The key challenge in designing the interface is coming up with interactions that show helpful information about terms and symbols without getting in a reader’s way. Given the difficulty of reading a scientific paper, small design gaffes can lead to unpleasant reading experiences.

With this challenge in mind, the team has designed an innovative interface with four features. First, the interface shows definitions of terms and symbols in compact tooltips. Second, it automatically diagrams equations, showing definitions of all symbols in the margins of any equation that a reader clicks. Third, it adds a priming glossary to the beginning of the paper that shows definitions of all terms and symbols in one place. Finally, it “declutters” the paper on demand: the reader selects a symbol or term of interest, and the interface lowlights or hides all passages that do not contain it.
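
To make the declutter idea concrete, here is a minimal sketch in Python. It is not ScholarPhi's actual implementation: the function name `declutter`, the whole-token matching rule, and the sample sentences are all assumptions made for illustration. Given a list of sentences and a selected term or symbol, it marks which sentences mention the selection and which would be lowlighted.

```python
import re

def declutter(sentences, term):
    """Pair each sentence with True (keep visible) or False (lowlight).

    Hypothetical helper: matches `term` as a whole token, case-sensitively,
    so that the symbol "L" is not confused with the letter "l" inside words.
    """
    pattern = re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)")
    return [(sentence, bool(pattern.search(sentence))) for sentence in sentences]

# Made-up sample text, not taken from any real paper.
sentences = [
    "We define the loss L over all training examples.",
    "The model has two layers.",
    "Minimizing L requires gradient descent.",
]

marked = declutter(sentences, "L")
for sentence, keep in marked:
    prefix = "  " if keep else "~ "  # "~" stands in for lowlighting
    print(prefix + sentence)
```

A real reader would operate on positions in the rendered paper rather than plain strings, and would need to recognize symbols inside equations; the sketch only shows the filtering step.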


“Declutter” helps a reader search for information about a nonce word (a technical term or symbol defined within the paper) by lowlighting all sentences in the paper that do not include that word.


A lab study showed that when scholars used ScholarPhi, they could answer questions about a scientific paper in significantly less time, while viewing less of the paper in order to come to an answer, than with a standard document reader. Scholars found it easier to answer questions about the paper, and were more confident in their answers. They also reported they would use definition tooltips and equation diagrams frequently if they had them in their reading tools.

While the interface has so far been evaluated only in the lab, with manually edited definitions, the team is working to advance the state of the art in automated definition recognition. The team is also working closely with the team at AI2 that develops the Semantic Scholar academic search engine to release an interactive reading application that will eventually make the reading experience available for millions of papers.

The ScholarPhi project appears as a full paper at the ACM CHI Conference on Human Factors in Computing Systems, a premier conference in human-computer interaction, the week of May 10, 2021. To learn more about the project, watch the video presentation for the CHI paper, or try out the online demo of the user interface.

Screenshot from the paper: ScholarPhi helps readers understand nonce words (unique technical terms and symbols) defined within scientific papers. Readers can click nonce words to access definitions for those words in a compact tooltip.

Screenshot from the paper: ScholarPhi generates equation diagrams and overlays them on top of display equations, affixing labels to all symbols and sub-symbols in the equation for which definitions are available.

Screenshot from the paper: ScholarPhi provides efficient, precise mechanisms for selecting mathematical symbols and their sub-symbols through single clicks, rather than error-prone text selections.


Augmenting Scientific Papers with Just-in-Time, Position-Sensitive Definitions of Terms and Symbols
