Laurel Jizba,
Editor
Susan Dumais: Senior Researcher, Adaptive Systems and Interaction Group, Microsoft Research
On September 17, 1999, Susan Dumais, Senior Researcher at Microsoft, gave a panoramic overview of information retrieval and management to a rapt audience at the American Society for Information Science (ASIS) Pacific Northwest Chapter conference. Her forward-looking, fast-paced talk was packed with ideas and illustrated profusely with Microsoft PowerPoint slides. From her perspective, the big picture in information gathering includes significant elements that make the hunt for the “answers” an intricate, entangled, and multifaceted pursuit in a forest of meaning, both on and off the Internet.
According to Susan, who is a mathematician and
psychologist, the elemental issues in managing information can be corralled into
twin, interrelated categories: domain or object modeling and user or task
modeling. Domain and user modeling
complement matching algorithms, a mainstay of information retrieval
research. Matching users’ queries
with documents on the basis of content is obviously important in satisfying the
user need. But in addition, domain
and user modeling can provide important benefits. Key elements of domain
modeling include the inequality of objects, e.g., diversity among fields of
information; inter-object relationships (e.g., linking, text classification);
metadata: its presence, absence or misuse (e.g., spamming, defined as misleading
metadata or other Web content); and the longevity of information resources.
The realm of user/task modeling includes not only issues of demographics,
e.g., differing attributes and frames of reference for users, but also the often
overlooked but highly significant issue of presentation: how digital information
is visually presented to the user. Underlying all are persistent vocabulary
disagreement problems among people, whether the words come from users, authors,
cataloguers, or indexers. The complexity of factors can be daunting.
Fortunately, Susan points out, the Internet comes
with features that assist in matching users and content: a hypertext link
structure, the ability to capture user behavioral patterns, filtering techniques
(including collaborative filtering), a large reservoir of data ripe for
mathematical analysis, automatic categorization and classification, and
many creative tools for interface design. Even so, she reminds
us that information gathering is often a time-consuming, complex task. She knows
that in a traditional library setting there is no substitute for well-trained
librarians in guiding users, and that human-built textual classification
schemes, like the Library of Congress
Subject Headings and the work underlying the Web search service Yahoo!, also
serve well. Still, the Internet is young, with much potential.
A few of the interesting techniques for information
analysis that Susan has studied include: semantic memory and neural networks;
filtering via modeling and intelligent agents; classification using machine
learning techniques; interactive user interfaces; productivity and quality of
worklife issues vis-à-vis technology; and category description and search.
So how did Susan Dumais, a human-computer interaction
expert and an inventor of latent semantic indexing, first become interested in
algorithms and interfaces for improved information retrieval? What
thought-provoking insights does she have about Internet text retrieval and
categorization, search and navigation, or intelligent agents that mimic human
abilities?
In Redmond, WA, on October 11, 1999, Susan gave a two-hour interview for the Journal of Internet Cataloging. In her standard-issue, neutral-toned, nicely appointed Microsoft office, complete with an oversized monitor, a built-in desk neatly spread with stuffed manila file folders, shelves holding neat rows of JASIS back issues, a small assortment of reference books, and a window overlooking a corner of the corporate landscape, she talked about her background, her ideas, and her current work at Microsoft. The interview follows.