NEWS FROM THE FIELD
Gerry McKiernan, Editor
CONFERENCES
4th DUBLIN CORE METADATA WORKSHOP (DC-4)
March 3 - 5, 1997
National Library of Australia
Canberra, Australia
http://www.dstc.edu.au/DC4/
In early March 1997, the National Library of Australia hosted and co-sponsored the fourth Dublin Core workshop. The Dublin Core Metadata Workshop series is an ongoing effort to create and refine a standard for the semantics of a simple description of networked resources, with the aim of improving resource discovery on the Internet. The sixty-five workshop participants included content specialists, librarians, digital library researchers, and Internet networking specialists from twelve countries.
The Dublin Core Metadata Element Set emerged from the 1995 OCLC/NCSA Metadata Workshop, held at OCLC in Dublin, Ohio. The scope of this workshop was limited to identifying the semantics of a core set of descriptors for Web resources that could be thought of as "document-like-objects." A second workshop, co-sponsored by the UK Office for Library and Information Networking (UKOLN) and OCLC, resulted in the Warwick Framework, "a conceptual model for a container architecture for metadata packages of various types." As a result of subsequent extensive listserv discussions, the original 13-element Dublin Core set was increased to 15 elements with slightly modified element names. A reference description of the Dublin Core Metadata Element Set is available at URL: http://purl.org/metadata/dublin_core_elements .
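As a rough illustration of what a 15-element description looks like in practice, the following Python sketch renders a small record as HTML meta tags. The short element names and the "DC." prefix follow a widely used later convention and are assumptions here, not something stated in the workshop report; the function name and the sample values are hypothetical.

```python
from html import escape

# The fifteen Dublin Core element names in their short form (assumed naming;
# the 1997 reference description used some longer labels).
DC_ELEMENTS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)

def dc_meta_tags(record):
    """Render a dict of Dublin Core values as HTML <meta> tags (hypothetical helper)."""
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    lines = []
    for name in DC_ELEMENTS:  # emit elements in a stable order
        if name in record:
            value = escape(record[name], quote=True)
            lines.append(f'<meta name="DC.{name}" content="{value}">')
    return "\n".join(lines)

# Illustrative record only; the values are taken from the notice above.
example = {
    "Title": "4th Dublin Core Metadata Workshop (DC-4)",
    "Date": "1997-03",
    "Identifier": "http://www.dstc.edu.au/DC4/",
}
print(dc_meta_tags(example))
```

Every element is optional in this sketch, which mirrors the "simple description" goal: a record may carry as few or as many of the fifteen elements as the describer can supply.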
The central issues for DC-4 were the formal identification of element structure, extensibility of the Dublin Core to other functional sets of metadata, and clearer definitions of the semantic content of certain elements. The Canberra workshop resulted in further elaboration of the metadata model that underlies the Dublin Core as well as clarification of the qualifiers required to implement this model. The fifth workshop in the series was held October 6-8, 1997 in Helsinki, Finland.
Background information, relevant articles, and workshop reports are available at the following sites:
The Dublin Core Homepage
http://purl.oclc.org/metadata/dublin_core
DC-5: Helsinki Workshop
http://linnea.helsinki.fi/meta/DC5.html
DC-4: The Canberra Workshop
DC-3: CNI/OCLC Image Metadata Workshop
http://purl.oclc.org/metadata/image
DC-2: OCLC/UKOLN Metadata Workshop
http://purl.oclc.org/oclc/rsch/metadataII
DC-1: OCLC/NCSA Metadata Workshop
http://purl.oclc.org/oclc/rsch/metadataI
CLINIC ON LIBRARY APPLICATIONS OF DATA PROCESSING
34th Annual Clinic
"Visualizing Subject Access for 21st Century Information Resources"
March 2-4, 1997
University of Illinois at Urbana-Champaign
Graduate School of Library & Information Science
http://edfu.lis.edu/dpc.index.html
The Beckman Institute for Advanced Science and Technology on the University of Illinois at Urbana-Champaign campus was the site of the 1997 Clinic on Library Applications of Data Processing (DPC'97), which this year focused on current and experimental alternative methods for organizing and retrieving distributed information sources. Among the general and specific questions addressed at DPC'97 were:
* What interface, browsing, and navigation tools are on the drawing board or in prototype systems that may help to improve subject access?
* Do the designers of digital library systems envision a role for more traditional library classification schemes and thesauri? If yes, how will they be made more visual and useful than they are now? If no, how will metadata and full text repositories be accessed and organized and what kinds of tools will provide term suggestion and representation of related concepts?
* What new tools exist to create visual displays of vocabulary choices and term relationships to improve browsing and search negotiation in either collections of full-text information or information surrogate files on the Internet, on CD-ROM, etc.?
* How will the new systems handle the Inter-space where switching vocabularies will be needed to access and search federated and unfederated repositories of full-text information in various languages?
* Have cognitive research and user modeling efforts produced results that could have an impact on subject access tool design?
Among the invited speakers drawn from various research and development communities in the United States and Europe were: Roland Hjerppe, LIBLAB, Department of Computer and Information Science, Linköping University, Sweden; Tamas Doszkocs, National Library of Medicine, Specialized Information Services Division, Bethesda, Maryland; Raya Fidel, Graduate School of Library and Information Science, University of Washington, Seattle; Jessica Milstead, JELEM Company, Brookfield, Connecticut; David Dubin, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign; Nicholas J. Belkin, School of Communication and Library Studies, Rutgers, The State University of New Jersey; Bryce Allen, School of Library and Information Science, University of Missouri-Columbia; Eric H. Johnson, Digital Library Initiative, Grainger Engineering Library Information Center, University of Illinois at Urbana-Champaign; Diane Vizine-Goetz, OCLC Inc., Dublin, Ohio; Richard Greenfield, Library of Congress; Bruce Schatz, Community Systems Laboratory, University of Illinois at Urbana-Champaign; Joseph Busch, Getty Information Institute, Santa Monica, California; and Gerry McKiernan, Curator, CyberStacks(sm), Iowa State University, Ames, Iowa. A hard-bound edition of the conference proceedings is in preparation. Summaries of presented papers and demonstrations were published in the May 1997 issue of Library Hi Tech News.
DPC'97 was co-chaired by Pauline Atherton Cochrane, Research Professor, Graduate School of Library & Information Science, University of Illinois at Urbana-Champaign, and Eric H. Johnson, Research Programmer, Digital Library Initiative, Grainger Engineering Library Information Center, University of Illinois at Urbana-Champaign.
NORTH AMERICAN SERIALS INTEREST GROUP (NASIG)
13th Annual Conference
"Head in the Clouds, Feet on the Ground:
Serials Vision and Common Sense"
June 18-21, 1998
The North American Serials Interest Group (NASIG) will hold its 13th Annual Conference June 18-21, 1998, on the campus of the University of Colorado in Boulder. NASIG's annual conference provides a forum in which librarians, publishers, vendors, educators, binders, systems developers, and other serials specialists discuss matters of current interest and actively seek solutions to common problems. The proceedings are published in both print and electronic formats, with the electronic version made available on NASIGWeb ( http://nasig.ils.unc.edu ) to members.
The NASIG Program Planning Committee has invited proposals for plenary papers and preconferences that deal with both the visionary and practical aspects of the digital serials information age. Among the major themes expected to be explored at the conference are:
* Cataloging and organizing evolving forms of information
* Technological, structural, and cultural issues of Web access
* Access issues and impact of e-journals on user behavior
* Preservation of digital formats for future generations
* Handling and management of the migration of publications to a digital format
* Selection criteria for online information
* Innovative partnerships for information management
PAPERS and ARTICLES
ANNUAL REVIEW OF INFORMATION SCIENCE AND TECHNOLOGY
The 1996 edition of the Annual Review of Information Science and Technology (ARIST) includes a review article by Jeannette Woodward on "Cataloging and Classifying Information Resources on the Internet." While many developments and enhancements have occurred since its publication, this chapter provides a good survey of key projects and proposals that have applied traditional library practices to organize Net resources. In addition, Woodward offers concise summaries of innovative efforts that have extended the underlying theory and philosophy of traditional practices to distributed collections of electronic documents.
Among the well-established efforts to organize the Web described in the ARIST chapter are:
* CATRIONA (Cataloguing and Retrieval of Information Over Networks)
* InterCat (The OCLC Internet Cataloging project)
* Alcuin (one of the early efforts to make use of the MARC 856 field to link cataloging records to original Internet resources)
The review also provides a concise summary of projects that have applied standard library classification schemes for organizing Net resources. Among the reviewed projects are the CyberDewey collection of David Mundie (DDC); CyberStacks(sm) (LCC), the virtual science and technology reference library developed by Gerry McKiernan; and the BUBL Subject Tree, now superseded by BUBL Link, that had originally applied the Universal Decimal Classification (UDC) to organize Net resources. The chapter also includes profiles of early projects that investigated the automatic classification of Internet resources, most notably the Nordic WAIS/World Wide Web Project undertaken by Lund University Library, UB2 and National Technological Library of Denmark.
The chapter also includes a general discussion on metadata and metadata standards, most notably the Text Encoding Initiative (TEI), URC (uniform resource citation) and the Summary Object Interchange Format (SOIF) used by the Harvest system, and the possibility of their conversion to a MARC analog.
ATLANTIC PROVINCES LIBRARY ASSOCIATION
A more detailed and current review of efforts to catalog Internet resources can be found in "Cataloguing Beyond the Walls," a paper prepared by Charley Pennell, Head of the Cataloguing Division of the Queen Elizabeth II Library at the Memorial University of Newfoundland, Canada, for the annual meeting of the Atlantic Provinces Library Association held May 25, 1997, at the university in St. John's.
Where Woodward offers a general summary of the potential value and significance of Internet resources, Pennell provides a detailed review of the types of Internet resources that are candidates for cataloging. Among these are electronic texts and journals, digital images, electronic databases, and multimedia sites. His review and comparison of metadata formats is similarly detailed, with full descriptions or examples of TEI, the Dublin Core, and MARC. Pennell concludes with a general characterization of the inadequacies of current Internet search engines and a profile of the application and integration of Internet resource locators within local as well as commercial integrated library systems. He also offers a good summary of the OCLC Internet Cataloging Project and the auxiliary tools developed or sponsored by OCLC to facilitate the cataloging of Net resources at the local level.
The paper includes an excellent bibliography of key publications as well as an appendix of "resources for curious catalogers," a compendium of significant relevant conferences, electronic journals, electronic texts, electronic images, MARC and other metadata standards, and MARC and non-MARC "solutions for resource discovery." The full text of "Cataloguing Beyond the Walls" as well as its associated bibliography and appendix are available at:
http://www.mun.ca/library/cat/catnet/index.html .
The site includes hotlinks to all appendix resources, as well as to Pennell's own Cataloguer's Toolbox, a collection of resources used at Memorial University for cataloging Web and non-Web resources.
SOFTWARE
ROADS v1.00
In early August 1997, after an extensive period of beta testing, the ROADS project announced the release of ROADS version 1.00. Prior to its official release, ROADS v1.00 had been running successfully on the Social Science Information Gateway (SOSIG) service ( http://sosig.ac.uk ) for several weeks. The ROADS software provides a "tool-kit" for supporting subject-based information gateways, offering a Web-based template editor and administration center, browsable subject lists using multiple subject descriptors, a CGI-based search front-end to a WHOIS++ leaf node server, and highly configurable HTML output. All the scripts are written in Perl 5, which gives users the ability to modify and extend the ROADS software to meet their own needs.
ROADS (Resource Organization And Discovery in Subject-based Services) is one of five dozen specialized projects funded by the eLib Program of the Joint Information Systems Committee of the UK Higher Education Funding Councils. Its overall objective is to design and implement a "user-oriented" resource discovery system. A compilation of Frequently Asked Questions (FAQs) about ROADS, a collection of papers and reports on the ROADS project, and copies of the ROADS v1.00 software and its associated documentation are available from the ROADS homepage at:
http://ukoln.bath.ac.uk/roads/
PROJECTS
INTERNET ARCHIVE
BREWSTER KAHLE
In an effort to provide future access to Internet resources, Brewster Kahle, inventor of the Wide Area Information Servers (WAIS) system, has founded the Internet Archive. The goal of the Internet Archive is to collect all publicly accessible World Wide Web pages, the Gopher hierarchy, the Netnews bulletin boards, and downloadable software, and to store and serve them on request.
Crucial to archiving the Internet, and digital libraries in general, is the cost-effective storage of terabytes of data while still allowing timely access. One terabyte is equivalent to 1000 gigabytes of data; one gigabyte is equal to 1000 megabytes. Kahle believes that the ongoing and expected reduction in the cost of digital storage technologies will enable his organization to preserve the Net even as the number of distributed electronic resources continues to increase over the coming years.
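The unit relationships above can be checked with a few lines of Python. This is a minimal sketch using the decimal units given in the text; the function names and the price per gigabyte are hypothetical placeholders, not figures from the Internet Archive.

```python
# Decimal storage units as stated in the text:
# 1 terabyte = 1000 gigabytes, 1 gigabyte = 1000 megabytes.
MB_PER_GB = 1000
GB_PER_TB = 1000

def terabytes_to_megabytes(terabytes):
    """Convert terabytes to megabytes using the decimal units above."""
    return terabytes * GB_PER_TB * MB_PER_GB

def storage_cost(terabytes, dollars_per_gigabyte):
    """Cost of storing a collection at an assumed price per gigabyte."""
    return terabytes * GB_PER_TB * dollars_per_gigabyte

one_tb_in_mb = terabytes_to_megabytes(1)
print(one_tb_in_mb)

# Hypothetical price, chosen only to show how falling unit costs scale.
print(storage_cost(2, 0.5))
```

Because the cost grows linearly with collection size, any drop in the price per gigabyte translates directly into a proportionally larger archive for the same budget.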
Kahle believes that, as a comprehensive digital repository, the Internet Archive can provide services similar to those that have been provided historically by libraries and institutional archives, notably access to an "official copy of record" for historical study or legal use. The project site provides access to a variety of text and audio news items about the project as well as a link to Web Archive 96, a project of the Internet Archive to collect and store the 1996 US Presidential Election materials on the Internet.
SCORPION PROJECT
Scorpion is a research project at OCLC that is exploring the combination of indexing and cataloging of electronic resources, based on the observation that these are complementary activities. Since subject information is key to advanced retrieval, browsing, and clustering, the primary focus of Scorpion is the building of tools for automatic subject recognition based on well-known classification schemes such as the Dewey Decimal system. A thesis of Scorpion is that Dewey can be used to perform automatic classification of an item and denote relevant subject headings.
To assign subject codes to an electronic document using Scorpion, the document is issued as a query against a Dewey Decimal system database using ranked retrieval. The results of the search are then treated as the subjects of the document. Through such applications, Scorpion holds the potential of reducing the cost of cataloging by presenting potential subjects to a human cataloger for consideration and assignment. Additional information explaining the Scorpion tools and experimental results as well as key reports and associated documentation are available at URL:
Access to a version of the Scorpion system is available upon request. Users should contact Keith Shafer ( shafer@oclc.org ), Senior Research Scientist, Office of Research and Special Projects, OCLC Online Computer Library Center, to obtain a password and ID.
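The idea of treating a document as a query against a classification database can be sketched in a few lines of Python. This is not the Scorpion implementation: the scoring is a plain term-overlap count standing in for whatever ranked retrieval OCLC uses, the class database is a four-entry toy, and the keyword lists attached to each class number are invented for illustration.

```python
import re
from collections import Counter

# Toy stand-in for a Dewey database: class number -> descriptive text.
# The keyword lists are invented for this example, not taken from Dewey.
TOY_DEWEY = {
    "020": "library information science cataloging classification indexing",
    "510": "mathematics algebra geometry numbers analysis",
    "610": "medicine health disease clinical patients",
    "780": "music composers instruments songs performance",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def rank_classes(document, database=TOY_DEWEY, top_n=3):
    """Issue the document as a query; return (class, score) pairs, best first."""
    query = Counter(tokenize(document))
    scores = []
    for class_number, description in database.items():
        terms = set(tokenize(description))
        # Score = number of word occurrences in the document that match the class text.
        score = sum(count for word, count in query.items() if word in terms)
        if score > 0:
            scores.append((class_number, score))
    # Highest score first; ties broken by class number for a stable order.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]

doc = "Cataloging and classification of library resources, with notes on indexing."
print(rank_classes(doc))  # [('020', 4)]
```

In a workflow like the one described above, the ranked list would be shown to a human cataloger as candidate subjects rather than assigned automatically.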
SCOUT REPORT SIGNPOST
INTERNET SCOUT PROJECT
UNIVERSITY OF WISCONSIN-MADISON, COMPUTER SCIENCES DEPARTMENT
The Scout Report Signpost is a searchable and organized collection of more than 2,400 critical summaries of carefully selected Internet sites and mailing lists identified over the past three years by the Internet Scout Project. The summaries prepared for the project seek to provide an overall analysis of the selected Net resources, including general content, attribution, currency, availability, accessibility, and presentation. Based at the University of Wisconsin-Madison, the Internet Scout Project aims to provide improved resource discovery services to the higher education community. Internet Scout is funded by the National Science Foundation and is a project of the InterNIC.
The Signpost service offers three primary methods of organization, allowing users to:
* Search through Quick and Advanced interfaces
* Browse content by Library of Congress Subject Headings, and
* Browse content by Library of Congress Classification
A Quick Search function allows for free-text searching of all of the Scout Report summaries. Each abbreviated summary contains a direct link to the resource and a link to the Scout Report brief description. An Advanced Search feature allows for fielded text searching of cataloged Scout Report summaries, which currently number over 800 titles. Advanced Search allows for searching of title, author, publisher, resource type, Library of Congress Subject Headings, Library of Congress Classification, language, and resource location for cataloged resources.
A Browse by Library of Congress Subject Headings function allows users to browse lists of alphabetically arranged subject headings for cataloged Internet resources. A display will show the selected subject heading, associated resource titles, and their respective URLs. There is also a link to an associated Signpost description and the original Scout Report summary.
The Browse by Library of Congress Classification feature allows users to browse cataloged resources by discipline according to the Library of Congress class codes which have been assigned to each resource. When a class code is selected, the display shows a list of resource titles in that subject area, along with the URL for each and a portion of the original Scout Report summary. Many resources have been assigned two different class codes to facilitate access to other major aspects of a resource. Unlike a standard classification assignment, resources cataloged in Signpost have only the lettered class codes of the LC Classification; no class number is included.
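A letters-only browse index of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of how Signpost is built: the helper names, the sample titles, and the idea of deriving the letters by stripping the numeric part of a fuller class code are all hypothetical.

```python
import re
from collections import defaultdict

def class_letters(lc_code):
    """Return only the leading letters of an LC class code, e.g. 'QA76.9' -> 'QA'."""
    match = re.match(r"[A-Za-z]{1,3}", lc_code.strip())
    if not match:
        raise ValueError(f"no class letters in {lc_code!r}")
    return match.group(0).upper()

def build_browse_index(resources):
    """Map each lettered class code to the titles filed under it.

    `resources` is a list of (title, [class codes]) pairs; a resource with
    two class codes appears under both, as described in the text.
    """
    index = defaultdict(list)
    for title, codes in resources:
        for code in codes:
            letters = class_letters(code)
            if title not in index[letters]:
                index[letters].append(title)
    return dict(index)

# Hypothetical sample records.
sample = [
    ("Example computing site", ["QA76.9", "Z699"]),
    ("Example mathematics site", ["QA1"]),
]
print(build_browse_index(sample))
```

Dropping the class number keeps the browse lists short and discipline-level, at the cost of the finer shelf-order arrangement a full class number would provide.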
SUBJECT SPECIFIC SCOUT REPORTS
INTERNET SCOUT PROJECT
UNIVERSITY OF WISCONSIN-MADISON, COMPUTER SCIENCES DEPARTMENT
http://scout.cs.wisc.edu/scout/report/index.html
http://rs.internic.net/scout/report/index.html
The Internet Scout Project has announced the availability of Subject Specific Scout Reports. The goals of the Subject Specific Scout Reports are:
* To expand the Internet Scout Project's existing current awareness services by providing the higher education community with information that is specific to a given field of study
* To continue research into the cataloging and archiving of electronic resources, with expansion of the content of the Scout Report Signpost to include highly specialized resources
* To investigate and implement new subject-based delivery techniques on the Internet using the subject specific resources archived and cataloged in Signpost
Subject Specific Scout Reports for Science & Engineering, Social Science, and Business & Economics were made available in September 1997. The reports are compiled by librarians and content specialists and published every two weeks via e-mail and on the Internet Scout Web site. For further information, contact Susan Calcari, Project Director and Managing Editor ( scal@cs.wisc.edu ), or Jack Solock, Editor, Scout Report and Subject Specific Scout Reports ( jacks@cs.wisc.edu ).
REPORTS
"The Role of Classification Schemes in Internet Resource
Description and Discovery"
Traugott Koch and Michael Day, Principal Authors
DESIRE - RE 1004
February 1997
http://www.ukoln.ac.uk/metadata/DESIRE/classification
A report on the role of library classification schemes in aiding information retrieval within a network environment was one of three studies published this spring under the auspices of the Metadata Group of UKOLN, the United Kingdom Office for Library and Information Networking based at the University of Bath, and DESIRE, a European Community-funded telematics research program.
This comprehensive review provides a detailed analysis of the issues surrounding the application of different types of classification systems to enhance the identification and use of Internet resources. Beyond Bookmarks: Schemes for Organizing the Web, the Web-based clearinghouse of sites that have applied standard and non-standard library classification schemes for organizing Web resources maintained by Gerry McKiernan, served as a core resource for this study. The Dewey Decimal Classification (DDC), Universal Decimal Classification (UDC), Library of Congress Classification (LCC), Engineering Information (Ei), Mathematics Subject Classification, and ACM Computing Classification System (1991) are among the major and specialized systems examined in the report.
Access to this report and others prepared under Work Package 3 (WP3) of the Telematics for Research project within DESIRE is available at URL:
http://www.ukoln.ac.uk/metadata/publications.html
In addition to providing access to other specialized project reports, the site offers access to several reviews and articles devoted to metadata formats.