Reflections on Gutenberg, the Internet and the Need for a (Paper!)
Journal on Internet Organization and Access
By Roger Brisson and Ruth Carter
In reflecting on the historical significance of the current digital revolution introduced by computer technology, it has become fashionable among information professionals to draw parallels with the development of printing during the Age of Gutenberg. Still others, with perhaps a greater flair for the dramatic, compare the current situation with that of the Wild West, a time when distinguishing between dynamic progress and anarchy proved a challenge for even the most astute saddle-bound social commentators. As historical metaphors both are appropriate in their own way, with each highlighting important aspects of the information revolution. Most of us still remember Gopher; after all, it reached its heyday only a couple of years ago. But like the Wild West towns that became small cities virtually overnight and then proceeded to disappear a short time later, Gopher sprang up only to be swept away by the greater allure of the World Wide Web. Gopher sites still exist and are used, but they too have taken on the patina of the ghost towns dotting the West today.
While the image of the Wild West does capture our more emotional sentiments regarding current developments in information technology, drawing comparisons with the Age of Gutenberg is- at least for our profession- doubtlessly a more fruitful exercise. The middle decades of the 15th century represent a period in which a proliferation of ideas clustered around specific technologies. These included issues like the design characteristics of movable type, the make-up of paper and ink, and the means of efficiently impressing letters to paper. In like manner the first years of the 1990s will also be remembered as a time of rapid technological experimentation and innovation. The technology of the 15th century is of course very different from that of today, but the two ages nonetheless share the fundamental quality of being driven by a common goal: rapid advances in technology spurred or are spurring an unprecedented period of research and development, dedicated to finding the means for radically improving the ability to share information and knowledge.
Historically we now know that a number of factors, both technological and social, converged in the workshops of inventors like Johann Gutenberg for printing with movable type to establish itself. Perhaps the single most important influence socially was the need for reproducing texts more quickly and cost-effectively, a factor that is very much present in motivating current research in digital text. At the dawn of early modern Europe this demand could be felt in places like the expanding universities, where students struggled on their shoestring budgets to purchase expensive manuscripts.
Over the past couple of decades we have witnessed an explosion in the quantity of materials published, both in terms of the numbers of distinctive works produced, and in terms of the types of formats used for representing these works. More recently we- library professionals, information specialists, and the book-buying public- have also been confronted with the skyrocketing costs of books, serials, and other materials. Finally, libraries have become painfully aware that the costs of creating effective access to materials in the age of online catalogs and digital text make up a significant portion of their budgets, with many libraries footing seven-digit bills annually for the upkeep of their technical services operations.
All of these factors, like those that drove the codex out of the scriptoriums of 15th century Europe, have led to our pursuing what, for a large number of people, seemed a completely unrealistic fantasy only a few years ago. The idea of a vast, interconnected digital library has been in circulation for some time; indeed, many trace its modern formulation to Vannevar Bush's classic paper, written in the 1930s and published in 1945, describing his hypertext-like Memex system. Research has been taking place in laboratories for several decades now, but it is only recently, with powerful desktop computers and the Internet, that a number of independent factors have come together in creating the World Wide Web. Up until the 1990s the technology was simply too primitive to realize, in a practical sense, the ideas that had driven the research of the past 40 years. With each incremental innovation, much like putting together the pieces of a puzzle, the intensity of the research has also increased correspondingly, until in the late 80s and early 90s the numerous ideas circulating crystallized in places like the CERN labs in Switzerland, where Tim Berners-Lee not only 'proposed,' but actually developed the working prototype that a short time afterward would become the image- and text-laden World Wide Web.
One can only surmise how the printing revolution would have progressed if Gutenberg's contemporaries had possessed the scholarly journal as the means for disseminating their discoveries and innovations. As it was, printers and their apprentices took a much more corporeal approach to sharing ideas: they traveled to one another's workshops on foot and picked up new methods by observation and discussion. Such a sharing of ideas was possible in a simpler world with a simpler technology; even if we disregard the complexity of today's technology, such a mode of dissemination as word of mouth would stretch the development cycle for the Web to decades instead of the months or years that we currently consider the norm. At the same time, it is interesting to note the area where the dissemination of ideas hasn't changed a great deal: by simple imitation and copying of one another's innovations. The beautiful roman, italic, and Greek typefaces created by the Venetian printer Aldus Manutius at the end of the 15th century were widely copied and quickly became a standard in 16th century printing. In like fashion aspiring Web authors visit one another's homepages with a sharp eye to Web layout and design.
Today computer and communications technology is developing at a breathtaking pace, and even with our numerous journals (and more recently the Web itself) information professionals have had their hands full attempting to keep up with the latest innovations. These words are being written using a five-pound laptop computer that in many ways possesses more power than mainframe computers of only ten years ago. This laptop is very mobile, reflected in the fact that these words are being written in a hammock in a garden with the laptop connected to a cellular telephone. With such a set-up this computer possesses full access, via the Internet and cellular radio, to information resources available internationally, from the Britannica Online, RLIN, the latest research at NASA, on through to the complete works of Shakespeare. E-mail can be sent to colleagues around the world; it is possible to participate in group discussions via listservs and to manipulate data on several Web sites (whose changes can be read instantly by anyone accessing the Web); one could even, heaven forbid, catalog books for one's library. Only a few years ago such possibilities would still have belonged to the world of science fiction! As much as it appeals to our sense of rugged individualism, the development of digital networks and computer technology, which has made possible such a scenario as being able to 'publish' a text internationally from the comfort of one's garden hammock, is not the product of one or two engineers' inventions. In retrospect, there is perhaps a romantic allure in imagining a solitary Gutenberg, working long hours in his workshop, bringing forth his printing press for the world to use after years of experimentation.
Who knows, it is possible that in a couple of centuries people will also imagine the digital revolution as arising from the solitary work of, say, a Steven Jobs or a Bill Gates, laboring away in their respective parents' garage in Northern California or Seattle to bring forth the Computer Age. Like the more astute historians of the Age of Printing, however, we know that the advances being made today are the result of the efforts of thousands of people experimenting, researching, and sharing their ideas using a complex system of scholarly communication.
In the Age of Gutenberg it did take a relatively short time for technology to mature to a state that would remain fairly constant until the first decades of the 19th century; the timeline was much longer, however, for publishing houses, the social institutions that would make effective use of printing press technology. It took another century, up to the founding of the Elsevier dynasty in the 1590s, for the modern system of publishing to develop into a form that we can recognize today. This pattern of rapid development in the technological infrastructure, followed by a longer period of learning to work with the technology and to develop social institutions to utilize it effectively, will in all likelihood be repeated in our era. If so, we can look forward to many years of Internet research and innovation.
As with other fields involved in the development of the new information infrastructure, librarians and other information professionals are actively researching ways in which the Internet can serve the needs of users. At recent conferences and other meetings librarians have, with increasing intensity, voiced the need for a forum to share the results of scholarly research on the Internet. Such a forum would also provide an opportunity to report on innovative Web applications, which would allow a more efficient means of sharing interesting ideas. During the first couple of years, authors felt that the traditional library journals were not appropriate organs for publishing Internet-related research, so they made their articles and other reports available on the Web itself. For some this wasn't considered a problem at all; they were in fact anticipating what they considered the inevitable move to publishing electronically on the Web. These 'publications' were linked to their personal homepages, or were attached to the Web site of an author's home institution. It was assumed that most people interested in Internet research would have access to the Web anyway, and this kind of 'self-publishing' could take advantage of the perceived strengths of making information available on the Internet. Since article distribution was almost instantaneous, that is, a Web page could be made available internationally as soon as an author loaded it on a Web server, the traditional publishing time of several months to even years was virtually eliminated. The Web also offered an opportunity to rebel against what some believed was an entrenched professional establishment; by circumventing peer review, innovative, unfamiliar, or even controversial ideas could be made available to colleagues. The Web, according to this view, represents a veritable revolution in scholarly communication, and would obviate the need for publishing information in traditional ways.
This view considers the journal format of communication as passé, as nothing more than a relic of print culture.
Few would deny that making the results of one's research available via the Web has been of great value to those working with Internet-related topics. Because of the rapid progress in creating information-related services on the Internet, professionals have been able to reap the benefits of rapidly disseminating the results of their research. This has certainly influenced the speed with which ideas are spread from institution to institution, and it may even be helping to motivate the publishers of paper-based scholarly journals to find ways of shortening the timeline to publication. Haworth Press, in recognizing the need to get scholarly research out to information professionals as quickly as possible, has committed to a publishing cycle for the Journal of Internet Cataloging that would have been thought impossible only a few years ago.
Journal publishers are indeed taking a serious look at the Internet as a platform for publishing their journals. With a few years of experience in using the Internet, however, we are now becoming painfully aware of some serious shortcomings. One of its great strengths, as a highly dynamic environment for text and image manipulation, is also one of its most serious weaknesses, at least as far as librarians and publishers are concerned. Just as rapidly as a Web site can be created, at the whim of its creator and the touch of a key it can also disappear. Books and journals are relatively stable, self-contained media for the preservation of text. Yes, librarians are very much aware of the serious preservation issues surrounding books, but they are nothing in comparison to the infinitely more complex issues surrounding the ongoing maintenance of electronic data-carriers (not to mention the infrastructure of high-speed computer networks) as preservers of the written word. We can gain a sense of the gravity of this problem if we look at the above-mentioned case of information professionals making their own articles and other material available via their personal or institutional homepages. How many of us have pointed to a given URL or returned to an author's site after a few months, only to discover that the URL is no longer valid? The author may simply have removed the article from the Web, but more likely- in this early phase of Web development- the author's home institution has upgraded and changed the address structure of the Web server. The 'dead-link' syndrome is a sensitive one for Web advocates, since it is certainly an indication that much more work needs to be done for the Web to develop into a reliable publishing platform. At the same time, what should we expect from these pioneering self-publishers? 
Since they have taken on the task of publishing their own work, should we come to expect that they will also take on other functions, like maintaining a permanent archival copy of their publications? In other words, should they be expected to take on all of the traditional responsibilities of libraries in terms of organization and access? It is understandable that personal homepages would be among the most volatile on the Web, since individuals will continue to tinker with their Web pages as they develop their Web authoring skills. Few of us, in any case, would expect that individuals should feel obligated to maintain an archival site of their personal articles in perpetuity. If authors do not archive their publications, who will? If no one archives this material, have we not wasted the permanent results of their research?
In the end, it has become fairly clear to all concerned with the Internet that an effective, reliable, and more permanent means for sharing the results of scholarly research on Internet-related topics is necessary. Listservs are extremely useful for informal exchanges like sharing tips and other advice; the Web has proven its worth for the rapid dissemination of papers and other informal 'reports from the field,' but there is still a glaring need for a platform to publish research contributions of enduring value. As a carrier for such research the paper-based journal is still the most effective way to document significant contributions to the field.
For these reasons, it did not take the co-editors of the Journal of Internet Cataloging long to agree to launch a new professional journal on the topic of Internet access and organization. The World Wide Web was just a couple of years old when discussions on a new journal began, but with its explosive growth and the resources that libraries and other institutions were investing in its use, it was already clear that a journal dedicated to Internet organization would fill an important need. It was inevitable that coinciding with this growth would be the appearance of reports on practical applications and more serious research studies. Authors were making their articles available in a wide variety of ways, from the above-mentioned personal Web page on through to traditional journals in librarianship and information science. The need for formalizing research in Internet organization and access through the existence of its own journal was becoming increasingly apparent.
Because of the concerns outlined above, we decided to begin the journal in paper format, and to seriously pursue incorporating the Internet in some way as a means of issuing the journal in the future. Indeed, this has already begun, and the homepage for the Journal of Internet Cataloging,
http://www.libraries.psu.edu/iasweb/personal/rob/jic/jic.htm
is currently being used to publish abstracts of upcoming issues, general information on the journal, instructions to authors, and the like.
Starting a new professional journal is always a challenge requiring a healthy measure of simple 'rolling-up-the-sleeves' labor by both the journal's editors and its publisher. The excitement surrounding its creation is palpable for everyone involved, as well as, or perhaps because of, the element of uncertainty in its success. In the case of the Journal of Internet Cataloging, the excitement and risk are both heightened in that the subject matter of the journal is almost as new as the journal itself. In many ways the journal's success will parallel closely the success of the Internet. While most professionals enthusiastically supported a professional journal on organizing Internet resources, the editors have also heard from more hesitant, perhaps skeptical, individuals who have asked, 'another professional journal, why?' Not wishing to enter into an undertaking without strong support, the editors took this question seriously. While those who voiced an enthusiastic affirmation regarding the journal far outnumbered those who doubted the need for a journal so specialized, the editors felt it was important to address the question of the need for the journal.
Because of the volatile nature of the Internet, it would at first glance seem a risky undertaking to start a journal on Internet organization and access. What would be the focal issues for such a journal, what areas of Internet research would it closely follow? With the Internet developing so rapidly, isn't it likely that research would become quickly dated and hence irrelevant? Indeed, isn't it possible that the Internet itself could transform into something else, much like how Gopher was swept away by the Web, and thus leave an 'orphaned' journal devoted to something that no longer exists? For many it would thus seem too early in the history of this relatively new medium to begin publishing a journal dedicated to Internet research and applications. Those holding this view tend to believe that existing scholarly organs can readily assimilate the research focusing on the Internet.
Journal publishers are now regularly accepting what has become a steady flow of articles on Internet organization and access, but these articles must find a place alongside the myriad of other topics that are represented in these journals. Possessing its own forum for formal scholarly research would contribute to both enhancing and structuring research on Internet organization and access. The editors have eagerly moved forward with a journal on Internet cataloging precisely because a focused, professional journal during the early phases of a burgeoning field can be very useful strategically. It is in this early period that established means of scholarly communication can play an active, formative role in the development of a field.
Another concern has come from individuals who believe that a print journal runs totally against the spirit of the new age of networked information, and as has already been noted above, some even question the need for the journal form itself as a means for structuring and disseminating research. If there is going to be a journal on Internet organization, it should be published on the Internet, not as a paper-based journal. Having an Internet-based journal would provide easy access for researchers active in the field, and it would allow authors to work fully in that environment, making it possible to link to other, related sites. Features of the Internet could also be demonstrated live, instead of just being described, as would be the case in the paper format.
Thus far, however, the trend appears to be to retain the journal form as a structuring principle even on the Web itself. In fact, both commercial and academic publishers have recognized the advantages in increased exposure by publishing their magazines or journals in both print and Internet-based versions. Indeed, even Yahoo! has found it expedient to publish a paper-based spin-off of its Web site entitled Yahoo! Internet Life (published by Ziff-Davis). In addition to the advantages gained in increased exposure, publishers are aware that there are still many issues to be resolved before the Internet can be considered a reliable medium for publishing information. We have touched on some of these already, such as the archiving potential of a publishing medium (should publishers now be asked to maintain large storage servers for preserving their electronic journals?), and the practical question of who we can expect to actually possess access to the Web. Another important issue is to what degree the reading public is prepared, or willing, to read full-length articles or whole journal issues from a computer screen. A number of technical details still need to be addressed, such as how to provide access to journal issues (should they be sent to subscribers as in paper-based publishing, or should a single copy be made available via a Web server?). How will money transactions like subscription billing be carried out? Questions like these are relatively minor, however, and a number of creative solutions are currently being worked out by publishers.
Another issue that must be taken seriously in moving to the Web for journal publishing is professional acceptance of the new medium. One would think that librarianship and information science- professions on the cutting edge of the information revolution- would quickly accept a Web-based professional journal. While this may very well be true, experiments with Web-based publishing in other fields have shown that there is still strong resistance to the seeming ephemerality of a 'virtual' journal. While the mechanisms for producing a polished journal appear to migrate well to the electronic environment- a chief editor in charge of the administration of the publication, an editorial board, critical peer review, and the like- a strong undercurrent that questions the credibility of a Web-based journal is still very much present. Nonetheless, the advantages gained by moving to the Web are compelling, and the question is not whether we will see professional journals in our field on the Web, but rather a question of when and how.
The editors and the publisher spent much time considering an appropriate title for the journal, and the circumstances surrounding the decision were compounded by the rapidly evolving nature of the subject-matter itself, the Internet. Should the focus be on issues of organization and access only on the Internet, or should it encompass all digital information? Just as weighty a consideration was how we wished to define the scope of the journal: what specifically did we mean by organization and access, by 'cataloging?' Deciding to focus on the Internet came early and was a rather easy decision to make: while Internet-based material shares the same digital nature as other electronic media like CD-ROM or laser disc, the organization and access issues with the Internet are clearly of a very different nature than the other digital media. In any case, after much discussion the pool of possible titles was narrowed to two candidates: the 'Journal of Internet Cataloging,' and the 'Journal of Internet Organization and Access.' Behind what appeared to be an innocent enough choice between two equally appropriate titles, there were nuances whose implications were clear enough. The latter title, using a neutral language, would address a wide range of information professionals involved in the development of Internet services. On the other hand, a title signifying a traditional library activity, cataloging, would draw attention to a primary interest group and 'organizing' activity.
Behind the seemingly innocent choice of words here, however, lies a deeper meaning, one that will certainly come to play an important role in the ongoing development of the Internet. Cataloging implies a directed human activity that requires conscious intellectual effort; representations, whether a MARC record or a TEI header, are created by human effort before becoming a searchable entity in a database. The purpose of creating representations with human intervention is to enhance a record with controlled headings and descriptive fields that provide consistency in the record, and to reflect the intellectual content of an item. In contrast the terms 'Internet organization and access' not only refer to a broader range of activity, they also encompass the work of computer programmers and other specialists who have been developing the automated search engines- 'spiders', 'crawlers', and 'worms'- like those at AltaVista and Open Text for the Internet. It would seem to come down to the pointed question: is directed intellectual effort necessary in organizing data on the Internet, or can sophisticated search engines adequately satisfy users' needs in finding the information they need? Two very different cultural traditions are involved here: on the one hand that of librarianship, possessing a long tradition of working closely with patrons and understanding their needs in providing effective access to information; on the other hand there is the fresh, new field of computer science, armed with powerful technology and innovative ideas.
While the interplay of these two traditions has played an important role in the development of information science over the past three decades, it is grossly misleading to characterize the history of library automation strictly in terms of the healthy tension between the two traditions. They have also worked closely together in developing online catalogs and other databases. In fact, the mingling of the two is what we have come to know as the body of knowledge concerned with information retrieval. One could characterize a middle group between librarians and computer scientists- information scientists- who have made information retrieval the focus of their research. Computer scientists, information scientists, and librarians have cooperated- and will undoubtedly continue to productively cooperate- in providing users with effective access to materials.
In the end, the words chosen for the full title of the journal betray much about the past development of information retrieval: Journal of Internet Cataloging: The International Quarterly of Digital Organization, Classification, & Access. As a new medium, the Internet will, more than ever, bring together the work of librarians and computer scientists. Because of its all-encompassing, inchoate nature, it will challenge information scientists to utilize the best in both traditions to provide effective access for users. There is little question, however, that the standard of quality to be followed will be found squarely within the rich tradition of librarianship, and hence the strong affirmation of cataloging in the journal's title. As computer scientists continue to develop increasingly sophisticated software for automating the organization of and access to Internet data, librarians will play a key role throughout the process, from providing specifications of the software's capabilities, on through to creating intuitive interfaces. Much of the work of the librarian will be to continue to orchestrate the provision of intellectual enhancements to cataloging records and to database organization. At the same time, the tendency of software development, as it continues to gain in sophistication, will be to automate a good deal of what we still consider intellectual activity. Ultimately, the successful synergy between the two groups will be a prerequisite to the Internet's ultimate success. A couple of years ago, just as the World Wide Web was making its presence felt, the University of Pennsylvania classicist James O'Donnell asked the pointed question of what the 'virtual library' of the future should contain. He went on to note, "Just to ask the question makes it suddenly obvious that one of the most valuable functions of the traditional library has been not its inclusivity but its exclusivity, its discerning judgment that keeps out as many things as it keeps in. 
In an information waterfall, the virtual library that tells us everything and sweeps us off our feet with a storm of data will not be highly prized. The librarian will have to be a more active participant in staving off infochaos. Whether the existing publishing or library communities will supply these pioneers, or whether they will come from some other sector of the information society is the radically open question of our time for all who care about words and how they affect people."
In affirming the importance of the tradition of cataloging, the journal will at the same time publish research from a wide variety of approaches to organizing Internet materials. Through its editorials and the scope of its articles, it will serve as a reminder that our goal as information professionals is not to preserve any particular tradition for tradition's sake, but rather- to borrow the words of Malcolm X- to strive toward effective organization and access by any means necessary. As the quote by O'Donnell makes clear, the realization of the vast digital library that the Internet promises to become will be of little use without also possessing a functional means for effectively gaining access to its materials. Once again Yahoo! can be taken as a good example of the mingling of the strengths found in both traditions, of library science and computer technology: Yahoo! receives daily requests to add sites to its highly structured catalog via e-mail from owners of homepages, or sites are sought out by its own Web spider. The site descriptions and their URLs are then taken by some 20 human classifiers who find a place for them in Yahoo!'s classification scheme. The usefulness of this approach is easily demonstrated by the roughly one million visits the site receives daily.
This, the first issue of the Journal of Internet Cataloging, contains the wide variety of contributions the editors would like to consider characteristic of the journal's scope. It will take the 'international' in its full title seriously, since the Internet is by its nature an international medium. As Beall's article demonstrates, it will also actively solicit research that directly pertains to public service issues, an understandable approach when one considers the active role that public service librarians themselves can play in the organization of Internet materials. We, the editors, hope the interesting variety of contributions in this inaugural issue is a harbinger of the future of both the journal and its chosen subject of focus.