
PREFACE

Browsing an index display can often be more effective for the user than repeatedly querying a computer system. Moreover, computer hardware now has the speed needed to use full-screen online index displays efficiently. Thus, future designers of information systems will need to give attention to how different types of index display can be generated. In particular, an understanding of string indexing systems will be essential. Yet, although much has been published on specific string indexing systems and on specific types of string indexing system, no general book on this topic has been available.

The audience that I anticipate for this first general book on string indexing is varied. My aims include both making a contribution to the theory of index displays and providing practical information on existing string indexing systems and on criteria for choosing a string indexing system.

In an academic setting, the book might serve as a primary textbook in a special course on string indexing, or as a secondary text for courses on information retrieval, indexing, database management, or computer display design. More generally, it should be a useful reference work both for theoreticians and for anyone considering acquiring or designing an index display system.

People may wish to approach the book equipped with software or documentation, or both, for a specific string indexing system. Software for a version of the NEPHIS system, for example, is available for IBM-PC-compatible microcomputers; and more than one manual is available on the PRECIS system. Alternatively, readers may prefer to design their own string indexing system. In either case, working out examples for oneself should increase the benefits that one derives from reading the text.

To keep the subject matter within bounds, I have restricted string indexing systems to those in which not only are index entries produced according to regular syntactic rules, but multiple overlapping entries are also normally produced for an indexed item. Thus, chain indexing, for example, will only be noted as a relative of string indexing. I have also concentrated on more recent developments, with the result that not every early variant of KWIC is discussed and that the first version of PRECIS is omitted. Another deliberate limitation has been the summary treatment of the many systems devised for producing library catalogue displays, whether on cards or on other media.

My approach has been to look at general principles and features rather than to devote chapters to individual string indexing systems. Brief introductions to the major string indexing systems, as well as to some near string indexing systems, will be given in Chapter 2, and examples from the various systems will be cited throughout; but readers should pursue the References for more complete descriptions. The case study in Appendix C, together with the brief manual in Appendix D, provides some direction on how one might go about producing a string index by means of the NEPHIS system.

The middle part of the book has been organized around the basic processes of string index production: input of data, with more or less human intervention and with various kinds of assistance to the indexer; generation of index strings from the input according to the syntactic rules; generation of other index elements, such as cross-references; sorting of the index elements; and, finally, display of the resulting index. The last chapters will deal with: selection and evaluation, designing an original string indexing system, a specific case of using a string indexing system to create a printed index, and the future of string indexing.
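As a rough illustration of these stages, the following Python sketch runs a toy index through input, string generation, sorting, and display. It uses a simplified KWIC-style rotation of title words, chosen only because it is easy to show; it does not reproduce the syntactic rules of NEPHIS, PRECIS, or any other system discussed in this book, and it omits cross-reference generation. The stoplist, the "/" separator, the locator format, and all function names are assumptions made for the example.

```python
from __future__ import annotations

from typing import NamedTuple

# Assumed stoplist: words that never lead an entry.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


class Entry(NamedTuple):
    lead: str      # word under which the entry is filed
    context: str   # rotated string shown in the display
    locator: str   # reference to the indexed item


def generate_entries(title: str, locator: str) -> list[Entry]:
    """Generate one entry per significant word by rotating the title."""
    words = title.split()
    entries = []
    for i, word in enumerate(words):
        if word.lower() in STOPWORDS:
            continue
        # Words before the lead word wrap around after a "/" separator.
        rotated = words[i:] + (["/"] + words[:i] if i else [])
        entries.append(Entry(word.lower(), " ".join(rotated), locator))
    return entries


def build_index(items: dict[str, str]) -> list[Entry]:
    """Input (locator -> title), then string generation, then sorting."""
    entries = [
        entry
        for locator, title in items.items()
        for entry in generate_entries(title, locator)
    ]
    return sorted(entries, key=lambda e: (e.lead, e.context.lower()))


def display(entries: list[Entry]) -> str:
    """Lay out the sorted entries, one per line, with their locators."""
    return "\n".join(f"{e.context}  [{e.locator}]" for e in entries)


if __name__ == "__main__":
    print(display(build_index({"D1": "Indexing of maps"})))
```

Even in this toy form, a single indexed item yields several overlapping entries, one for each significant word, which is the property used above to delimit string indexing.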

All this discussion will require a certain amount of technical jargon. New terms will be explained in the main text as they come up. Jargon terms adopted in this book will be underlined the first time that they appear; other jargon terms will appear in quotation marks. The Glossary will also provide definitions of terms used repeatedly.

A summary will follow each of Chapters 1 through 8. Each summary is meant as a surrogate for the corresponding chapter and simply repeats some of the information given in the chapter itself. Readers are advised not to read the summaries as though they were parts of the main text.

Even systems of purely historical interest will be described in the present tense. One purpose of this procedure is to circumvent in part the problem of absolute currency of information. On a more philosophical level, all documented systems continue to exist in their documentation, some, like Systematic Indexing, so well that they could in theory be reactivated at any time.

 

Since 1974, I have been working in the area of generating index displays by computer. Initially, much of this work was on NEPHIS. More recently, I have extended my research to ways of generating various kinds of index display from a database with a network structure. Writing this book has given me an opportunity to step back and view my own work in the context of what others have accomplished.

The writing was supported in part by Operating Grant A3027 of the Natural Sciences and Engineering Research Council of Canada and by a Leave Fellowship from the Social Sciences and Humanities Research Council of Canada.

I have received valuable comments and advice from a number of people, including: my former research assistant Luc Declerck; students in course 569, 1983 Spring, "Information Retrieval and Document Analysis", at the School of Library and Information Science, The University of Western Ontario; students in course 211, 1985 Winter, "Subject Access", at the Graduate School of Library and Information Science, the University of California, Los Angeles; my colleagues Elaine Svenonius and Harold Borko; and my wife, Joan Craven.

Thanks are also due to James D. Anderson and to Eileen Mackesy and the Modern Language Association for additional information on CIFT; to Jack Cain of UTLAS and Derek Austin of the British Library for additional information on PRECIS; to Barbara Booth for information on the NILS system; and to E. Michael Keen for the TOPSI-UNIV software.
