{ 163}

CHAPTER 8
SELECTION AND EVALUATION

This chapter is aimed more specifically at practitioners. Its chief purpose is to present some ideas for those interested in setting up a string indexing system in a specific environment. The first section will deal with the types of information useful in selecting an existing string indexing system; the second will suggest one approach to analyzing information in order to select a system; the third will present two simple approaches to designing one's own string indexing system; and the fourth will give some information on comparative studies of string indexing systems.

Space is not available here for exhaustive guides to selection, design, implementation, and evaluation. Instead, it is left to individual selectors, designers, implementers, and evaluators, employing the information given in this book and in other sources, to decide exactly how they should proceed with their tasks.

8.1 INFORMATION USEFUL IN SELECTION

In selecting or designing a system of any kind, one needs to consider how well it fits into the environment for which it is being selected or designed. Thus, this section will consider those characteristics of the string indexing environment and those features of existing string indexing systems that may influence selection of a string indexing system.

8.1.1 The environment

The environment can be divided into three major parts: the collection, or what is to be indexed; the searchers, or who are to be using the index to find information in the collection; and the resources, or who and what is available { 164} to produce the index. None of these parts is necessarily restricted to the boundaries of the selecting organization. Even the resources used may be entirely or almost entirely external, involving cooperative efforts or the letting of contracts.
8.1.1.1 The collection
The main things to know about the collection are: the kinds of subjects with which it deals, especially how complex and how varied these subjects are; how large it is; how much retrospective indexing is required and how much ongoing; the format and language of the items to be indexed; the amount of systematic overlap among the indexed items; and the access tools already provided.

Subject complexity here means the complexity of the descriptions required to index the subject; that is, how many terms are needed, how many of these are access terms, how many links have to be indicated between terms, and how many different kinds of links must be distinguished. If subjects can generally be specified by single terms, string indexing will probably be a waste of time. Great subject complexity, on the other hand, may allow only certain kinds of string indexing systems; namely, ones which permit indexers to construct complex descriptions easily and consistently and which display these descriptions in a form readily understandable to searchers.  Such string indexing systems may be difficult to find. PRECIS, for example, seems to demand undue effort from indexers in dealing with extremely detailed subjects like those of articles in some scientific journals.

Subject variety suggests the need for a system which can be applied easily to subjects with different structures and from different disciplines. A system like CIFT, with a fixed structure of questions oriented toward a given discipline, will be at a disadvantage in a multidisciplinary collection: a different set of questions may have to be devised for each major discipline represented. On the other hand, such a system may be ideal for a specialized collection of documents in one subject area.

If the typical indexed item deals with a number of overlapping subjects, all important to retrieval, a system such as CASIN, which facilitates the division of an input string into multiple "themes", might be favored.

The kinds of subjects dealt with by a collection can be estimated by having a trained subject analyst describe sample items. Random sampling will be useful for some purposes. On the other hand, it may actually be desirable deliberately to seek out unusual subjects within a collection, in order to be sure that the string indexing system selected will not fail on certain difficult points. In any case, regardless of the sample, it is difficult to avoid biasing the analysis toward a system or systems familiar to the analyst. { 165}

Collection size is important because only large collections with fairly similar subjects really need the sophistication of citation order available from a system like NEPHIS or PRECIS. A smaller collection will be quite adequately served by index entries consisting of a lead term plus a standardized description, as in many library catalogs.

For an entirely retrospective job, it may be possible to hire a trained indexer for a given time, and a complex system like PRECIS can thus be considered. On the other hand, indexing may have to be done in small pieces over a long time, but fairly promptly, by people who normally work at other tasks; a system which is easier to learn and easier to remember is then probably better.

The formats of the collection can be important in their variety or in their accessibility. For a multi-media collection, PRECIS' specific provision for indicating physical form might be considered a plus. Relatively inaccessible formats may call for more detailed description by indexers, because the indexed items are harder for searchers to select by direct examination. For instance, films are relatively difficult for searchers to examine directly in comparison to printed materials. Items retrievable in full via a computerized system call strongly for online index displays.

Both format and language can affect how readily indexers can create good input strings. A scientific article including a descriptive title, an abstract, and keywords and written in a standard terminology provides the indexer with good sources for input string construction. An item in another format, such as a photograph, or using less controlled language, supplies fewer starting points; and more reliance may have to be placed on authority lists and other indexer aids.

Two indexed items overlap when they contain some of the same material. Overlap is especially noticeable when one indexed item is actually part of another. In book indexing, an indexed item may be a chapter, a section within this chapter, a paragraph within this section, or a sentence within this paragraph. Book indexing therefore supplies many examples of extensive overlap between indexed items.

A great deal of systematic overlap, especially where the collection is readily accessible, generally points away from any string indexing system; a system such as chain indexing seems more appropriate. Thus, most books are unlikely ever to have string indexes. Somewhat less systematic overlap may still mean that a surprisingly simple string indexing system is suitable.

Access tools already available are important whether or not they are to be retained. If the string indexing system is to replace existing tools, it should be capable of achieving the same objectives. If the string indexing system is to complement other tools, less may be required of it than if the other access tools did not exist. Nevertheless, integration of the string indexing system with other forms of access will have to be considered. For example, { 166} the selector may wish the string indexing system to be employed as part of an existing database management system.

8.1.1.2 The searchers
Information needed about searchers includes: what they like in index displays; what kinds of searches they will be doing; how much experience they will have; how quickly they need information; and what their time is worth.

Knowing the form of index that searchers actually prefer may be important. A form of index such as KWIC can meet with extremes of negative and positive reactions. The selector should not assume that searchers always want the most efficient type of index: searchers' preferences have, in fact, been shown not necessarily to coincide with efficiency for searching (Hartley and others 1973).

If searchers do not want a great deal of detail in index strings, it may be because they are generally looking for large amounts of material on broad subjects; that is, for high recall rather than high precision in searching. With a few exceptions, such as PERMUTERM, string indexing systems favor high-precision searches. A system like POPSI, however, which provides direct access to index entries under broad as well as specific terms, will also help in obtaining high recall quickly.

Searchers may be the eventual users of the indexed information or they may be professional staff to whom these end-users delegate searching. Whoever does the searching may be a frequent or an infrequent user of the index. A frequent user has time to become used to special symbols and other conventions; for example, to the special symbols in POPSI, or to the often multi-line headings in PRECIS. An infrequent user may find obstacles here. Experience with other indexes produced by means of a given system can be a help, however; this factor may favor a relatively widely used system like PRECIS. Even if use of an unfamiliar style of index is expected to be frequent, the selector also needs to ask whether time or inclination for training will be available. Searchers may need to get at information quickly even the first time that they use the index.

The need to get at current information quickly may mean that indexing has to be completed quickly. Thus a system like KWOC, which provides prompt indexing at the expense of some other desirable features, may be a good choice.

There may be distinct groups of searchers with quite different needs for searching; it may then be desirable for the system to be capable of producing several different types of index display from a single database. An extreme example of varied searcher needs occurs when different searchers want access { 167} through different languages such as English and French. Variety in searcher needs may influence the choice towards online index display systems, which are more readily customized.

Searchers' time may be relatively expensive in comparison to other costs, such as indexers' time; the selector then needs to put more emphasis on output quality, perhaps at the expense of more difficult and time-consuming input or processing. For example, many entries under lead-only terms, as in POPSI, may be chosen in preference to cross-referencing.

Some information about searchers may be known informally to the selector. In many cases, however, a more formal survey may be required in the form of interviews or questionnaires. Sample index displays may need to be produced to determine preferences. Compilation of a representative list of search topics, possibly with lists of items considered relevant, may also be needed.

8.1.1.3 Resources
Resources include funding, personnel, hardware, and sources of information.

Under funding, one should consider not only how much money is available in total, but also how much is available for startup, how much for ongoing expenses, and what conditions are attached to expenditure. For example, a grant may be available for a six-month period only or may be restricted to the purchase of certain types of equipment or to the hiring of certain types of personnel.

Indexers are the main consideration under personnel. The organization may already have indexers with sufficient experience to produce input of the quality needed. For example, while PRECIS indexing requires considerable training, a number of courses and workshops in this system have been offered; thus someone who knows it well enough to begin work immediately may be on hand. Alternatively, a free-lance indexer or an organization involved in indexing may be willing to work on a contract basis.

Existing hardware which is not being used to capacity may influence the choice toward a specific system. For example, an organization with a largely unused TRS80 Model 3 and a small string indexing task would need to take an especially close look at PERMDEX.

Information resources include lists of subject headings and existing records of the collection. Adapting a subject heading list to string indexing may mean that the string indexing system does not need its own controlled vocabulary. The Iowa State University system, say, may then be considered more suitable than PRECIS, in spite of PRECIS' large authority file.

Records of titles and other bibliographic data may already be in computer-{ 168}readable form. This could steer the choice towards a title-based system or a system based on the manipulation of established subject headings like those of the Library of Congress. String indexing input may in some cases already exist for many of the items in the collection for which an index is to be created: PRECIS input, for example, if they are books published in the United Kingdom; CIFT input, if they are recent materials relating to modern languages, literature, or folklore; or CASIN input, for materials relating to food technology.

8.1.2 The system

Someone considering a string indexing system should look at: its general characteristics; the kind of input that the software requires; the kind of database that it produces; the software and hardware required to operate it; and, finally, the output, in the form of index displays, that is to be expected.
8.1.2.1 General characteristics
General characteristics to consider in any system are: documentation; proof of performance; and costs.

Ideally, full documentation should be available for all aspects of the system: not only a complete indexer's manual, but also complete program documentation including program listings and instructions for operation and recovery from errors. Such documentation, if obtained in advance, is also a valuable source of information on many other characteristics of the system. Quality and extent of documentation are likely to vary greatly from one string indexing system to another, however, making the task of comparison difficult.

Systems such as PRECIS, CIFT, and CASIN that produce ongoing published indexes have a definite advantage in proving performance, since it is easy to tell that they are operational. Their ongoing use also provides cost figures, albeit for a different environment; for example, in 1975, the cost per indexed item for PRECIS at the British Library was estimated at about £0.34 (British Library Working Party on Classification and Indexing 1975, p. 17).

8.1.2.2 Input
The main quality to look for in input is ease for the indexer.
Ease for the indexer may be measured in a number of ways: by the amount of input required from the indexer; by the amount of coding the indexer must include in the input; by the time a trained indexer takes to index an average { 169} item; by the time required to train an indexer; by the proportion of indexer input free from error; and by the proportion of people who can be trained as indexers.

Amount of coding, in the sense of the number of coding symbols, is a relatively minor measure of difficulty for a trained indexer. For example, time required for coding PRECIS input has been estimated at only 10% of total input string writing time (Austin 1977).

Indexing time is mostly a function of the amount of analysis required on the indexer's part. A title-based system may not require an indexer at all, or indexer intervention may be required only for difficult titles. By contrast, a system which requires the indexer to examine the indexed item as a whole and to ignore the wording of the title requires much more analysis effort. Thus, one indexer using the ASI system needed only 83 minutes to edit 131 titles into 145 input strings (Armitage and others 1970); on the other hand, a trained PRECIS indexer should complete an average of 27.5 items in a working day of about 428 minutes (Austin, personal communication, 1985).
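The two figures just quoted are easier to compare when reduced to minutes per item. The following is a minimal sketch of that arithmetic only; the variable names are introduced here for illustration, and the two tasks (editing titles versus analyzing whole items) are not equivalent, so the ratio is no more than a rough indication.

```python
# Figures quoted in the text above.
asi_minutes, asi_titles = 83, 131          # ASI: Armitage and others 1970
precis_minutes, precis_items = 428, 27.5   # PRECIS: one working day

# Average indexer time per title or item, in minutes.
asi_per_item = asi_minutes / asi_titles
precis_per_item = precis_minutes / precis_items
ratio = precis_per_item / asi_per_item

print(f"ASI:    {asi_per_item:.2f} minutes per title")
print(f"PRECIS: {precis_per_item:.2f} minutes per item")
print(f"Ratio:  {ratio:.1f}")
```

On these figures, title editing in ASI takes well under a minute per title, while PRECIS indexing takes roughly a quarter of an hour per item.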

The availability of indexer aids, such as worksheets and various kinds of feedback from the software, may contribute to ease of input. A system with a built-in rigidity imposed by manuals and forms may be favored in order to standardize the input without the need to devise additional rules. On the other hand, built-in guidance may not produce the kind of results wanted for a particular application. In such cases, a relatively loose system like NEPHIS, for which the selector can devise additional rules, may be preferred. As an alternative, the rules of an existing system may be modified. For example, in one application (Robinson 1977), the definitions of PRECIS' main role codes were stretched in order to accommodate indexed items on special subjects.

8.1.2.3 The database
Questions to ask about the database created by the string indexing system should cover: its detail; vocabulary control; and accessibility.

Many string indexing systems will in theory support virtually as detailed descriptions as desired, but there may be practical limitations. A system like NETPAD, for example, is presently restricted by the limitations on string length of the programming language in which most of the software is written. A number of systems, including PRECIS, may have another difficulty with exceedingly complex descriptions: the descriptions can be stored, but they are very time-consuming for indexers to construct.

Some string indexing systems allow more detail about linktypes than do others. If linktypes are not important, on the other hand, a system like the Iowa State University system, which virtually ignores the distinctions between different types of link, may be nearer to what is needed. { 170}

Vocabulary control includes term standardization and provisions for cross-references. Terms may be standardized by having indexers refer to an existing list of terms, like the PRECIS authority list. Automatic term standardization, as in CIFT, may also be desirable.

Accessibility includes both accessibility for searching and accessibility for additions and changes. One way in which the data in the database will be more accessible for searching is if means other than a string index can be used when desired; in other words, if the input can do multiple duty for different kinds of retrieval. One possibility is for the database to be compatible with standard formats, so that it can be made use of by other software. For example, CIFT is designed to produce a database which can be searched by means of a conventional Boolean retrieval system such as Dialog. CIFT distinguishes different aspects of the description by putting them in different fields. Thus, a searcher using a system like Dialog could easily restrict a search to terms appearing in a particular part of the description, such as "themes/motifs/figures".

A different approach to accessibility for searching is for the string indexing system software itself to allow different kinds of searching. The NETPAD software, for example, supports searching by Boolean combinations and by substructure matching.

A third possibility is for the string indexing software to be easy to embed within an existing retrieval system. This approach is especially likely when the computer routines are very easy to program, as they are for KWOC, PASI, or the NILS system.

When additions and changes are required, specialized built-in editing software, as in NETPAD, may be an asset. On the other hand, a system like ASI, where an addition or change relating to one item can affect the index strings of other items, is relatively inhospitable.

8.1.2.4 The software
Software should be available, adaptable, and reliable.

A publicly-supported proselytizing system like PRECIS has an interest in making its software widely available; a system designed for producing a specific commercial product may not. Interestingly, however, the British Library does not make its PRECIS software available to other agencies (Austin and Dykstra 1984, p. 305); and the programs for PERMUTERM, while remaining proprietary information, have been supplied by ISI to organizations willing to pay the charge (Garfield 1972).

If the software is written in a commonly-used high-level computer language in a way that avoids the peculiarities of specific machines, it is more likely to be adaptable. LIPHIS software, for example, is available in COBOL, one of the most standardized high-level languages. { 171}

The less software there is, the more easily it can be adapted. A relatively simple system like NEPHIS or KWOC is easy to adopt for this reason.

If the software has been used extensively, then there is some evidence of reliability. In any case, it should be tested by the selector. Here, it should be borne in mind that simple systems are easier to test for reliability than more complex ones.

8.1.2.5 The hardware
Hardware requirements relate to input, to processing, to storage, and to output.

Basic input hardware requirements, namely, for a standard computer terminal, are not likely to vary much from one string indexing system to another. Certain indexer aids, such as graphic displays, will, however, assume a video display terminal rather than a printing terminal.

Software for more complex systems generally requires more powerful hardware for processing; especially, more computer memory. Even very complex software like that for PRECIS can be adapted to run on a microcomputer; but processing time required will increase, perhaps beyond all reasonable bounds. Sorting is generally the process which requires the most hardware power if it is to be carried out in a reasonable time.

Some systems, like PRECIS, require space to store a machine-readable thesaurus in random-access form, so that cross-references can be generated. POPSI, on the other hand, has no need for randomly-accessible storage of a thesaurus, since its few cross-references, all of which are "see" references, are generated from the input strings.

Minimal hardware requirements for output are likely to be similar for most systems.

8.1.2.6 Output
In general, output should be index displays that can be searched efficiently and effectively. The selector should consider output flexibility, if a single form of index display will not suit all searches, and compatibility with other searching methods, if not all searching is to be done via the index display.

Consistency is one factor contributing to search effectiveness and efficiency. Generally, systems which closely control the input tend to have more consistent output: rules and other guidance limit indexers' choices. In more complex systems, searchers may imagine inconsistencies when subtle distinctions are made with which they are unfamiliar; apparent consistency may thus be easier to achieve with a relatively simple system.

Layout and typography also contribute to search performance. A system like NEPHIS has no built-in provisions for layout and typography: these { 172} must be added by means of additional software. PRECIS, on the other hand, has very specific provisions. Where the provisions are built in, one needs to consider whether these provisions are suited to the anticipated need. For example, do the long, often multi-line headings of PRECIS index entries serve the searchers best? Where the provisions are not built in, consideration needs to be given to the costs of software to produce a finished product in the best form.

Search performance may depend on the experience of the searcher. For example, an inexperienced searcher will likely be hampered by the abbreviation of role indicators in MULTITERM index strings; on the other hand, frequent use may make these abbreviations more readily grasped than the fuller forms. Usually, the amount of experience required for searching a string index is relatively small. Thus, the PRECIS display format may present some difficulties at first, but should not take too long to understand.

The selector may wish to produce several specialized index displays from a single database. At present, this kind of flexibility is mostly either experimental, as in NETPAD, or very limited, as in Double-KWIC. It is more likely to be achieved, however, by adapting a system like CIFT than by adapting some other systems: the parts of CIFT input strings are relatively independent and relatively well defined and could be omitted and rearranged in many combinations.

An extreme case of varied output occurs when index displays using more than one ordinary language such as English and French are required. Writers on PRECIS have addressed this need (Sørensen 1977; Sørensen and Austin 1976b; Verdier 1979; Verdier and Austin 1977), and the PRECIS/Translingual Project has provided some solutions to problems in translating among English, French, and German (Matter 1979). MULTITERM shows features of a kind of string indexing system which allows the simplest translation from one language to another: it connects terms in index strings with punctuation or other arbitrary symbols and does not treat adjectives as separate terms. Arbitrary symbols either do not need to be translated or are easily translated, and eschewing adjectival terms avoids problems with concord and position.

One way in which a string index display may be compatible with other searching methods is by using the same terms. With this kind of compatibility, a searcher might employ a string index display for initial browsing or scanning and then use the terms found in the relevant index entries in constructing a Boolean query for systematic current-awareness searching; a reverse situation, in which Boolean searching is followed by browsing or scanning of an index display, is equally conceivable.

If online index displays are desired, the kind of response time to be expected { 173} needs to be examined; that is, how long it generally takes the software to respond with a screenful of index elements. If the index strings have to be generated as part of this response, systems with complex rules for determining the order of terms in each index entry may be expected to perform less well. Response time will also depend heavily on the type of hardware used and on the level of language in which the software is written.

8.2 COMPARING CHOICES

When more than one choice remains possible and further information-gathering seems unproductive, the selector may wish to analyze the information already gathered and weigh the choices. This section presents a common and effective technique for weighing choices. The main elements of this technique are: 1. construction of a table of choices and features; 2. the rating of each choice on each feature; 3. the weighting of the features; 4. the multiplication of the ratings by the corresponding feature weights; and 5. the summing of the resulting products to calculate an overall value for each choice.

Ideally, all features that are relevant to the choice and on which some information is available should be considered. For a detailed comparison of string indexing systems, the number of such features could be extremely large. For the sake of a simple illustration, however, suppose that the list of features consists only of the following eight fairly general items:

  1. published documentation
  2. input string coding
  3. input string length
  4. worksheets
  5. input string detail
  6. index string detail
  7. index string eliminability
  8. index string clarity
Suppose further that seven string indexing systems are in contention: PERMUTERM, ABC-Spindex, PRECIS, POPSI, CIFT, CASIN, and NEPHIS.

In general, the ratings can be assigned in a variety of ways, ranging from extreme objectivity to arbitrary guesses. A precise money amount may sometimes be possible as the rating for a feature such as "cost of software". For other features, the selector may enter a number representing a subjective judgment or summarizing a large number of factors. Table 8.1 shows { 174} a conceivable set of ratings for the seven string indexing systems on the eight sample features. None of these ratings is purely objective, and all might vary depending on the environment and the rater. For the sake of simplicity, all ratings are whole numbers in the range 0 to 9. A 0 indicates absence of a feature; a 9, the highest degree of presence of a feature.

Table 8.1: Unweighted Ratings of Seven String Indexing Systems on Eight Sample Features
Feature                         PERMUTERM  ABC-SPINDEX  PRECIS  POPSI  CIFT  CASIN  NEPHIS
1. published documentation              2            2       8      6     2      6       5
2. input string coding                  0            1       9      2     4      4       3
3. input string length                  4            3       7      8     6      8       5
4. worksheets                           0            0       7      0     9      7       0
5. input string detail                  7            5       8      7     7      9       8
6. index string detail                  2            5       7      6     9      5       8
7. index string eliminability           2            3       8      4     4      7       7
8. index string clarity                 1            1       7      5     6      7       8

Feature weights tend to be subjective. They should be determined on the basis of what the selector thinks is most valuable for the environment. If two features seem equally valuable and if their ratings have similar ranges, the features should be given equal weights. If a feature is not desirable, it should be given a negative weight. The eight sample features might, say, be given the following weights:
1. published documentation       1
2. input string coding          -1
3. input string length          -1
4. worksheets                    1
5. input string detail           2
6. index string detail           1
7. index string eliminability    1
8. index string clarity          3
Applying these weights to the ratings given above would yield Table 8.2.

Table 8.2: Weighted Ratings of Seven String Indexing Systems on Eight Sample Features
Feature                         PERMUTERM  ABC-SPINDEX  PRECIS  POPSI  CIFT  CASIN  NEPHIS
1. published documentation              2            2       8      6     2      6       5
2. input string coding                  0           -1      -9     -2    -4     -4      -3
3. input string length                 -4           -3      -7     -8    -6     -8      -5
4. worksheets                           0            0       7      0     9      7       0
5. input string detail                 14           10      16     14    14     18      16
6. index string detail                  2            5       7      6     9      5       8
7. index string eliminability           2            3       8      4     4      7       7
8. index string clarity                 3            3      21     15    18     21      24

{ 175}

Finally, summing the weighted ratings for each choice gives the following overall values:
PERMUTERM 19
ABC-SPINDEX 19
PRECIS 51
POPSI 35
CIFT 46
CASIN 52
NEPHIS 52
Here, CASIN and NEPHIS have the highest overall value and hence rank first. On this basis, either CASIN or NEPHIS should be chosen as the best system for the hypothetical application. These two systems are first, however, by the narrowest of margins; and, as already pointed out, the ratings and weights are subjective and the number of features used for illustration very restricted. Faced with close results like this in real life, the selector might reasonably choose any of the runners-up (PRECIS, CIFT), eliminating only those choices clearly rated much lower than the winner (PERMUTERM, ABC-Spindex, POPSI).
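The five steps of the technique are easily carried out by a short program. The following Python sketch is an illustration only: the names `systems`, `features`, and `overall_values` are assumptions introduced here, and the ratings and weights are simply those of Table 8.1 and the sample weight list above.

```python
# Choices, in the column order of Table 8.1.
systems = ["PERMUTERM", "ABC-SPINDEX", "PRECIS", "POPSI", "CIFT", "CASIN", "NEPHIS"]

# Feature -> (weight, unweighted ratings in the order of `systems`).
features = {
    "published documentation":    (1,  [2, 2, 8, 6, 2, 6, 5]),
    "input string coding":        (-1, [0, 1, 9, 2, 4, 4, 3]),
    "input string length":        (-1, [4, 3, 7, 8, 6, 8, 5]),
    "worksheets":                 (1,  [0, 0, 7, 0, 9, 7, 0]),
    "input string detail":        (2,  [7, 5, 8, 7, 7, 9, 8]),
    "index string detail":        (1,  [2, 5, 7, 6, 9, 5, 8]),
    "index string eliminability": (1,  [2, 3, 8, 4, 4, 7, 7]),
    "index string clarity":       (3,  [1, 1, 7, 5, 6, 7, 8]),
}


def overall_values(systems, features):
    """Multiply each rating by its feature weight and sum per choice."""
    totals = dict.fromkeys(systems, 0)
    for weight, ratings in features.values():
        for system, rating in zip(systems, ratings):
            totals[system] += weight * rating
    return totals


totals = overall_values(systems, features)

# List the choices from highest to lowest overall value.
for system, value in sorted(totals.items(), key=lambda pair: -pair[1]):
    print(f"{system:12} {value}")
```

With the sample ratings and weights, the sketch yields the same overall values as those listed above. Rerunning it with a single weight changed is a quick way of seeing how sensitive the ranking is to the selector's subjective judgments.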

8.3 DESIGNING A SIMPLE STRING INDEXING SYSTEM

After examining existing string indexing systems, it may be decided that none really fits the requirements of a given application, but that some sort of string indexing still seems desirable. This section will present two approaches { 176} to quickly designing and implementing a string indexing system for special applications. Many different software tools may, of course, be of use in developing string indexing systems. The two approaches presented, however, have been selected for their feasibility in a wide variety of environments.

8.3.1 The KWOC approach

Sophisticated permutation may not be needed; perhaps all that is needed is a simple KWOC-like permutation procedure, making each term in turn into the heading while the input string as a whole forms the subheading. KWOC software is common for many kinds of computer. Even if it is not available for a given computer, writing it is an elementary programming exercise. The designer's task is simply to design the form of the input.

Terms should be single words, or look like single words to the KWOC software. This requirement might involve concatenating words artificially. The concatenation might be direct, as in "shortstory" or "Englishlanguage", or with a binding character such as a hyphen, as in "Morales-JA" or "psychoanalytic-approach". The choice would depend on what the software recognizes as a boundary between words.

Generally, the beginnings of input strings should serve the purposes of eliminability and collocation. A good approach might be to adopt a simple classification scheme or a short list of standard terms representing major themes or subject areas of the items to be indexed; the beginnings of input strings could then be restricted to the few controlled terms of this scheme or short list. The ends of the input strings may be relatively less controlled in form and vocabulary.

Connectives should be recognizable as not being access terms. If the KWOC software includes a stoplist, connectives can be chosen from this list. An alternative is to avoid connective words in the input string in favour of punctuation marks or other characters which will be recognized as not being parts of words.
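
As an illustration only, the KWOC approach described in this section might be sketched in Python as follows. The stoplist, the function names, and the sample input string are assumptions made for the example. A hyphen is assumed to be the binding character, and any other punctuation mark is treated as a boundary between words.

```python
import re

# Hypothetical stoplist of connectives that must not become headings.
STOPLIST = {"of", "in", "for", "by", "and", "the"}


def kwoc_entries(input_string, locator, stoplist=STOPLIST):
    """Make each access term in turn the heading; the whole string is the subheading."""
    # A word is a run of letters, digits, apostrophes, and binding hyphens.
    words = re.findall(r"[A-Za-z0-9][A-Za-z0-9'-]*", input_string)
    entries = []
    seen = set()
    for word in words:
        key = word.lower()
        if key in stoplist or key in seen:
            continue  # skip connectives and repeated terms
        seen.add(key)
        entries.append((word, input_string, locator))
    return entries


def kwoc_index(records):
    """Build and sort the entries for a list of (input string, locator) pairs."""
    entries = []
    for input_string, locator in records:
        entries.extend(kwoc_entries(input_string, locator))
    return sorted(entries, key=lambda e: (e[0].lower(), e[1].lower(), e[2]))


index = kwoc_index([
    ("Literature: psychoanalytic-approach in shortstory / Morales-JA", "0001"),
])
for heading, subheading, locator in index:
    print(heading)
    print("    " + subheading + "  " + locator)
```

Here the controlled term "Literature" begins the input string, the connective "in" is stoplisted, and the colon and slash act as connectives that the software never mistakes for words.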

8.3.2 A simple DBMS approach

Database management systems (DBMS) of a variety of types are available for many different kinds of computer, including many microcomputers. A DBMS should not be expected to incorporate a string indexing system at the present time, but can be very useful in setting one up. Using a DBMS allows the system designer to call on a repertoire of common computer procedures such as those for sorting character strings and dividing printouts into pages.

DBaseII represents a fairly common type of DBMS. The user can define { 177} a number of files in which data can be stored. Each file consists of a number of records, and each record in a file consists of data elements. The user defines a set of fields for each file to specify the number and type of data elements in a record in the file. Commands can be given to the DBMS software to perform common types of data manipulation; for example, to modify the contents of a given record, or to sort the records in a file in accordance with a key value derived from the data in each record. Sequences of commands, including instructions for conditional and repeated execution of commands, can be created for later processing like programs in a programming language.

A string indexing system can be set up by means of a DBMS like DBaseII in a number of ways. The next few paragraphs outline a fairly simple approach, which also has some limitations.

A file is created to contain the input strings.  This file has a field for each term in the input string, plus a field for the locator. For example, the designer might want to describe a collection of articles on teaching by terms for the teaching method discussed, the subjects taught, the types of student, and the types of teacher. Thus, the fields defined might be:

METHOD
SUBJECT
STUDENTS
TEACHERS
LOCATOR

Given that the article whose locator is "0001" is about "the supervision of graduate students' thesis work in the humanities", the data elements for the corresponding record might be:

METHOD: thesis work
SUBJECT: humanities
STUDENTS: graduate students
TEACHERS: graduate advisers
LOCATOR: 0001

A second file is defined to contain the index entries. Here, the fields might be:

HEADING
SUBHEADING
LOCATOR

A command sequence is composed to instruct the DBMS software to add a set of index entries to the index entry file for each record in the input string file. In DBaseII, the "APPEND BLANK" command adds a new record to the end of a file. Initially, as the command implies, this new record is blank. A "REPLACE" command can be used to put a value into one of the positions in the record. { 178}

Suppose, for example, that one type of index entry to be created has a heading consisting of the term for the subject taught and a subheading consisting of the terms for method, type of student, and type of teacher, with the terms in the subheading being separated by colons.  To put the term for the subject taught into the heading, the command used is:

REPLACE HEADING WITH P.SUBJECT

(DBaseII allows two files to be dealt with at once. Here the assumption is that the input string file is the "primary" file and that the index entry file is the "secondary" file and has been "SELECTed" for modification. The "P." before "SUBJECT" indicates that the subject information is taken from the primary file rather than from the SELECTed file.) The appropriate subheading is obtained with the command

REPLACE SUBHEADING WITH P.METHOD - ":" - P.STUDENTS - ":" - P.TEACHERS

- the minus-sign ("-") being used in DBaseII to indicate that two strings of characters are to be strung together. Finally, the locator is copied with the command

REPLACE LOCATOR WITH P.LOCATOR

The result is a record in the index entry file such as:

HEADING: humanities
SUBHEADING: thesis work:graduate students:graduate advisers
LOCATOR: 0001

A DBMS like DBaseII permits specification that certain commands are to be executed only under certain circumstances; thus, the designer can specify not only what types of index entries will appear in the index, but also under what conditions they will appear. For instance, index entries under subject taught might be restricted to those cases where a subject term is actually specified:

IF P.SUBJECT <> ""
    APPEND BLANK
    REPLACE HEADING WITH P.SUBJECT
    REPLACE SUBHEADING WITH P.METHOD - ":" - P.STUDENTS - ":" - P.TEACHERS
    REPLACE LOCATOR WITH P.LOCATOR
ENDIF
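
For readers without access to DBaseII, the same conditional generation of index entries might be sketched in Python. This is a rough equivalent rather than a translation: the function name is invented, records are represented as dictionaries, and stripping surrounding blanks is assumed to stand in for the way the DBaseII "-" operator strings fields together.

```python
def subject_entries(input_records):
    """Create one index entry per input record that names a subject taught."""
    entries = []
    for rec in input_records:
        subject = rec["SUBJECT"].strip()
        if subject != "":  # corresponds to IF P.SUBJECT <> ""
            subheading = ":".join(
                rec[field].strip() for field in ("METHOD", "STUDENTS", "TEACHERS")
            )
            entries.append({
                "HEADING": subject,
                "SUBHEADING": subheading,
                "LOCATOR": rec["LOCATOR"],
            })
    return entries


input_file = [
    {"METHOD": "thesis work", "SUBJECT": "humanities",
     "STUDENTS": "graduate students", "TEACHERS": "graduate advisers",
     "LOCATOR": "0001"},
    # Hypothetical second record with no subject term: no entry is produced.
    {"METHOD": "seminars", "SUBJECT": "",
     "STUDENTS": "undergraduates", "TEACHERS": "lecturers",
     "LOCATOR": "0002"},
]
index_entry_file = subject_entries(input_file)
```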

When the index entry file is complete, a sorted version can be created; { 179} for example, if the sorted file is to be called "STRIND", the sorting command might be

SORT ON HEADING + SUBHEADING TO STRIND
The "LIST" command in DBaseII can be used to print the sorted version, or a sequence of commands may be created to display the data in a better layout.
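
The sorting step and one possible "better layout" might be sketched in Python as follows. The function names, the sample entries other than locator 0001, and the layout itself (heading printed once, subheadings indented beneath it) are assumptions made for illustration. Sorting on the pair (heading, subheading) is assumed to match sorting on the concatenated fixed-length fields.

```python
def sort_entries(entries):
    """Order entries by heading, then subheading, ignoring letter case."""
    return sorted(
        entries,
        key=lambda e: (e["HEADING"].lower(), e["SUBHEADING"].lower()),
    )


def format_index(entries):
    """Return display lines: each heading once, its subheadings indented below."""
    lines = []
    previous_heading = None
    for entry in sort_entries(entries):
        if entry["HEADING"] != previous_heading:
            lines.append(entry["HEADING"])
            previous_heading = entry["HEADING"]
        lines.append("    " + entry["SUBHEADING"] + "  " + entry["LOCATOR"])
    return lines


entries = [
    {"HEADING": "humanities",
     "SUBHEADING": "thesis work:graduate students:graduate advisers",
     "LOCATOR": "0001"},
    {"HEADING": "chemistry",
     "SUBHEADING": "laboratory work:undergraduates:demonstrators",
     "LOCATOR": "0003"},
    {"HEADING": "humanities",
     "SUBHEADING": "seminars:undergraduates:lecturers",
     "LOCATOR": "0002"},
]
print("\n".join(format_index(entries)))
```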

This simple DBMS approach has two main advantages over the KWOC approach. First, the designer has more control over the content and citation order of individual types of index string. Indeed, a number of different types of index display can readily be produced from the same database. Second, manipulation of the data for other purposes is generally easier; for example, a short sequence of DBMS-language commands could produce alphabetical listings of the terms used in the different fields for purposes of vocabulary control.
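
The second advantage, listing the terms used in each field for vocabulary control, might look like this in Python. The function name and the sample records are invented for the example; a real DBMS would do the same job with its own sorting and listing commands.

```python
def vocabulary(input_records, fields):
    """For each field, list the distinct non-blank terms in alphabetical order."""
    listing = {}
    for field in fields:
        terms = {rec[field].strip() for rec in input_records if rec[field].strip()}
        listing[field] = sorted(terms, key=str.lower)
    return listing


records = [
    {"METHOD": "thesis work", "SUBJECT": "humanities", "LOCATOR": "0001"},
    {"METHOD": "seminars", "SUBJECT": "humanities", "LOCATOR": "0002"},
    {"METHOD": "seminars", "SUBJECT": "", "LOCATOR": "0003"},
]
for field, terms in vocabulary(records, ["METHOD", "SUBJECT"]).items():
    print(field + ": " + ", ".join(terms))
```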

The main limitation of this DBMS approach is that a field in the input string file must normally be defined for every possible kind of term in a description. For example, if there is a possibility of a second teaching method's being mentioned in a description, a field for that second method must be defined, even if most descriptions do not name a second method:

METHOD1
METHOD2
SUBJECT
STUDENTS
TEACHERS
LOCATOR
Each new field takes up some additional space in the file, even if its position in a record is blank (in DBaseII each new field takes up space corresponding to its maximum length). At least as serious, any increase in the number of fields in the input string file leads to a corresponding increase in the length of the command sequence required to produce the index entries.
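
The storage cost of a rarely used field can be estimated with simple arithmetic, sketched here in Python. The field widths, the number of records, and the number of records that actually name a second method are all hypothetical, and any per-record or per-file overhead added by the DBMS is ignored.

```python
# Hypothetical maximum lengths, in characters, for each field.
FIELD_WIDTHS = {
    "METHOD1": 30, "METHOD2": 30, "SUBJECT": 30,
    "STUDENTS": 30, "TEACHERS": 30, "LOCATOR": 4,
}


def record_size(widths):
    """Every record occupies the sum of the maximum field lengths."""
    return sum(widths.values())


def file_size(widths, n_records):
    """Space for the data in a file of fixed-length records."""
    return record_size(widths) * n_records


def unused_space(widths, field, n_records, n_filled):
    """Space taken by a field in the records where it is left blank."""
    return widths[field] * (n_records - n_filled)


with_method2 = file_size(FIELD_WIDTHS, 1000)
without_method2 = file_size(
    {f: w for f, w in FIELD_WIDTHS.items() if f != "METHOD2"}, 1000
)
wasted = unused_space(FIELD_WIDTHS, "METHOD2", 1000, 50)
```

Under these assumptions, adding METHOD2 enlarges the file from 124,000 to 154,000 characters, and 28,500 of the added characters hold nothing but blanks.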

More sophisticated use of a DBMS for string indexing system design is usually possible, but will involve additional design effort.

8.4 EVALUATIVE STUDIES

Early evaluations of string indexing systems were generally concerned with comparing simple title-based string indexing with assigned subject headings (Yerkey 1973). More recent evaluative work has mostly concentrated on the { 180} PRECIS system. Relatively little comparison of similar actual string indexing systems has been carried out. Nevertheless, PRECIS, NEPHIS, and the Relational Indexing index string generator have all been illustrated for the same set of 91 articles from the Journal of the American Society for Information Science (Svenonius 1978). Analysis and index strings for these three systems plus POPSI have also been published for twelve selected subjects (Farradane 1977). In evaluating the more complex systems, emphasis has tended to be placed on quality of output rather than on ease of input; Bakewell (Bakewell 1979), however, did conduct a survey of opinions on PRECIS among indexers in libraries in the British Isles.

A number of serious problems confront researchers attempting to evaluate string indexing systems, or indeed any indexing system oriented towards index displays. One of these problems is what measure or measures of the quality of output to use. In proposing to compare several printed index services, Aitchison and others (Aitchison and others 1970, p. 2) already consider five measures: time to find each relevant item; percentage of the relevant items not found; percentage of the relevant items found; closeness of match between search terms and headings in the index; and ease of use. In reporting their findings (Aitchison and Hall 1973), they use about sixteen different measures.

Conaway and Schnurr (Conaway and Schnurr 1979) distinguish three major qualities of index displays: "utility", or the coverage relative to the searchers' needs; "usefulness", or completeness and accuracy of information, including depth, exhaustivity, appropriateness of access points, syndetic structure, and vocabulary control; and "usability", or a combination of effectiveness and efficiency. Conaway (Conaway 1975) suggests a general measure of "usability" proportional to the mean success rate divided by the square root of the mean search time. But there is little theoretical support for such a measure; and, as formulated by Conaway, the mean success rate can be determined only if one and only one indexed item is assumed to be relevant in each search.
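
Conaway's proposed measure, as described here, can be written out as a short Python function. The constant of proportionality k, the function name, and the sample figures are assumptions; each search is scored 1 if the single relevant item was found and 0 otherwise, reflecting the restriction just noted.

```python
import math


def usability(successes, search_times, k=1.0):
    """k x (mean success rate) / sqrt(mean search time).

    successes: 1 or 0 per search (one and only one relevant item assumed).
    search_times: time taken by each search, in any consistent unit.
    """
    mean_success = sum(successes) / len(successes)
    mean_time = sum(search_times) / len(search_times)
    return k * mean_success / math.sqrt(mean_time)


# Hypothetical figures: three of four searches succeed, each taking 4 minutes.
score = usability([1, 1, 0, 1], [4.0, 4.0, 4.0, 4.0])
```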

The rest of this section will outline briefly four reported attempts at evaluating string indexing systems and will point out major conclusions and limitations of these attempts.

8.4.1 Campey's INDACS evaluation

INDACS is a software package which includes several options for producing different kinds of string index: the software may recognize access terms by the presence of coding in the input strings or by reference to a stoplist or to a golist; and three forms of display, KWIC, KWOC, and Double-KWIC, are permitted. Campey (Campey 1974) compared the costs of these options and of different methods of constructing input strings; the collection used was 800 abstracts in library and information science from the Aberystwyth Index { 181} Languages Test. The main elements of cost considered were those of indexer time and of computer service bureau charges.

The average times to index an item according to various methods, beginning with the longest, were:

Average time           Method
about 4 minutes        1. titles enriched, coded, and edited according to consistent rules
                       2. titles enriched and edited according to consistent rules
about 2 1/2 minutes    3. titles enriched, coded, and edited without consistent rules
                       4. thesaurus terms with weights and joined by connectives
about 2 1/3 minutes    5. "PRECIS-like" strings without coding
about 2 minutes        6. lists of thesaurus terms
                       7. titles enriched and coded without consistent rules
about 1 3/4 minutes    8. titles enriched and edited without consistent rules
                       9. titles enriched without consistent rules
about 3/4 minute       10. titles coded and edited without consistent rules
                       11. titles edited without consistent rules
about 2/3 minute       12. titles coded without consistent rules
about 1/2 minute       13. titles without any modification

The only coding involved was the marking of access terms.

Campey notes that operations can be performed more efficiently as combined processes; thus, someone selecting an index system cannot simply add up the costs for individual processes such as enriching and coding to arrive at an overall figure for cost of indexing.

For determining access terms, except where a thesaurus was being maintained, costs for the golist method were significantly greater, while those for stoplisting and coding were about equal. With regard to service bureau charges, KWIC and KWOC were inexpensive and Double-KWIC substantially more expensive.

Obvious limitations of this study are: its restriction to a single collection of abstracts and to fairly simple string indexing systems; and its lack of any check on the quality of output.

8.4.2 EPSILON

The EPSILON (Evaluation of Printed Subject Indexes by Laboratory investigatiON) project (Keen 1977b, 1978) is probably the most important evaluation of string indexing output. With formating kept constant, and input strings more or less so, the EPSILON investigators tested chain procedure { 182} and six main types of index string: 1. lead term only; 2. lead term plus a list of all the terms in the input string with periods between the terms; 3. lead term plus a title-like phrase containing all the terms (like KWOC); 4. ASI-like index string; 5. PRECIS-like index string; 6. lead term with sometimes a second term as a subheading, as in H.W. Wilson Company indexes. Examples of the first five types are:
  1. Theft
  2. Theft
        Prevention. Theft. Books. Users. Sacramento State College Library. United States
  3. Theft
        Prevention of theft of books by users of Sacramento State College Library, United States
  4. Theft
        of books by users of Sacramento State College Library, United States, prevention of
  5. Theft. Books. Stock. Sacramento State College Library
        By users. Prevention
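
The two simplest types lend themselves to mechanical generation from a list of terms, as the following Python sketch suggests. The function names are invented, and the sketch makes no attempt at types 3 to 6, which require phrase construction or role analysis.

```python
def lead_term_only(terms):
    """Type 1: each term stands alone as a heading with no subheading."""
    return [(term, "") for term in terms]


def lead_term_plus_list(terms):
    """Type 2: each term as heading; all terms, separated by periods, as subheading."""
    subheading = ". ".join(terms)
    return [(term, subheading) for term in terms]


terms = ["Prevention", "Theft", "Books", "Users",
         "Sacramento State College Library", "United States"]
for heading, subheading in lead_term_plus_list(terms):
    print(heading)
    print("      " + subheading)
```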

A few hundred items were chosen for indexing from the Information Science Index Languages Test. The resulting indexes were, however, believed to represent the generality of a collection of about 3,500 items; i.e., the average similarity between pairs of items from the test collection would be about the same as for pairs of items from a real collection of 3,500.

A variety of evaluation methods were used, including a questionnaire to determine searchers' preferences. Overall merit was, however, judged on the basis of a composite presentation involving cumulative numbers of relevant and irrelevant entries found with time. The intuitive ranking arrived at was: 1. ASI-like; 2. KWOC-like; 3. PRECIS-like; 4. lead term plus list of terms; 5. access-point term only; 6. chain procedure; 7. Wilson style.

As a result of the study, the EPSILON researchers suggest three criteria for designing printed index entries: 1. a complete description should be provided under each lead term; 2. connective words, such as prepositions, can be included "when convenient", but will not make much difference; 3. there appears to be no good reason to prefer a PRECIS-like or ASI-like permutation system over a simple KWOC-like system with prepositions, or vice versa. Searcher preferences, however, seem to contradict the last two of these suggestions. In contrast to criterion 2, searchers in the EPSILON experiment preferred KWOC-like index strings with prepositions to lead term plus list of terms 46% to 15%. In contrast to criterion 3, searchers preferred KWOC-like index strings to ASI-like index strings 46% to 15%. { 183}

The main limitations to note about EPSILON are its restriction to a single subject area and the small size of its database. The latter limitation especially, as Keen is aware, may have biased the results against the more sophisticated systems.

8.4.3 WUSCS

The Wollongong University Subject Catalogue Study (WUSCS) (Hunt 1978; Hunt and others 1976, 1977a, 1977b) also evaluated string indexing primarily from the point of view of output. Here 2073 documents were chosen from eight academic disciplines. Types of index string used for experimental searches were: PRECIS on cards, with and without cross-references; PRECIS in book form; Library of Congress Subject Headings, with and without cross-references; a local adaptation of Library of Congress subject headings; simple titles; and KWOC on enriched titles.

On the input side, the WUSCS investigators feel that: 1. full PRECIS indexing was just as quick to do as using some simplified form; and 2. PRECIS indexing took no longer than indexing by Library of Congress subject headings.

From the specific search results, WUSCS concludes that: 1. PRECIS index strings and Library of Congress subject headings performed equally well for searching; 2. PRECIS performed equally well whether the catalog was on cards or in book form; 3. a single lead term plus an abstract performed at least as well as a standard PRECIS index string.

Overall, the preferences expressed by searchers gave no clearly defined first choice. The greatest number of searchers, however, regarded KWOC as the quickest and the most liked and the catalogs with full cross-references as the most effective; a slight plurality recommended the PRECIS catalog with full cross-references for an operational system. Moreover, the researchers gained the impression that individual searchers had definite preferences for particular catalogs; and searchers were found to favour strongly index strings with more information about the items indexed and a physical format that allowed scanning of blocks of index elements. The WUSCS researchers also conclude that "Effective retrieval is a necessary but not sufficient criterion for assessing user acceptability of a subject catalogue". Specifically, they note that: 1. searchers want the vocabulary of the index to be controlled regardless of effectiveness; and 2. searchers are more confident in the results of their searches if the index strings are more detailed and are properly presented.

Two of the three major limitations noted by the WUSCS researchers seem especially important: 1. the relatively small size of the database; 2. the lack of time for searchers to learn how to use different systems. As in the case of EPSILON, the small size of the database, combined in WUSCS with { 184} the variety of disciplines, tends to bias the results against the more sophisticated systems. A number of other, less important problems with WUSCS are mentioned in the report.

8.4.4 Jamieson's evaluation

Jamieson's study (Jamieson 1981) differs in some notable ways from EPSILON and WUSCS: only three queries, specially constructed to emphasize the types of links between terms, were used; instead of displays of an entire index, two-page simulated displays of a much larger index were searched; while basically derived from a real index display, the test index displays were enriched with invented index entries, to make searching more difficult, and with additional relevant entries, to bring the number of relevant entries up to five per query; sequences of entries with identical lead terms were sorted into a random order that was kept constant for all forms of index display, to ensure similar positioning of target locators regardless of the form of index string; and the heading was displayed separately for each index string, to emphasize clarity over other index string qualities.

The six types of index string tested in Jamieson's study were designed to vary in citation order and in the indication of linktypes by prepositions versus their nonindication by periods; e.g.,

  1. TEACHERS. Academic achievement. Accountability. Schools. Students. Teachers
  2. TEACHERS. Schools
        Accountability. Academic achievement. Students
  3. TEACHERS. Schools
        Accountability for academic achievement of students
  4. TEACHERS in schools
        Accountability for academic achievement of students
  5. TEACHERS. Schools. Students. Academic achievement. Accountability of teachers
  6. TEACHERS. Teachers' accountability for the academic achievement of students in schools

Measures applied to the different systems included: searchers' perception of ease of use; search time; and (for two different levels of relevance judgment) recall, recall divided by search time, irrelevant entries rejected divided by number of irrelevant entries in the index display { 185} ("discrimination"), and discrimination divided by search time. As a result of detailed analysis of results from 120 searches on each of the three queries, Jamieson, with caveats, makes five recommendations:

  1. "index entries should have a meaningful sequence of terms" (the study does not suggest that one meaningful sequence is better than another);
  2. "this sequence probably should be segmented into smaller phrase units, rather than be presented as one long continuous phrase";
  3. linktype-indicative connectives "should be used if possible, particularly if term relationships are not obvious";
  4. "long line-lengths should not be used";
  5. "typographic coding should be considered as an aid for the easy identification of important terms".

If EPSILON and WUSCS were biased against the more sophisticated systems, Jamieson's study appears to be slanted in their favor, both by the design of the index displays and by the choice of queries.

The main limitations of Jamieson's work are: the use of only three queries, which are treated as separate case studies; and the slightly artificial design of the index displays used.
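
The quantitative measures applied in Jamieson's study (recall, discrimination, and each of these divided by search time) can be computed as in the following Python sketch. The function name, the dictionary keys, and the sample counts are assumptions for illustration; only the figure of five relevant entries per query comes from the description of the study.

```python
def search_measures(relevant_found, relevant_total,
                    irrelevant_rejected, irrelevant_total, search_time):
    """Recall, discrimination, and each divided by search time, for one search."""
    recall = relevant_found / relevant_total
    discrimination = irrelevant_rejected / irrelevant_total
    return {
        "recall": recall,
        "recall_per_time": recall / search_time,
        "discrimination": discrimination,
        "discrimination_per_time": discrimination / search_time,
    }


# Hypothetical search: 4 of the 5 relevant entries found, 18 of 20 irrelevant
# entries rejected, in 2 minutes.
result = search_measures(4, 5, 18, 20, 2.0)
```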

Chapter 8 Summary

The selector or designer of a string indexing system needs to consider how well it fits into the environment for which it is being selected or designed. Aspects of the environment to be considered are the collection to be indexed, the searchers, and the resources available to produce the index. The main things to know about the collection are: the kinds of subjects with which it deals, its size, its rate of growth, its language and formats, the amount of systematic overlap among indexed items, and the access tools already provided. Information needed on searchers includes: their preferences in index displays, the kinds of searches that they will be doing, their experience, the speed with which they need information, and the cost of their time. Resources to be determined include funding, personnel, hardware, and information sources.

In selecting among string indexing systems, one should look at general characteristics, input requirements, the kind of database produced, software and hardware, and the kind of output to be expected. General characteristics include documentation, proof of performance, and costs. The main quality to look for in input is ease for the indexer. Questions to ask about the database created should cover its specificity, vocabulary control, and its accessibility { 186} for both searching and modification. Software should be available, adaptable, and reliable. Processing, especially sorting, is the key consideration in determining the power required of the hardware. Output should be index displays that can be searched efficiently and effectively; output flexibility and compatibility with other searching methods may be important.

A common and effective technique for weighing choices involves construction of a table of choices and features, the rating of each choice on each feature, and the weighting of the features.

Two simple approaches to designing one's own string indexing system make use of existing KWOC and DBMS software. In the KWOC approach, the designer's task is simply to design the form of the input. The simple DBMS approach can be illustrated for DBaseII; in this approach, the designer has considerable control over the types of index strings and can easily manipulate the database for other purposes, but storage and design costs are greater.

A number of evaluations of string indexing systems have been carried out. A major question confronting researchers is what evaluative measure or measures to employ. Four important studies are Campey's INDACS evaluation, Keen's EPSILON project, the Wollongong University Subject Catalogue Study, and Jamieson's study. Campey compared indexing time and processing costs. EPSILON and WUSCS looked mainly at search efficiency and effectiveness and searchers' preferences for indexes to small sample databases. Jamieson experimented with three contrived queries and simulated index displays based on the index to a large database. Important conclusions of these studies are: that time to perform different indexing operations is not additive; that, from the searchers' point of view, detail should not be sacrificed; and that searchers have confidence in controlled, well-formated index displays. No one system is clearly best.
