Subject Analysis and Thesaurus Construction.


Tim Craven, NCB 203, tel. 519-661-2111 ext. 88497.

Calendar description

Expressing what a document is about by selecting a suitable set of index terms or summarizing in an abstract. The structure of indexing languages; the use of controlled vocabulary in print and online sources, and principles of thesaurus construction. Practice in indexing, abstracting, and thesaurus construction.

Course documentation

General documentation available to everyone will be posted on the course's public Web site


7 short reports of your choice from among the 12 weekly assignments listed, at 9% each: 63%
Preliminary facet question for thesaurus (due March 12): 1%
Final thesaurus construction assignment: 18%
Participation in online discussion lists: 18%
Total: 100%

Grades on the traditional 15-point scale (-2 to 12, or F to A+) will be assigned to each of the components of course work worth more than 1% (a pass/fail type grade will be assigned to the preliminary facet question): B- (7) for satisfactory work; grades in the range of B (8) to A+ (12) for various degrees above the satisfactory level; grades in the range F (-2) to C+ (6) for various degrees below the satisfactory level.

For the final mark, an appropriately weighted average of the grades will be computed, and the result will then be translated into a percentage to meet the requirements of the Faculty of Graduate Studies.
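As a rough illustration of the weighted average described above, the following Python sketch combines grades on the -2 to 12 scale using the weights from the grading breakdown. The function name weighted_grade and the sample grades are hypothetical. The sketch also assumes that the pass/fail preliminary facet question (1%) is left out of the average, which this outline does not specify, and it does not attempt the translation into a percentage, since that conversion is not given here.

```python
# Weights (percent of final mark) from the grading breakdown in this outline.
REPORT_WEIGHT = 9          # each of the 7 short reports
THESAURUS_WEIGHT = 18      # final thesaurus construction assignment
PARTICIPATION_WEIGHT = 18  # participation in online discussion lists


def weighted_grade(report_grades, thesaurus_grade, participation_grade):
    """Weighted average on the -2..12 scale over the graded components only.

    Assumption (not stated in the outline): the pass/fail preliminary facet
    question (1%) is excluded, so the weights used here sum to 99 and the
    average is normalized by that total.
    """
    if len(report_grades) != 7:
        raise ValueError("expected exactly 7 short-report grades")
    weighted_sum = (
        REPORT_WEIGHT * sum(report_grades)
        + THESAURUS_WEIGHT * thesaurus_grade
        + PARTICIPATION_WEIGHT * participation_grade
    )
    total_weight = (
        REPORT_WEIGHT * len(report_grades)
        + THESAURUS_WEIGHT
        + PARTICIPATION_WEIGHT
    )
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical grades, for illustration only.
    print(round(weighted_grade([8, 8, 9, 7, 10, 8, 9], 9, 8), 2))
```

A student whose work is graded B- (7) on every component would end up with a weighted average of 7 under this sketch, whatever the weights, which is a quick way to check the arithmetic.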

Short reports

A short report is not normally intended to be a summary of everything that you have read or found out about the week's topic. Instead, it lays the ground for online discussion and provides a selective sample of your work for evaluation and feedback. What is actually expected is indicated in the "Weekly assignments and preparation" file [677ass.htm], together with any additional special files. Many assignments will emphasize thinking and problem solving rather than the gathering of information from secondary sources. Problems posed will not always have a single correct solution, though some approaches will be more correct than others.

The assignment will form the basis for evaluating each report. Criteria used in evaluating all reports will be correctness, clarity of thought and presentation, and coverage of all parts of the assignment. Where the assignment requests a more extended discussion, weight will also be given to the number of significant points made within the length indicated. You are reminded that intentional duplication of another student's work constitutes a violation of the policy on plagiarism.

"Plagiarism: Students must write their essays and assignments in their own words. Whenever students take an idea or a passage from another author, they must acknowledge their debt both by using quotation marks where appropriate and by proper referencing such as footnotes or citations. Plagiarism is a major academic offence (see Scholastic Offence Policy in the Western Academic Calendar)."

Reports should normally be e-mailed to the instructor, following the guidelines given at

Short reports will be evaluated using a marking scheme related to the wording of the particular assignment.

Confidential feedback will be provided by e-mail on each report. This will always include the grade assigned. Additional feedback will usually take the form of suggestions for improvement where applicable.

Thesaurus construction project

For this project, you are to create a controlled vocabulary in the form of a microthesaurus (giving 50 to 75 preferred terms) for one facet of the subject matter of newspaper articles such as those appearing in Western News. Details are provided in the files 677the.htm and 677theqa.htm. A preliminary wording of the facet question is due March 12.

The full project report is due in the last week of the course (April 9).


Instructor to students. In addition to the confidential feedback on each report, the course discussion boards will provide considerable general feedback. Additional individual feedback is available on request.

Students to professor. E-mail or telephone me directly. If you are on campus, you may also drop by my office during my office hours (about 8:45 am to 11:45 am, Monday to Friday, subject to other commitments and to weather).


  1. January 9. Introduction. What are indexes? How are they related to information retrieval and library cataloguing?
  2. January 16. Discussion of your efforts at book indexing. Types of decision in indexing. Principles of indexing. Special features of book indexing.
  3. January 23. Illustrating some basic terms in indexing and thesaurus construction. Reading for indexing.
  4. January 30. Experiences with indexing using the ERIC thesaurus. Sources of terms for thesaurus construction.
  5. February 6. Usefulness of automatic term extraction for indexing and thesaurus construction. Introduction to faceting.
  6. February 13. Discussion of facet analysis.
  7. February 20. Discussion of term-standardization exercise.
  8. [Research Week.]
  9. March 5. Semantic relations exercise. Discussion of TheW32 and other thesaurus construction software.
  10. March 12. (Preliminary facet questions due.) Experiences searching with controlled and uncontrolled vocabulary.
  11. March 19. Web indexes produced using XRefHT32.
  12. March 26. What makes a good index entry? Discussion of SKY Index.
  13. April 2. Discussion of abstracting according to the ANSI-NISO standard.
  14. April 9. (Thesaurus construction project reports due.) Features of existing indexing and abstracting products.


