LIS 677 - Faceting introduction

First, this page will list reasons why faceting might be useful and some guidelines ("canons") for defining good facets. Next, it will give a brief example of applying faceting to a set of simple document descriptions. Finally, it will tie individual reasons and guidelines to the example.

Some reasons for a faceted approach

  1. Guidelines on what information is included in or excluded from item descriptions
  2. A basis for rules for connectives and citation order for precoordination and display
  3. Increasing precision in postcoordinate searching
  4. An aid in thesaurus construction

Canons for characteristics

(Applicable to facets.)

  1. Differentiation
  2. Non-concomitance
  3. Relevance
  4. Ascertainability
  5. Permanence

A sample of facet analysis

Descriptions

  1. Transportation of grapefruit to Canada from Mexico
  2. Marketing of fish from Canada in the USA
  3. Pricing of shrimp from Bangladesh
  4. Retailing of fresh vegetables in Canada
  5. Wholesale pricing in the USA
  6. Exports from the USA to Canada

Facets

Mnemonic        Brief question
[Commodity]     With what specific commodity or commodities is the item chiefly concerned?
{Process}       What process or processes undergone by the specific commodity or by commodities in general are dealt with in the item?
/Source\        From what specific country does the item indicate a commodity or commodities as originating, especially before undergoing any named processes?
(Destination)   In what specific country does the item indicate a commodity or commodities as ending up, especially after undergoing any named processes?

Descriptions with facet coding

  1. {Transportation} of [grapefruit] to (Canada) from /Mexico\
  2. {Marketing} of [fish] from /Canada\ in the (USA)
  3. {Pricing} of [shrimp] from /Bangladesh\
  4. {Retailing} of [fresh vegetables] in (Canada)
  5. {Wholesale pricing} in the (USA)
  6. {Exports} from the /USA\ to (Canada)

For a longer example, see the file 677facex.htm.
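
The bracket notation used in the coded descriptions can also be read mechanically. The following is a minimal Python sketch, assuming the four delimiter pairs from the facet table; the regular expressions, the facet key names, and the function name `parse_coded` are illustrative assumptions, not part of the course material.

```python
import re

# Delimiter pairs follow the mnemonic table on this page; the patterns
# themselves are an illustrative assumption.
FACET_PATTERNS = {
    "commodity": re.compile(r"\[([^\]]+)\]"),    # [term]
    "process": re.compile(r"\{([^}]+)\}"),       # {term}
    "source": re.compile(r"/([^/\\]+)\\"),       # /term\
    "destination": re.compile(r"\(([^)]+)\)"),   # (term)
}

def parse_coded(description):
    """Return a dict mapping each facet name to the list of terms found."""
    return {facet: pattern.findall(description)
            for facet, pattern in FACET_PATTERNS.items()}

# The six coded descriptions from this page (backslashes escaped for Python).
CODED = [
    "{Transportation} of [grapefruit] to (Canada) from /Mexico\\",
    "{Marketing} of [fish] from /Canada\\ in the (USA)",
    "{Pricing} of [shrimp] from /Bangladesh\\",
    "{Retailing} of [fresh vegetables] in (Canada)",
    "{Wholesale pricing} in the (USA)",
    "{Exports} from the /USA\\ to (Canada)",
]

if __name__ == "__main__":
    for number, text in enumerate(CODED, start=1):
        print(number, parse_coded(text))
```

A facet that a description does not mention simply comes back as an empty list, which mirrors the way descriptions 3 to 6 leave some facets unfilled.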

Illustrating reasons for a faceted approach

Guidelines on what information is included in or excluded from item descriptions

The set of facets reminds indexers to look for commodities, processes applied to commodities, and geographical sources and destinations of commodities in all the documents that they index. Conversely, indexers are allowed to ignore other questions about the subject matter, such as intermediate geographical locations; for example, if an item discusses transportation of goods from Mexico to Canada through the United States, the indexer can ignore the concept "United States".

A basis for rules for connectives and citation order for precoordination and display

For example, a rule could be made in this case to cite all subjects in the form
Commodity from Source - Process - Destination
This would give standardized compound subject headings or subject descriptions such as "Grapefruit from Mexico - Transportation - Canada" for description #1.
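
A citation-order rule of this kind is easy to apply mechanically once each description has been broken into facets. The sketch below is one possible reading of the rule, in Python. How to handle a missing facet is not stated on this page; dropping the missing part, capitalizing the first letter, and the names `RECORDS` and `heading` are all assumptions made for illustration.

```python
# The six sample descriptions, keyed by facet. Facets that a description
# does not mention are simply left out of its record.
RECORDS = [
    {"commodity": "grapefruit", "process": "Transportation",
     "source": "Mexico", "destination": "Canada"},
    {"commodity": "fish", "process": "Marketing",
     "source": "Canada", "destination": "USA"},
    {"commodity": "shrimp", "process": "Pricing", "source": "Bangladesh"},
    {"commodity": "fresh vegetables", "process": "Retailing",
     "destination": "Canada"},
    {"process": "Wholesale pricing", "destination": "USA"},
    {"process": "Exports", "source": "USA", "destination": "Canada"},
]

def heading(record):
    """Build 'Commodity from Source - Process - Destination'.

    Assumption: any facet absent from the record is omitted from the heading.
    """
    first = record.get("commodity", "")
    if record.get("source"):
        first = f"{first} from {record['source']}".strip()
    first = first[:1].upper() + first[1:]  # capitalize the leading word only
    parts = [first, record.get("process", ""), record.get("destination", "")]
    return " - ".join(part for part in parts if part)

if __name__ == "__main__":
    for record in RECORDS:
        print(heading(record))
```

Under these assumptions, every heading lists its facets in the same order, so headings file together by commodity first and source second.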

Increasing precision in postcoordinate searching

If "source" terms are distinguished from "destination" terms in a retrieval system (by means of role operators or subheadings, or by assigning different facets to different subfields), searchers can then specify, for instance, that they want "Canada" only as a destination, thus excluding items, like #2, where Canada is the source.
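
The gain in precision can be seen by comparing a search on the bare term with a search on the term in a given role. This is a minimal Python sketch over the six sample descriptions; the record layout and the function names `search_any` and `search_role` are hypothetical and do not describe any particular retrieval system.

```python
# The six sample descriptions, numbered as on this page and keyed by facet.
RECORDS = {
    1: {"commodity": "grapefruit", "process": "Transportation",
        "source": "Mexico", "destination": "Canada"},
    2: {"commodity": "fish", "process": "Marketing",
        "source": "Canada", "destination": "USA"},
    3: {"commodity": "shrimp", "process": "Pricing", "source": "Bangladesh"},
    4: {"commodity": "fresh vegetables", "process": "Retailing",
        "destination": "Canada"},
    5: {"process": "Wholesale pricing", "destination": "USA"},
    6: {"process": "Exports", "source": "USA", "destination": "Canada"},
}

def search_any(term):
    """Item numbers where the term appears in any facet (no role distinction)."""
    return sorted(n for n, rec in RECORDS.items() if term in rec.values())

def search_role(facet, term):
    """Item numbers where the term appears in the named facet only."""
    return sorted(n for n, rec in RECORDS.items() if rec.get(facet) == term)

if __name__ == "__main__":
    print(search_any("Canada"))                  # includes item 2
    print(search_role("destination", "Canada"))  # item 2 is excluded
```

Searching on the bare term retrieves items 1, 2, 4, and 6, while restricting "Canada" to the destination role retrieves only items 1, 4, and 6.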

An aid in thesaurus construction

This is the most important reason for looking at faceting in this course. The actual course project calls for defining a facet relevant to articles in Western News. If, however, you were asked to develop a thesaurus relating to international trade, defining a set of facets like commodity, process, source, and destination above could be a useful way to decide which kinds of terms to start collecting and what broad categories could be used to begin organizing them. In this case, you will notice that a single geographical category applies to two of the suggested facets, so the same set of possible values could be used for both.

Illustrating canons

Differentiation

Suppose almost all the items to be indexed involved internal trade within a single country. Having facets for source and destination countries would then be relatively useless, since the answers would almost always be the same. (Perhaps, in such a case, the geographical facets could be modified to refer to smaller geographical regions.)

Non-concomitance

Suppose that the items did involve many different countries but almost always referred to domestic commerce, so that source and destination were nearly always the same country. In that case, having separate facets for source and destination country would be wasteful, since a single facet for country would suffice.
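
One rough way to test a pair of facets for concomitance is to count how often their values coincide in a sample of indexed items. The Python sketch below assumes that approach; the function name `concomitance`, the `DOMESTIC` sample, and the idea of using a simple proportion are all hypothetical illustrations, not a method prescribed by the canons.

```python
def concomitance(records, facet_a, facet_b):
    """Fraction of records, among those carrying both facets, whose values coincide."""
    both = [r for r in records if facet_a in r and facet_b in r]
    if not both:
        return 0.0
    same = sum(1 for r in both if r[facet_a] == r[facet_b])
    return same / len(both)

# Hypothetical collection dominated by domestic commerce (invented for illustration).
DOMESTIC = [
    {"source": "Canada", "destination": "Canada"},
    {"source": "Mexico", "destination": "Mexico"},
    {"source": "USA", "destination": "USA"},
    {"source": "USA", "destination": "Canada"},
]

# Source and destination facets of the six sample descriptions on this page.
SAMPLE = [
    {"source": "Mexico", "destination": "Canada"},
    {"source": "Canada", "destination": "USA"},
    {"source": "Bangladesh"},
    {"destination": "Canada"},
    {"destination": "USA"},
    {"source": "USA", "destination": "Canada"},
]

if __name__ == "__main__":
    print(concomitance(DOMESTIC, "source", "destination"))
    print(concomitance(SAMPLE, "source", "destination"))
```

A proportion near 1, as in the hypothetical domestic collection, suggests the two facets could be merged; the sample descriptions on this page never share a source and destination, so keeping both facets is justified there.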

Relevance

If the emphasis were strongly on the retail side, and retailers and consumers were not generally concerned with what part of the world their purchases originated in (not always true, of course), a source facet might not have sufficient relevance to be included. Conversely, if the emphasis were strongly on the production side, destination might be relatively irrelevant.

Ascertainability

Maybe searchers would like to select items based on the total cash value of the trade involved, as in "What articles discuss imports from Iceland last year valued at more than $10,000?" If the articles generally do not give that information, however, how much effort should indexers be expected to exert inferring it from other sources? Having a facet for cash value may not be worthwhile.

Permanence

Suppose there were a facet for cash value and indexers were expected to supply the correct answer by looking at the latest trade statistics. Later, the trade statistics might be updated. Next year, they might be different. Should the indexing be updated to conform?

Last updated September 28, 2004.
This page maintained by Prof. Tim Craven
E-mail (text/plain only): craven@uwo.ca
Faculty of Information and Media Studies
University of Western Ontario,
London, Ontario
Canada, N6A 5B7