TheW32


What is TheW32?

TheW32 is a 32-bit Microsoft Windows program for creating and maintaining a thesaurus. It is similar to the earlier 16-bit TheW thesaurus program.

"TheW" is pronounced like "thew" (meaning tendon, muscle, strength, virtue, or custom).


Installing TheW32

If you have acquired TheW32 in the self-extracting archive Thew33.exe, just run Thew33.exe to install TheW32 and the ActiveX controls that some of its functions require. An icon for TheW32 will be added to your Start menu.

If you have a copy of the TheW32 program but do not install it - for example, if you have access to it over a network - you can run it, but one or more of the ActiveX controls may not be available.


Running TheW32

To start TheW32, run the Thew.exe file or double-click on the "TheW" icon.

To open a particular thesaurus automatically, run the Thew.exe file with the thesaurus' pathname as its first parameter; e.g., thew.exe mythesau.the.
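
If you want to start TheW32 from a script, the following Python sketch builds the command line described above. The helper names build_thew_command and launch_thew are hypothetical, and the sketch assumes that thew.exe can be found on the search path.

```python
import subprocess


def build_thew_command(thesaurus_path=None, exe="thew.exe"):
    """Return the argument list for starting TheW32 (hypothetical helper)."""
    command = [exe]
    if thesaurus_path:
        # The thesaurus pathname is passed as the first parameter.
        command.append(thesaurus_path)
    return command


def launch_thew(thesaurus_path=None, exe="thew.exe"):
    """Start TheW32 without waiting for it to exit."""
    return subprocess.Popen(build_thew_command(thesaurus_path, exe))
```

For example, build_thew_command("mythesau.the") gives the same command as the thew.exe mythesau.the example above.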

To end a TheW32 session, click on the "X" button on the main TheW32 window.


The TheW32 interface

TheW32 generally uses typical 32-bit Microsoft Windows interface elements, supporting keyboard and mouse, menus, and accelerator keys. A few functions require the use of the mouse or other pointing device. To close a window in TheW32, if not otherwise indicated, click on its "X" button.

The Thesaurus window

When you begin TheW32, no thesaurus is open and most of the menu items are grayed. Once you have opened a new or existing thesaurus, all the menu items will be enabled (unless the thesaurus is read-only), and you will see portions of the thesaurus displayed in the Thesaurus window.

At the top left is an edit control containing the current heading term. This control can also be used temporarily for terms to be added or searched for; edit the contents and press Enter to add a new term or search for an existing term. To the right is a spin edit control for the current linktype. You can use this control to change the current linktype. From the heading edit control, you can also set the linktype to any value from 0 to 9 by holding down the Ctrl key and pressing the corresponding number key. Another shortcut is to select the spin edit control and press the key for the first letter of the corresponding mnemonic.

To the right of the spin edit control is an edit control showing the mnemonic for the current linktype; you can use this edit control to change the mnemonic if you wish.

Below to the left is a grid showing the references for the current heading term; to the right is a list box showing the terms in the thesaurus; between these is a button that you can use for making and breaking links between the current heading term and other terms in the thesaurus.

From the Thesaurus window, use the menu or any associated accelerator keys to access the main thesaurus functions. To adjust the relative sizes of the references grid and the terms list, use the mouse to drag the splitter to the right of the terms list.


File menu

New. Allows you to create a new thesaurus. When TheW32 creates a new thesaurus, it initially makes it a copy of the template file Template.the, if the template file is available.

Open. Allows you to open an existing *.the thesaurus, or to create a thesaurus from a *.txt text file in the export/import format described below. If you choose the latter, you will be prompted for a field delimiter and for a *.the file containing linktypes and other settings to use. If you click on "Cancel" in response to this settings file dialog, TheW32 will just create a linktype for each mnemonic in the order in which it is encountered in the text file. Any records in the *.txt file that appear out of order will be ignored.

Export terms and links, Import terms and links. The default format used for import and export is

    TERM1!link type mnemonic!term2

For example,

    CATS!BT!ANIMALS

This format is compatible with that used by the THSRS thesaurus construction package. TheW32 also allows a single-character delimiter other than "!" to be specified.

The import and export format does not include linktype definitions. If TheW32 is importing terms and links and encounters a mnemonic that does not belong to a linktype already defined in the current thesaurus, it will prompt you to define a new linktype.
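
If you need to prepare or check an export/import file outside TheW32, a short script can read and write the three-field format. The following Python sketch is illustrative only: the names Link, parse_links, and format_link are hypothetical, and skipping lines that do not have three fields is an assumption of the sketch, not documented TheW32 behavior.

```python
from typing import Iterable, Iterator, NamedTuple


class Link(NamedTuple):
    term: str
    mnemonic: str
    target: str


def parse_links(lines: Iterable[str], delimiter: str = "!") -> Iterator[Link]:
    """Yield one Link for each TERM1<delim>mnemonic<delim>term2 line."""
    if len(delimiter) != 1:
        raise ValueError("delimiter must be a single character")
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        # Split on the first two delimiters only, so that the third
        # field may itself contain the delimiter character.
        parts = line.split(delimiter, 2)
        if len(parts) != 3:
            continue  # assumption: silently skip malformed lines
        yield Link(*parts)


def format_link(link: Link, delimiter: str = "!") -> str:
    """Return the export line for one link."""
    return delimiter.join(link)
```

The delimiter argument corresponds to the single-character delimiter that you can specify in place of "!".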

Convert from....

Tabbed text format. Allows you to import data from a plain text file, with tabs used for indentation, like the ones that can be created from the RTF preview of this package.

TGM format.... Allows you to import data from a file similar to the text version of the Library of Congress Thesaurus for Graphic Materials. A text file in this format is double-spaced, with each heading, except the first, preceded by two additional empty lines; the mnemonic for each link under a heading begins at the start of a line; enough blanks then follow to make up a total of 10 before the beginning of the linked term; lines other than the first line of the heading that do not contain a sequence of two or more blanks are continuations of the lines that precede them.

If TheW32 is converting a file and encounters a mnemonic that does not belong to a linktype already defined in the current thesaurus, it will prompt you to define a new linktype.

Note that converting a large thesaurus file will take a considerable amount of time. To create a complete thesaurus from existing data, using the "Open" function on a properly formatted text file is likely to be much quicker.

Create template. Allows you to create a new thesaurus with the same linktype definitions and report formats as the current thesaurus, but with no terms or links.

Load settings.... Allows you to load linktype and report definitions from another TheW32 thesaurus, even if you do not have write access to that thesaurus. This may be handy for restoring the default settings from Template.the. Linktype and report definitions with numbers and names not used in the other thesaurus will be retained in your thesaurus.

Pack. Restructures the thesaurus file to make it more compact and improve access time. When packing is complete, a display of before-and-after statistics will appear. Errors in the file may also be noted. To hide this display, click on its "Close" button.

Save As.... Allows you to make an exact copy of the thesaurus file under another name, usually for backup purposes.

Export to XML.... Calls up a dialog allowing you to create an XML file from the current thesaurus in VOCML-compatible form. The dialog allows you, if you wish, to specify the location of the DTD, the workNum and displayTitle values in the header, the conversion rules, and the output file. The part of a conversion rule before the colon is the linktype number, and the part after the colon is either the VOCML tag to be applied or, in the case of exclude, an instruction not to create an SVTerm element for a term that has a link of that type to any other term. No checking will be applied to modified conversion rules, but tags other than the following will be ignored: alt, child, cla, definition, misc, note, parent, relatedTerm, syn, typedRelation.
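
Because modified conversion rules are not checked by TheW32, you may wish to check them yourself before exporting. The following Python sketch reads a rule as a linktype number, a colon, and a tag, as described above. The function name parse_conversion_rule is hypothetical, and the exact text form of a rule (for example "4:child") is inferred from the description rather than taken from the dialog.

```python
# Tags that are honored; any other tag is ignored on export.
VOCML_TAGS = frozenset({
    "alt", "child", "cla", "definition", "misc", "note",
    "parent", "relatedTerm", "syn", "typedRelation",
})


def parse_conversion_rule(rule):
    """Return (linktype number, action), or None if the rule would be ignored."""
    number, colon, action = rule.partition(":")
    number, action = number.strip(), action.strip()
    if not colon or not (number.isascii() and number.isdigit()):
        return None
    # "exclude" is an instruction rather than a VOCML tag.
    if action != "exclude" and action not in VOCML_TAGS:
        return None
    return int(number), action
```

A result of None means that the rule is malformed or uses a tag outside the list above.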

Export to JavaScript.... Calls up a dialog allowing you to create a JavaScript version of the thesaurus. The script will assign the linktype mnemonics to an array called mnemonics, the terms to an array called terms, and term references and notes in a compacted form to an array called links. In each string in links, $ is used to introduce a linktype number and # is used to introduce either a literal note or a term reference number; for example, "$1#Alphabetic characters$4#4$6#16#17" (corresponding to a term LETTERS) includes one literal scope note (linktype 1), an NT reference (linktype 4) to term 4 (CHARACTERS), and RT references (linktype 6) to terms 16 and 17 (TYPES and WRITING). For a simple example of an HTML file that uses a JavaScript file exported by TheW32 see Thewjscr.htm.
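
To show how a string in links can be unpacked, here is a minimal Python sketch. The function name decode_links is hypothetical. The sketch assumes that "$" and "#" never occur inside a literal note, since no escaping rule is described here, and it leaves every value as a string because only the linktype tells you whether a value is a note or a term reference number.

```python
def decode_links(encoded):
    """Map each linktype number to its list of raw values (strings)."""
    result = {}
    # The text before the first "$" is empty, so skip it.
    for chunk in encoded.split("$")[1:]:
        number, *values = chunk.split("#")
        result.setdefault(int(number), []).extend(values)
    return result
```

Applied to the example string for LETTERS, this gives one value for linktype 1, one for linktype 4, and two for linktype 6.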

Print options. Allows you to set options for previewing or printing a formatted version of the thesaurus.

When you select this option, the "Options" dialog appears.


The "Options" dialog

Use the "Name:" edit control to change the title of the thesaurus that will appear on printed reports.

Use the "From:" and "To:" boxes to specify a subrange of headings to include in the report. The default is A-Z, that is, all headings beginning with a letter of the alphabet, but much narrower ranges are recommended for RTF previews of larger thesauri.

If you have saved report formats, select the one that you want from the drop-down list under "Report format". (You can also select the report format in the RTF preview window.) If the list is empty, you have no report formats saved; in this case, by default the report will be just a list of terms. Click on "Modify" to change the current report format definition.

For HTML format, the default "Split on" setting of 0 means that a single file will be created. If you select a higher number (from 1 to 7), multiple files will be created, with the main file being a table of contents. Names of the additional files will be derived automatically from that of the main file. If necessary, the name assigned to the main file will be truncated to allow the additional file names to be no more than 8 characters in length.

Use the "Pathname" box to specify a pathname other than the default "out.htm" for HTML output.

If you are creating multiple files automatically, avoid filenames that might cause needed files to be overwritten. For example, if you are creating multiple files in the TheW32 folder, do not choose a filename beginning with "the".

Use the "Use target list" box to specify the pathname of a file to be used to create additional links from terms in the HTML page(s). Each line in the file should consist of a URL enclosed in double quote marks, a comma, and a term enclosed in double quote marks; for example,


    "alpha.htm#CATS","CATS"

Use the "Create target list" box to specify the pathname of a file in which to store a target list for named anchors in the HTML file.

If you invoke the two target list options in the sequence "Create - Use" you can produce automatic linking between instances of the same term in two different displays of the same thesaurus; for example, from a term in a hierarchical display to more detailed information in the corresponding alphabetical display.

You can employ the "Use target list" option alone to insert other kinds of links; for example, to entries in an index.

The target list format is compatible with that used by XRefHT32.
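
If you want to build or inspect a target list with a script, the quoted, comma-separated lines can be read with a standard CSV reader. The following Python sketch is illustrative; the function names read_target_list and format_target are hypothetical. It keeps only the last URL seen for a term, which is a simplification of the sketch.

```python
import csv
import io


def read_target_list(text):
    """Return {term: url} from lines of the form "url","term"."""
    targets = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue  # ignore blank or malformed lines
        url, term = row
        targets[term] = url
    return targets


def format_target(url, term):
    """Return one target list line for a URL and a term."""
    return f'"{url}","{term}"'
```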

For printing, use the spin edits to set the sizes of the margins as percentages of the page width and height. Note that, if you set margins too large, some or all of the information may be omitted from the printout.


RTF preview. Displays your formatted thesaurus in a Rich Text Format editing window. In this window, you can select the report format, perform minor editing on the output, print it, or save it in plain text or simple RTF format. In the saved format, the Tab character is used to indicate indentations.

Summary. Displays a brief statistical summary in the Rich Text Format editing window. Statistics include the total number of terms, the number of preferred terms (based on excluding any term linked to another term via the keyword linktype or via a linktype weighted at or above the standardization threshold), term length statistics in characters (mean, standard deviation, minimum, maximum), and total uses of each linktype used.
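
The term length statistics can be reproduced approximately outside TheW32 from a list of terms. The following Python sketch is an illustration only: the function name term_length_stats is hypothetical, and it uses the population standard deviation, since this documentation does not say whether TheW32 uses the population or the sample form.

```python
import statistics


def term_length_stats(terms):
    """Return length statistics, in characters, for a non-empty list of terms."""
    lengths = [len(term) for term in terms]
    return {
        "count": len(lengths),
        "mean": statistics.mean(lengths),
        # Assumption: population standard deviation.
        "stdev": statistics.pstdev(lengths),
        "min": min(lengths),
        "max": max(lengths),
    }
```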

HTML. Saves your formatted thesaurus as one or more HTML files and displays it using whatever application is associated with .htm files on your workstation.

Print.... Allows you to print your formatted thesaurus using any of the printer drivers installed on your workstation.


Edit menu

Copy term. Puts a copy of the currently focused term into the Windows clipboard.

Delete. Effect varies slightly depending on what is in focus: if the references grid is in focus, the option removes the selected reference and any reciprocal; if the terms list box is in focus, the option removes the selected term from the thesaurus.

Add reference/note(s). Allows you to add references and scope notes and link either old or new terms. When you select this option, an "Adding references/notes" dialog appears. By default, the first term in this dialog is the current heading and the linktype is set to match the linktype spin edit in the thesaurus window. If you click on "OK", any reference or note that you specified is added, and you are prompted for another. Press Esc or click on "Cancel" to return to the thesaurus window.

Clip. Allows you to put a copy of the currently focused term, or another term that you type in, into the thesaurus clipboard.

Add term(s). Allows you to add one or more new terms to the thesaurus. When you select this option, an "Add term(s) to thesaurus" dialog appears. If you type a term and press Enter or click on "OK", the term is added to the thesaurus, and you are prompted for another. Press Esc or click on "Cancel" to return to the thesaurus window. (You can also add terms directly in the main window, as noted above.)

Link. Sets the link from the current heading term to the selected term in the terms listbox to the current linktype shown by the linktype spin edit. Clicking on the "<" button to the left of the terms listbox has the same effect. (The menu item or its keyboard shortcut has the advantage that it can be used without leaving the term edit control.)

Note that the default value of the current linktype is 0, signifying no link.

Explode clipboard. Uses existing thesaurus clipboard terms, links in the thesaurus, and linktype weights, to add weighted terms and notes to a list called the "explosion" and then displays the "Explosion" window. If a term or note is assigned a weight less than the explosion threshold, it is not added to the explosion.
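
The exact way in which TheW32 combines weights is not described here. Purely as an illustration of the idea of a weighted explosion with a threshold, the following Python sketch assumes that each clipboard term starts at weight 100, that following a link multiplies the weight by the linktype weight taken as a percentage, and that the highest weight found for a term is kept. The function name explode and the data layout are hypothetical.

```python
def explode(seed_terms, links, weights, threshold):
    """Return {term: weight} for everything reached at or above the threshold.

    seed_terms: clipboard terms, each starting at weight 100 (assumption).
    links: {term: [(linktype, target), ...]}
    weights: {linktype: weight from 0 to 100}
    """
    explosion = {term: 100.0 for term in seed_terms}
    frontier = list(seed_terms)
    while frontier:
        term = frontier.pop()
        for linktype, target in links.get(term, []):
            # Assumption: weights combine multiplicatively along a link.
            weight = explosion[term] * weights.get(linktype, 0) / 100.0
            if weight < threshold:
                continue  # below the explosion threshold: not added
            if weight > explosion.get(target, 0.0):
                explosion[target] = weight
                frontier.append(target)
    return explosion
```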

Modify reference/note. Allows you to modify a reference or note that you have selected in the references grid. When you select this option, a "Modifying reference/note" dialog appears. If you make a modification and press Enter or click on "OK", the old reference or note is replaced by your modification, and you are returned to the thesaurus window. To return to the thesaurus window without making a modification, press Esc or click on "Cancel".

Duplicate references. Allows you to copy all the references and notes from the current heading term to another term. When you select the option, a "Copy references to" dialog will appear, with the current heading term in an edit box. To copy the references and notes, type in the target term and press Enter or click on "OK". To return to the thesaurus window without copying, press Esc or click on "Cancel".

Modify heading. Allows you to modify the current heading while keeping all its references and notes. When you select the option, a "Change heading to" dialog will appear, with the current heading term in an edit box. To modify the heading, type in the new heading and press Enter or click on "OK". To return to the thesaurus window without modifying, press Esc or click on "Cancel".

Windows clipboard - References. Puts references into the Windows clipboard. Each line of the text that was in the Windows clipboard is looked up as a term in the thesaurus. Each reference to every term is then copied onto a separate line in the Windows clipboard, consisting of the first term, a Tab, the linktype mnemonic, a Tab, and the second term.

Windows clipboard - Enrich weighted stems. Uses the thesaurus to enrich a list of weighted stems stored on the Windows clipboard. Each line in the list consists of a weight (typically from 0 to 100), a Tab, and a stem; for example,

100	PLANT
50	TREE
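
If you prepare a weighted stem list with a script before placing it on the Windows clipboard, the line format can be checked as in the following Python sketch. The function name parse_weighted_stems is hypothetical, and skipping lines that lack a Tab or a numeric weight is an assumption of the sketch.

```python
def parse_weighted_stems(text):
    """Return a list of (weight, stem) pairs from weight<Tab>stem lines."""
    stems = []
    for line in text.splitlines():
        weight, tab, stem = line.partition("\t")
        weight = weight.strip()
        if not tab or not (weight.isascii() and weight.isdigit()):
            continue  # assumption: skip blank or malformed lines
        stems.append((int(weight), stem))
    return stems
```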

Windows clipboard - Standardize. Uses the thesaurus to standardize the vocabulary of the text stored on the Windows clipboard. For this function, any link with a weight equal to or greater than the standardization threshold is assumed to point from a nonpreferred term to a preferred term. The default value of the standardization threshold is 95.
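
The role of the standardization threshold can be sketched as follows in Python. This is an illustration under stated assumptions, not TheW32's own code: the function names are hypothetical, links are represented as (term, linktype number, target) triples, and the sketch maps whole terms only, leaving aside how TheW32 matches words and phrases within a text.

```python
def preferred_term_map(links, weights, threshold=95):
    """Return {nonpreferred term: preferred term}.

    links: iterable of (term, linktype, target) triples.
    weights: {linktype: weight from 0 to 100}
    """
    mapping = {}
    for term, linktype, target in links:
        # A link weighted at or above the threshold is taken to point
        # from a nonpreferred term to a preferred term.
        if weights.get(linktype, 0) >= threshold:
            mapping[term] = target
    return mapping


def standardize_term(term, mapping):
    """Return the preferred form of a term, or the term itself."""
    return mapping.get(term, term)
```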

Windows clipboard - Suggest terms. Uses the thesaurus to change the contents of the Windows clipboard from a free text to a suggested list of preferred terms from the thesaurus. Each term is included on a separate line followed by a tab and the number of occurrences of words or phrases in the text that mapped to the term. For example,

TERMS	4

Windows clipboard - Links. The uninverted form of "Windows clipboard - References". Each line of the text that was in the Windows clipboard is looked up as a term in the thesaurus. Each reference from every term is then copied onto a separate line in the Windows clipboard, consisting of the first term, a Tab, the linktype mnemonic, a Tab, and the second term.


View menu

Term as heading. Sets the current heading to the term selected in the term list box and then focuses the heading edit box.

Clipboard. Displays the TheW32 "Clipboard" window. In this window, you can view the contents of the TheW32 clipboard, delete terms from it, insert terms into it, add a term from it to the thesaurus, copy it to the Windows clipboard, or reset its contents from the Windows clipboard, one term for each line.

Terms. Focuses the terms list box.

References for term. Sets the current heading to the term selected in the term list box and then focuses the references grid.

Double-clicking (or pressing Enter) on a term in the terms list box has the same effect.

(Double-clicking or pressing Enter on a term in the references grid sets the current heading to that term if it is a heading in the thesaurus.)

Explosion. Displays the "Explosion" window.

Linktype. Focuses the linktype spin edit.

Seek term. Allows you to find a term in the terms listbox. When you select this option, a "Seek term" dialog appears. To find a term, type it in and press Enter or click on "OK". To leave the terms listbox as it was, press Esc or click on "Cancel".

When the terms listbox is focused, you can also move to a particular position in the alphabet by pressing a letter key.

Find string. Allows you to find the next term in the terms listbox that contains a particular sequence of characters. When you select this option, a "Find string" dialog appears. To find the term, type in the character sequence (case does not matter) and press Enter or click on "OK". To leave the terms listbox as it was, press Esc or click on "Cancel".


Graphics menu

Term, Explosion. Both these options display a "Graphic display" window based on the contents of the explosion. The difference is that the "Explosion" option simply takes the existing contents of the explosion, whereas the "Term" option first creates a new explosion based on the term that was focused.

Use the menus or their associated accelerator keys for the main functions in the "Graphic display" window.

The current box is identified by its distinctive color and by the handle at its bottom right corner. To set the current box, click the mouse or press the Tab key. To adjust the size of the current box, use the mouse to drag the handle. You can also drag the current box itself to a different position in the display to improve readability.

To close the "Graphic display" window, click on its "X" button. When you close the window, if you have selected a thesaurus heading term it will be made into the current heading.


Other menu

Explosion threshold, Standardization threshold. Allow you to reset the explosion threshold and standardization threshold respectively. When you select one of these options, an "Explosion threshold" or "Standardization threshold" dialog appears, showing the current threshold value. To change the value, type in the new value (a number from 0 to 100) and press Enter or click on "OK".

Standardize plurals separately. Checking this item changes standardization for the thesaurus so that final S is no longer separated from words or phrases for purposes of matching.

Define linktype. Allows you to change the definition of the current linktype indicated in the linktype spin edit. When you select this option, an "Editing linktype" dialog is displayed.

A link type definition contains a link type number, a mnemonic, a weight, and an arrow style. In addition, the definition of a link type that is not a compound reciprocal contains a reciprocal link type and a compound reciprocal link type. Zero always indicates no link type.

A compound reciprocal link type is used to link terms to another term that is linked, not to them individually, but to a compound term in which they appear separated by "+". For example, suppose that "BOATS" is linked to "TRANSPORTATION+WATER" with a link of type 2 and that the compound reciprocal of link type 2 is 8; then links of type 8 will be made from "TRANSPORTATION" to "BOATS" and from "WATER" to "BOATS".
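
The BOATS example can be expressed as a short Python sketch. The function name compound_reciprocal_links and the representation of links as (term, linktype number, target) triples are hypothetical.

```python
def compound_reciprocal_links(source, linktype, target, compound_reciprocals):
    """Return the links made from each part of a compound target back to source.

    compound_reciprocals: {linktype: compound reciprocal linktype}
    """
    reciprocal = compound_reciprocals.get(linktype, 0)
    # Zero always indicates no link type.
    if reciprocal == 0 or "+" not in target:
        return []
    return [(part, reciprocal, source) for part in target.split("+")]
```

With a link of type 2 from "BOATS" to "TRANSPORTATION+WATER" and a compound reciprocal of 8, the function returns links of type 8 from "TRANSPORTATION" and from "WATER" to "BOATS".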

The file Template.the contains a predefined set of link type definitions.

Keyword link type. Allows you to specify a keyword linktype, a special link type that is used to link keywords, other than the first, to the multiword term in which they occur. If the keyword linktype is 0, no thesaurus entries will be created for keywords. When you select this option, a "Confirm" dialog will appear, asking whether you want to set the keyword linktype to the value in the linktype spin edit.

If a keyword linktype is specified, entries for keywords are added automatically when multiword terms are added. To distinguish them from conceptual terms, keywords added in this way are marked by a single appended special character (ASCII character 0, rendered as  in the main display). Such entries cannot be added or modified by hand and are best left alone. They are not exported when you choose to export terms and links, but are generated automatically when you import terms and links. They are automatically deleted when the corresponding multiword terms are deleted.

Incompatible linktypes. Allows you to modify the set of incompatible linktype pairs. These are pairs of linktypes that are prevented from both being used from the same term. When you select this option, an "Incompatible linktypes" dialog appears showing the current list of incompatible linktype pairs. To delete a pair, click on it and click on "Delete".

To add a pair, set the "1" and "2" spin edit values and click on "Insert".

When you are finished modifying the incompatible linktypes list, click on "Close" to return to the thesaurus window.
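
The effect of an incompatible linktype pair can be sketched as a simple check made before a new link is added from a term. The following Python function is illustrative only; its name and arguments are hypothetical, and it treats each pair as unordered.

```python
def violates_incompatibility(existing_linktypes, new_linktype, incompatible_pairs):
    """Return True if new_linktype may not be used alongside the existing ones.

    existing_linktypes: linktypes already used from the term.
    incompatible_pairs: iterable of (linktype, linktype) pairs.
    """
    pairs = {frozenset(pair) for pair in incompatible_pairs}
    return any(
        frozenset((old, new_linktype)) in pairs for old in existing_linktypes
    )
```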

Transform structure. Applies the structure transformations that are checked in the "Structure transformations" dialog.

Structure transformations. When you select this option, a "Structure transformations" dialog appears.

Report format. Allows you to modify the current report format, save it, or delete or retrieve a saved report format. When you select this option, a "Report formats" dialog appears.


The "Report formats" dialog

Name. Use this edit box to give a name to your report format.

Levels. Use this spin edit to specify how many levels of terms to allow: 0 for no subheadings, 1 for one level of references, and so on.

Excluder linktypes, Followed linktypes. If a link from one term to another belongs to an excluder link type, the first term is excluded from the report as a main heading, though it may be a subheading.

If a link from one term to another belongs to a followed link type, the link will be followed in producing the report, up to the number of levels specified in the report format.

To add to one of these lists, set the spin edit below it and click on its "Add" button. To delete a linktype from one of the lists, click on it and click on the "Delete" button below.
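
The way in which levels, excluder linktypes, and followed linktypes interact can be sketched in Python as follows. This is a simplified illustration, not TheW32's report code: the function name build_report, the dictionary layout, and the (depth, linktype, term) output lines are all hypothetical.

```python
def build_report(links, levels, excluders, followed):
    """Return report lines as (depth, linktype, term) tuples.

    links: {term: [(linktype, target), ...]}, with an entry for every heading.
    levels: number of levels of references to follow (0 for none).
    excluders, followed: sets of linktype numbers.
    """
    lines = []

    def walk(term, depth):
        if depth >= levels:
            return
        for linktype, target in links.get(term, []):
            if linktype in followed:
                lines.append((depth + 1, linktype, target))
                walk(target, depth + 1)

    for term in sorted(links):
        # A term with an excluder link is not a main heading,
        # though it may still appear as a subheading.
        if any(linktype in excluders for linktype, _ in links[term]):
            continue
        lines.append((0, None, term))
        walk(term, 0)
    return lines
```

Because the depth is limited by levels, the walk stops even when links form a cycle.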

Substitution linktypes. Use these spin edits to specify the linktypes that point to forms to be substituted for the actual terms in RTF and HTML output respectively. For both RTF and HTML, this allows specific terms to be displayed in various mixtures of upper and lower case, with punctuation such as quotation marks, and with special characters in the Latin-1 character set such as accented letters. For HTML output, display substitutes could include HTML tags such as those for italics or images, or character entity references or two-byte UTF codes for characters in non-Western European writing systems (in the latter case, you are advised to edit the HTML output files to contain the appropriate "content-type" meta tag for the character set used).

Delete (under "Stored formats"). Deletes the stored report format selected from the list to the right.

Save. Saves the report format under the name that you have given to it as part of the thesaurus file.

Retrieve. Retrieves the stored format selected from the list to the right, making it the current report format.

The file Template.the contains a predefined set of report formats:
ALPHA. A standard alphabetical display: SN, USE, UF, BT, NT, and RT references to one level.
SHORT. Preferred terms only, with no references.
TREE. A hierarchical display, starting with top terms and working down through NT references only to as many as nine levels.

Close. Click on this button to return to the thesaurus window from the "Report formats" dialog.


Graphic options. Allows you to set some options for the graphic views of the thesaurus. When you select this menu option, a "Graphic options" dialog appears.


The "Graphic options" dialog

The "Graphic options" dialog allows you to specify text compaction and graph style for graphic displays.

The text compaction options are described in

All the styles of graph options, with the exception of "Spreading activation", are described in

Each cell in the "Direction preferences for arrow styles" grid contains a compass rose for a different arrow style. Direction of the arrow (if any) in the compass rose indicates preferred direction in Watanabe-style displays. Arrow length indicates strength of preference. You may click with the mouse to reset arrow direction and length.


Ancillary lists. Displays the stoplist and suffix list in an "Ancillary lists" dialog. When you start TheW32, these lists are as defined in the files Stoplist.txt and Suffixes.txt. You can make temporary changes to the lists by editing them in the dialog.

Spellcheck - Headings. If the VisualSpeller ActiveX control has been correctly installed, spell checks all heading terms, allowing you to make changes.

Spellcheck - Notes. If the VisualSpeller ActiveX control has been correctly installed, spell checks all non-heading terms or notes, allowing you to make changes.

Stay on top. When this item is checked, the main TheW32 window is set to display on top of normal windows.


Help menu

Help. Displays the help file using whatever application is associated with .htm files on your workstation.

About. Displays version and copyright information.


Files

TheW32 makes use of the same *.the thesaurus file format as TheW and older versions of TexNetF.

The stoplist (Stoplist.txt) consists of upper-case words in alphabetical order, one line to a word; the suffix file (Suffixes.txt) is similar, but is arranged in reverse order of length. These files normally reside in the application directory, but may be overridden by identically-named files in a distinct working directory.
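
To illustrate how lists in this form might be used, the following Python sketch loads a list from text, tests a word against a stoplist, and strips a suffix. The function names are hypothetical. The sketch assumes that "reverse order of length" means longest suffix first and that the first matching suffix is the one removed; TheW32's own matching rules may differ.

```python
def load_word_list(text):
    """Return the non-empty lines of a list file, stripped of whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def is_stopword(word, stoplist):
    """Stoplist entries are upper case, so compare in upper case."""
    return word.upper() in stoplist


def strip_suffix(word, suffixes):
    """Remove the first matching suffix (assumed ordered longest first)."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)]
    return word
```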

Template.the is a thesaurus file with no entries, but with definitions for standard link types and report formats.

*.htm and *.jpg files accompanying the TheW32 package provide online help, but are not essential to running the program.

TheW32 is not guaranteed to be free of bugs, though bugs will be fixed as they are detected. Microsoft Windows itself will protect against some of the more destructive effects of undetected bugs. Certain types of bug, however, can cause Microsoft Windows to become unstable, requiring rebooting.

A power failure or reboot while running TheW32 may cause the thesaurus file to become corrupted if it was being updated. Keeping a backup of this file is therefore especially recommended.

It is possible for more than one instance of TheW32 to have the same thesaurus current at the same time. If two instances update a thesaurus at the same time, the thesaurus may become corrupted. Therefore, if you anticipate multiple simultaneous use, you should set the thesaurus file to read-only, cancelling this setting only long enough to allow a single instance to open it for update if desired.
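
If you manage the read-only setting from a script, it can be switched on and off as in the following Python sketch. The function names set_read_only and is_read_only are hypothetical, and the sketch changes only the file's write permission; it does not coordinate between running instances of TheW32.

```python
import os
import stat


def set_read_only(path, read_only=True):
    """Set or clear the read-only state of a thesaurus file."""
    mode = os.stat(path).st_mode
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    if read_only:
        os.chmod(path, mode & ~write_bits)
    else:
        # Restore write permission for the owner only.
        os.chmod(path, mode | stat.S_IWUSR)


def is_read_only(path):
    """Return True if the owner has no write permission on the file."""
    return not os.stat(path).st_mode & stat.S_IWUSR
```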


Last updated February 6, 2008, by Tim Craven