TexNet32 - Parameters

Before you apply the different extracting and weighting methods, you may wish to set some parameters. Select "Parameters" from the "Window" menu. TexNet32 will display the "Parameters" window.

First, specify the minimum length of extracts in lines and as a percentage of the full text. The default values of 60 characters and 10 percent of the full text seem to work reasonably well for texts 1,000-4,000 words long.

You may also wish to change the occurrence threshold used for determining "frequent" keywords. By default, whenever you load a different source text or modify the source text in memory, this threshold is set to 3 or 1/1000 of the number of bytes, whichever is greater. You may wish to set the threshold lower so as not to leave out important keywords. The nature of the text should also be taken into account. The vocabulary of technical and scientific texts usually includes more terminology and jargon and, generally speaking, more expressions whose wording cannot be changed. For such texts, the threshold should therefore be higher than for texts in the social sciences and especially the humanities, whose authors tend to rephrase every passage containing similar information.
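The default threshold rule can be illustrated with a short Python sketch. This is not TexNet32's own code: the function name is hypothetical, and rounding 1/1000 of the byte count down to a whole number is an assumption, since the rounding is not specified here.

```python
def default_threshold(source_text: bytes) -> int:
    """Return 3 or 1/1000 of the text size in bytes, whichever is greater.

    Assumption: the 1/1000 figure is rounded down to a whole number.
    """
    return max(3, len(source_text) // 1000)


# A 2,500-byte text falls back to the floor of 3;
# a 12,000-byte text gets a threshold of 12.
small = default_threshold(b"x" * 2500)
large = default_threshold(b"x" * 12000)
```

Under this reading, the floor of 3 applies to any text shorter than 4,000 bytes.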

You can set minimum and maximum percentages to control approximately how many of the words in a text are extracted by the "Unusual words" option. TexNet32 will generally try to get as close to the maximum as possible; if the minimum and the maximum conflict, however, the minimum takes precedence.
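One possible reading of the precedence rule is sketched below in Python. The function name, the conversion of percentages to word counts, and the rounding are all assumptions made for illustration; how TexNet32 actually selects the words is not described here.

```python
import math


def target_word_count(total_words: int, min_pct: float, max_pct: float) -> int:
    """Pick how many words to extract under the stated precedence rule.

    Assumption: the minimum is rounded up and the maximum rounded down
    when percentages are converted to whole word counts.
    """
    lower = math.ceil(total_words * min_pct / 100)
    upper = math.floor(total_words * max_pct / 100)
    # Aim for the maximum, unless the minimum demands more words.
    return lower if lower > upper else upper


# No conflict: 5% to 10% of 200 words aims for the maximum, 20 words.
normal = target_word_count(200, 5, 10)
# Conflict: a 15% minimum against a 10% maximum yields the minimum, 30 words.
conflict = target_word_count(200, 15, 10)
```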

The results of "frequent keyword", phrase, and other non-weighted extractions also depend on the stop list, and sometimes other ancillary lists, which you can examine in the "Ancillary lists" window.

The "Weight increments/decrements" indicate the amount of weight to be adjusted for each type of weighting. For example, when you apply the "Increment for paragraph/selection" function to a paragraph or a series of selected paragraphs, if the default value of the parameter has been used, 100 is added to the present weight of the paragraph or paragraphs. This will make it more likely for the paragraph(s) to be extracted.
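The effect of an increment can be pictured with a small Python sketch. The list-of-weights representation and the function name are hypothetical; only the default increment of 100 comes from the description above.

```python
DEFAULT_INCREMENT = 100  # default value of the parameter, as described above


def increment_weights(weights, selected, increment=DEFAULT_INCREMENT):
    """Return a new list of weights with the increment added to each
    selected paragraph (identified by its position in the list)."""
    return [w + increment if i in selected else w
            for i, w in enumerate(weights)]


# Paragraphs 1 and 2 are selected; paragraph 0 keeps its weight.
new_weights = increment_weights([0, 50, 0], {1, 2})
```

Because the increment is added to the present weight, applying the function repeatedly to the same paragraph raises its weight further each time.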

You can save a set of parameters that you like in a script. If you want the parameters to be set automatically the next time you run TexNet32, you can name the script texnconf.txt.


Last updated February 5, 2008, by Tim Craven