The default "compare method" is Edit Distance.
"Edit Distance" is the minimum number of insertions, deletions, or substitutions
needed to turn one string into another. It is a dissimilarity measure: in this database,
records returned with an Edit Distance of zero are exact matches, and those with higher
values are presumably less closely affiliated.
"LCS" = Longest Common Subsequence; this is a measure of similarity. The matching items must
occur in the same order in both series, but they do not need to be contiguous. For the purposes of this
database, this method may prove useful when working with fragmented or partly illegible
sources.
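The two definitions above can be illustrated with a minimal Python sketch. It is not the database's own code: the function names edit_distance and lcs are hypothetical, and the standard dynamic-programming formulations are assumed. Both functions accept strings or lists, so a chant series could be passed as a list of chant identifiers.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions, or substitutions turning a into b."""
    # prev[j] is the distance between the items of a seen so far and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def lcs(a, b):
    """Return one longest common subsequence of a and b as a list of items."""
    m, n = len(a), len(b)
    # table[i][j] is the LCS length of the first i items of a and the first j items of b.
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back through the table to recover the matching items.
    items = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            items.append(a[i - 1])
            i, j = i - 1, j - 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return items[::-1]


print(edit_distance("abcde", "abcde"))   # 0 (exact match)
print(edit_distance("abcde", "abe"))     # 2 (delete "c" and "d")
print("".join(lcs("abcde", "ab12ce")))   # abce
```

Note that the two numbers run in opposite directions: a lower Edit Distance and a higher LCS length both indicate closer agreement.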
3.2.i Examples of LCS Comparison
For the two strings "abcdefg" and "a23cd4e567", the LCS is "acde" (length of 4).
"abcde" compared to "abe" has an LCS of "abe" (length of 3).
"abcde" versus "ab12ce" has an LCS of "abce" (length of 4).
A note of warning: Because the number returned for LCS is the length of the longest
common subsequence of the two series in question, this number depends on the lengths of the original series.
For example, the self-similarity of a series with nine responsories is eleven (nine chants plus the beginning and ending indications),
while the self-similarity of a series with twelve responsories is fourteen; the longer series therefore scores as much more similar to itself
than it can to the shorter series. For this reason, LCS gives appropriate results in the similarity matrix and
dendrogram functions only when chant series of the same original length are compared.
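The arithmetic behind this warning can be checked with a short sketch. The chant labels R1, R2, ... and the helper names lcs_length and with_markers are hypothetical, and the "Start" and "fine" indications are assumed to be counted as ordinary items, as the figures of eleven and fourteen suggest.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def with_markers(chants):
    """Add the beginning and ending indications to a series (assumed convention)."""
    return ["Start"] + list(chants) + ["fine"]


# Hypothetical series: the nine-chant series is simply the first nine chants of the longer one.
nine = with_markers(f"R{k}" for k in range(1, 10))
twelve = with_markers(f"R{k}" for k in range(1, 13))

print(lcs_length(nine, nine))      # 11
print(lcs_length(twelve, twelve))  # 14
print(lcs_length(nine, twelve))    # 11
```

Even though every chant of the shorter series appears, in order, in the longer one, the cross-comparison can never exceed eleven, while the longer series scores fourteen against itself.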
The Matches/Pairs method is the one employed by Hesbert in CAO vol. 5. The calculations are based on: 1) chants which are common to both series (i.e., the
number of "matches"), and 2) pairings of consecutive chants that appear in the same order in both series (i.e., the number of "pairs").
3.4.i Example of Matches/Pairs Comparison
"Start-A-B-C-D-fine" in one source compared to "Start-B-A-C-D-fine"
in another source would result in 4 matches (A, B, C, and D) and 2 pairs (C-D and D-fine). Note that the indications of "Start" and "fine" allow for
comparison of the ordering of the chants at the beginnings and ends of series.
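The example above can be reproduced with a minimal Python sketch. The exact counting rules are not spelled out here, so the following are assumptions: a "match" is a chant present in both series (the "Start" and "fine" indications are not counted as chants), a "pair" is two adjacent items occurring in the same order in both series (the indications do take part in pairs), and repeated chants are counted once. The function name matches_and_pairs is hypothetical.

```python
def matches_and_pairs(series_a, series_b, markers=("Start", "fine")):
    """Count shared chants (matches) and shared adjacent pairings (pairs)."""
    # Matches: chants found in both series, ignoring the boundary indications.
    chants_a = {c for c in series_a if c not in markers}
    chants_b = {c for c in series_b if c not in markers}
    matches = len(chants_a & chants_b)

    # Pairs: adjacent (previous, next) items found in both series, indications included.
    pairs_a = set(zip(series_a, series_a[1:]))
    pairs_b = set(zip(series_b, series_b[1:]))
    pairs = len(pairs_a & pairs_b)

    return matches, pairs


source_1 = "Start-A-B-C-D-fine".split("-")
source_2 = "Start-B-A-C-D-fine".split("-")

print(matches_and_pairs(source_1, source_2))  # (4, 2)
```

Without the "Start" and "fine" indications, only the pairing C-D would be shared; the pairing D-fine is what records that both series end with the same chant.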