The lists of responsory series, arranged in order from most to least similar,
offer the researcher an abundance of data, but what are we to do with the lists?
Exact matches of chant series are the easiest to manage; they can be listed, charted,
mapped or manipulated in a number of other ways to aid in representing affiliations
between manuscripts. What can be difficult, however, is to understand and demonstrate
the relationships among the many manuscripts whose series match only partially
and therefore appear further down the list, in decreasing order of similarity.
Although the lists generated by this database adequately show the relationships between one series and another, they do
not provide a clear overall picture of the relationships among all series.
For example, the series that differ by an Edit Distance of 1 from the "head series" are probably also quite
similar to one another. If the "head series" contains eight elements, every series in the group with an Edit Distance
of 1 will probably have at least six or seven elements in common with every other series at the distance of 1. However,
at a distance of 4, the relationships between the manuscripts become much less clear. One series in this group may have
the first four elements in common with the "head series," the next one the last four elements; therefore, these two series
could have zero elements in common with each other.
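The situation described above can be checked with a small sketch. It assumes that Edit Distance is the standard Levenshtein distance (insertions, deletions and substitutions each counting as 1); the database's own implementation may differ. The chant identifiers are hypothetical placeholders, not real CantusIDNumbers.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of chant identifiers."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


# Hypothetical identifiers for an eight-element "head series".
head = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]
# Shares only the first four elements with the head series.
first_half = ["c1", "c2", "c3", "c4", "x1", "x2", "x3", "x4"]
# Shares only the last four elements with the head series.
second_half = ["y1", "y2", "y3", "y4", "c5", "c6", "c7", "c8"]

d_head_first = edit_distance(head, first_half)
d_head_second = edit_distance(head, second_half)
d_between = edit_distance(first_half, second_half)
shared = set(first_half) & set(second_half)

print(d_head_first, d_head_second, d_between, len(shared))
```

Both series sit at the same distance from the head series, yet the list alone gives no hint that they share nothing with each other.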
One of the electronic tools that provides an overall presentation is the "similarity
matrix." Similarity matrices are arrays of mathematical figures which present
similarity calculations. The degrees of similarity or dissimilarity among series
of chants (that is, the calculations of Edit Distance, LCS, Matches or Matches/Pairs)
can be represented in a similarity matrix. The calculations of Edit Distance among
these five series of chants, for instance:
can be represented in this similarity matrix:
Notice the zeros on the diagonal; these show that the Edit Distance between a
chant series and itself is zero, since no changes are needed to turn a series into
itself. The calculations of Edit Distance among the other series vary from "1"
(very similar) to "9" (quite dissimilar).
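As a rough illustration of how such a matrix might be assembled, the sketch below computes every pairwise Edit Distance for a handful of series. It assumes the standard Levenshtein distance; the sigla ("MsA" and so on) and the numeric chant identifiers are invented for the example and do not reproduce the five series discussed here.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of chant identifiers."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def distance_matrix(series_by_siglum):
    """Return the sigla and the square matrix of pairwise Edit Distances."""
    sigla = list(series_by_siglum)
    matrix = [[edit_distance(series_by_siglum[row], series_by_siglum[col])
               for col in sigla]
              for row in sigla]
    return sigla, matrix


# Hypothetical sources and chant identifiers.
series = {
    "MsA": [1, 2, 3, 4, 5, 6],
    "MsB": [1, 2, 3, 4, 5, 7],
    "MsC": [1, 3, 2, 4, 8, 9],
}

sigla, matrix = distance_matrix(series)
for siglum, row in zip(sigla, matrix):
    print(f"{siglum:<5}" + "".join(f"{value:3d}" for value in row))
```

The matrix is symmetric, so in practice only the values on one side of the diagonal need to be computed.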
The representations within a similarity matrix become more understandable when they are drawn into a
"phylogenetic tree" or "dendrogram," a "branching diagram representing a hierarchy of categories based
on degree of similarity or number of shared characteristics."
(s.v. "Dendrogram," Merriam-Webster Dictionary Online.)
The calculations for the similarity matrix shown above (4.2) can be transferred into a clearer visual representation
through the use of the dendrogram shown here:
The liturgical occasion (that is, the Sunday) is listed in the first of the columns on the right-hand side.
For this example, the chant series involved in the comparison are taken from the fourth Sunday of Advent ("A4").
The manuscript sigla follow the cursus (monastic "M" or secular "S"); after them come an indication of the dates
of the sources and a brief word concerning their provenance, the latter being abbreviated to twelve characters
owing to space restrictions.
This dendrogram shows that, for this group of five series, there is a fairly close-knit
group in Blg1+Kas05+Bar1. Pad3 is the outsider in this group, while Pra2 takes
an intermediary position. These conclusions are only valid within the group of
these five series.
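The way a grouping of this kind emerges from a matrix can be sketched with a simple agglomerative procedure. The sketch below uses plain average linkage (the distance between two clusters is the mean of all pairwise distances between their members) as a stand-in; it is not the Ward's method offered by the tool, and the distance values are invented to mirror the grouping described above, not taken from the actual matrix.

```python
def average_linkage(labels, dist):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = {i: [i] for i in range(len(labels))}  # cluster id -> member indices
    merges = []
    next_id = len(labels)

    def cluster_distance(members_a, members_b):
        # Mean of all pairwise distances between members of the two clusters.
        total = sum(dist[i][j] for i in members_a for j in members_b)
        return total / (len(members_a) * len(members_b))

    while len(clusters) > 1:
        ids = sorted(clusters)
        height, p, q = min(
            (cluster_distance(clusters[p], clusters[q]), p, q)
            for k, p in enumerate(ids) for q in ids[k + 1:]
        )
        members = clusters.pop(p) + clusters.pop(q)
        merges.append((sorted(labels[i] for i in members), height))
        clusters[next_id] = members
        next_id += 1
    return merges


labels = ["Blg1", "Kas05", "Bar1", "Pra2", "Pad3"]
# Hypothetical Edit Distances, chosen only to illustrate the procedure.
dist = [
    [0, 1, 2, 5, 9],
    [1, 0, 2, 5, 9],
    [2, 2, 0, 6, 8],
    [5, 5, 6, 0, 7],
    [9, 9, 8, 7, 0],
]

merges = average_linkage(labels, dist)
for members, height in merges:
    print(f"{height:5.2f}  {' + '.join(members)}")
```

Each merge corresponds to one connecting line in a dendrogram: merges at a small height join closely related series, while the source that joins last, at the greatest height, is the outsider of the group.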
4.4.i - Click on the "Dendrogram" tag on the left side of the screen.
4.4.ii - On the dendrogram page, select the Sunday series (or series to
be combined) which will form the basis of the dendrogram. (CTRL+click or SHIFT+click
to select more than one Sunday.)
4.4.iii - If desired, limit the sources employed in the dendrogram by cursus
(monastic or secular). Although some interesting observations
can be made when comparing sources of monastic cursus to secular ones, it
may be useful to apply this limitation by usage in order to manage the results.
4.4.iv - Select manuscript sources to be included in the dendrogram - With
Random MS Sources:
4.4.iv.I - The default number of manuscripts is 10.
4.4.iv.II - Any number can be chosen by filling in the "N" value. (As
of 2009, there are 900+ manuscript sources for the Advent Sunday series
and 150+ sources for Lenten Sundays.)
4.4.iv.III - Click the "Get N Random MS" button.
4.4.v - Select manuscript sources to be included in the dendrogram - By Sigla:
4.4.v.I - If particular manuscript sources are desired in order to
see how they interrelate on one or all Sundays, the sigla for those manuscripts
can be entered in the "Sigla" window. Use the alphanumeric
form as found on the Responsory Series "Search" page. Enter the desired
sigla separated by a space in the following manner: klo01 klo02 fire Albi.
You can also use wildcard characters: klo* fra*.
4.4.v.II - If specifying manuscripts by sigla, check "Include random" to add
other random manuscripts to your matrix. (Note: when random manuscripts are
added, the programme automatically excludes the manuscripts you selected
by siglum from the random sources in order
to prevent duplication.) The default number of random manuscripts added to your results is 10;
to increase or decrease this number, enter the desired number into the "N" window (below the "cursus" pull-down box).
4.4.v.III - Check "Limit sigla to series and cursus"
to specify the usage of the manuscript data: the sigla you have entered
will be filtered with the selections you made above for Sunday series and
monastic or secular cursus.
4.4.v.IV - Click the "Get by Sigla" button.
At this point, you should be able to scroll down and see strings of
CantusIDNumbers, that is, the data of the chant series which will form the basis
of your dendrogram.
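The selection logic of steps 4.4.iv and 4.4.v can be sketched as follows. This is only an illustration of the behaviour described, not the website's code: the function name `select_sources`, the case-insensitive matching and the list of available sigla (including "fra01") are assumptions made for the example.

```python
import random
from fnmatch import fnmatchcase


def select_sources(patterns, available, n_random=0, rng=None):
    """Return sigla matching the patterns, plus n_random others, without duplicates."""
    rng = rng or random.Random()
    # Patterns are separated by spaces and may contain the "*" wildcard.
    wanted = patterns.lower().split()
    chosen = [siglum for siglum in available
              if any(fnmatchcase(siglum.lower(), pattern) for pattern in wanted)]
    # Random additions are drawn only from sources not already chosen by siglum.
    remaining = [siglum for siglum in available if siglum not in chosen]
    extra = rng.sample(remaining, min(n_random, len(remaining)))
    return chosen + extra


# Hypothetical pool of available sources.
available = ["klo01", "klo02", "fire", "Albi", "fra01", "Pad3", "Pra2", "Bar1"]

picked = select_sources("klo* Albi", available, n_random=2, rng=random.Random(0))
print(picked)
```

Passing a seeded `random.Random` makes the random additions repeatable, which can be convenient when a comparison needs to be reproduced later.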
4.5.i - The default "compare method" remains Edit Distance for the dendrogram
portion of this website tool, but the other comparative
methods can be selected if desired.
NOTE: Edit Distance is the best comparative method for calculation
of the similarity matrix since the LCS and Matches/Pairs figures must first
be normalized (LCS owing to an incompatibility with series of different lengths
and Matches/Pairs because two numbers make up the result and the similarity
measure for the matrix and dendrogram must be a single number). For LCS, for
example, the maximum value between a five- and an eleven-chant series is seven:
all five chants are in the longer series, plus the beginning and ending indications
(5 + 2). To normalize this, the LCS value for the two series
is divided by its maximum possible value. Hence, self-similarity always turns out
as 1.0. Non-normalized versions could be a future experiment, as they would
reflect the consideration that long similar series are more similar
than short ones. In the case of Matches/Pairs, the calculation "nMatches+nPairs"
is used to obtain the similarity measure.
4.5.ii - The default "cluster method," or the way in which series are grouped
into clusters on the basis of their similarities, is "Ward's averaging." This method of calculation represents
a group/cluster as the average of its constituent items. The other cluster
methods available in the pull-down list are less useful in our analyses of
chant series.
4.5.iii - Edit the "title" as desired with respect to the Sunday series
and the number of manuscripts you have selected. This title will appear at
the top of your dendrogram.
4.5.iv - The "Chop Series After" window allows you to specify the maximum
length of the series under comparison. For example, you may wish to limit
the number of chants in any series to the first 9 (as might generally occur
for secular Matins). It is commonly known that responsories for use during
the weekdays are often copied after those for Sunday; manuscripts which are
lacking in specific rubrics (and have been indexed according to the most-recently-specified
feast day) may appear with 14, 15 or more responsories for "Sunday," and the
inclusion of these ferial chants in the comparative programme is likely to
skew the results.
4.5.v - The default value of the "Name length" window is "31".
The "name" is the combination of the Sunday series (2 characters),
the siglum (4 characters), date (max 7 characters), and provenance (max 12
characters), plus two spaces between each datum (total = 31). We recommend
not changing this value.
4.5.vi - You may wish to widen or narrow the "Paper width" to alter the
display of your dendrogram.
4.5.vii - Check the "Show simmat" box only if you wish to see the similarity
matrix on which the dendrogram is based. This will ordinarily remain unchecked.
4.5.viii - Click the "Make Dendrogram" button.
The dendrogram should appear in a separate window. Copy & paste the text to Notepad or a word-processor
or view the dendrogram on-screen.
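The LCS normalization described in the note to step 4.5.i can be sketched as below. The marker names `START` and `END` are hypothetical, and the maximum is taken to be the length of the shorter series plus two, which is inferred from the "5 + 2" example rather than from the programme itself.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


START, END = "<start>", "<end>"  # hypothetical beginning and ending indications


def normalized_lcs(a, b):
    """LCS of the framed series divided by the largest value it could take."""
    framed_a = [START, *a, END]
    framed_b = [START, *b, END]
    maximum = min(len(a), len(b)) + 2  # shorter series plus the two markers
    return lcs_length(framed_a, framed_b) / maximum


eleven = list(range(1, 12))   # hypothetical eleven-chant series
five = [2, 4, 6, 8, 10]       # five chants, all present in the longer series

print(normalized_lcs(five, eleven))
print(normalized_lcs(eleven, eleven))
print(normalized_lcs([1, 2], [3, 4]))
```

Under these assumptions a short series wholly contained in a longer one scores the same as an identical pair, which is the loss of information that a non-normalized version would avoid.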
When looking at the list of names down the right-hand side, bear in mind that two series which appear next to each other
are not necessarily similar. Instead of reading from top to bottom, follow the connecting lines to locate the clusters.
Series within a cluster are more similar to each other than they are to series outside the cluster.
From the dendrogram alone we cannot determine exactly how similar (or dissimilar)
one chant series is to another. We can only see how relatively close one series is to
the cluster formed by the others in the comparison set.
The closer a vertical connecting line is to the series names (i.e., the further to the right), the more similar the joined series are to one another.
When a connecting line has a higher value (i.e., lies further to the left), the series it joins are
less similar to one another; the relative distances between them are greater.
If more precise information is desired, return to the Responsory Series: Advent and Lent search
page to obtain the specific CantusIDNumbers and orderings of the chants in each of the series in question.
Dendrograms from this programme can be selected (highlighted), copied (CTRL-C) and saved within Notepad or a word-processing programme.
Use a mono-spaced font (Courier, Letter Gothic, etc.) and adjust the font size and paper layout accordingly.
Dendrograms can also be captured as images on-screen; however, the screen size is a limitation in this method.
CANTUS dendrograms display the results of the comparison of sets of chant series
against one another. In this database, they provide a visual representation of
the relationships among manuscript sources of medieval chant.
It should be noted that these dendrograms are not full family trees reaching back along ancestral lines to a
single parent (such as might be an "Ur-form"), at least for the purposes of these chant series comparisons.
In the sciences, there are methods to programme software employing similarity matrices and dendrograms which
propose a first generation or a "parent cluster," but our main focus for this website database has been
to identify groupings of manuscript sources based on the similarity inherent in their usages and orderings of chants in series.
It is hoped that interested researchers will seize the opportunities available in this dendrogram-creation tool
to identify contexts for the sources on which they are currently working, to strengthen arguments for known,
suspected or disputed provenance, to place unknown sources within an appropriate liturgical tradition, to trace the
usages of particular chants, to gather evidence in the formulation of sweeping theories of chant transmission and
local customs in medieval Europe, or to forge new paths of research into the complicated web of medieval, ecclesiastical chant.