The CANTUS Database

Responsory Series: Advent and Lent

4. Dendrograms (the "Dendrogram" tab)

4.1 Interpreting the Results of Comparisons

The lists of responsory series, arranged in order from most to least similar, offer the researcher an abundance of data, but what are we to do with the lists? Exact matches of chant series are the easiest to manage; they can be listed, charted, mapped or manipulated in a number of other ways to aid in representing affiliations between manuscripts. What can be difficult, however, is to understand and demonstrate the relationships between the many manuscripts whose results are in decreasing order of similarity.

Although the lists generated by this database adequately show the relationships between one series and another, they do not provide a clear overall picture of the relationships among all series.
For example, the series that differ by an Edit Distance of 1 from the "head series" are probably also quite similar to one another. If the "head series" contains eight elements, every series in the group with an Edit Distance of 1 will probably have at least six or seven elements in common with every other series at the distance of 1. However, at a distance of 4, the relationships between the manuscripts become much less clear. One series in this group may have the first four elements in common with the "head series," the next one the last four elements; therefore, these two series could have zero elements in common with each other.
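The distance-4 case can be checked with a small sketch. It assumes that Edit Distance here means the standard Levenshtein distance (insertions, deletions and substitutions each cost 1); the database's own calculation may differ in detail. The element names are hypothetical placeholders, not real CantusIDNumbers.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (assumed definition)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,           # delete x
                            curr[j - 1] + 1,       # insert y
                            prev[j - 1] + cost))   # substitute, or keep a match
        prev = curr
    return prev[-1]

# Hypothetical eight-element series (placeholder IDs).
head = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]
series_a = ["c1", "c2", "c3", "c4", "x1", "x2", "x3", "x4"]  # shares the first four
series_b = ["y1", "y2", "y3", "y4", "c5", "c6", "c7", "c8"]  # shares the last four

d_head_a = edit_distance(head, series_a)
d_head_b = edit_distance(head, series_b)
d_a_b = edit_distance(series_a, series_b)
shared = set(series_a) & set(series_b)

print(d_head_a, d_head_b, d_a_b, len(shared))
```

Both hypothetical series sit at a distance of 4 from the "head series," yet they share no elements with each other, which is why a ranked list alone cannot show how the series in a group relate among themselves.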

4.2 Similarity Matrices and Cluster Analysis

One of the electronic tools that provides an overall presentation is the "similarity matrix." Similarity matrices are arrays of mathematical figures which present similarity calculations. The degrees of similarity or dissimilarity among series of chants (that is, the calculations of Edit Distance, LCS, Matches or Matches/Pairs) can be represented in a similarity matrix. The calculations of Edit Distance among these five series of chants, for instance:

[Figure: 5 Sources - List]

can be represented in this similarity matrix:

[Figure: Similarity Matrix - 5 sources]

Notice the zeros on the diagonal; these show that the Edit Distance between a chant series and itself is zero, since no changes are needed to turn a series into itself. The Edit Distances among the other series range from "1" (very similar) to "9" (quite dissimilar).
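As a rough illustration of how such a matrix could be built, the sketch below computes every pairwise Edit Distance for a handful of series. The series labels (S1 to S4) and their contents are hypothetical, and Levenshtein distance is assumed as the definition of Edit Distance; this is not the database's own code.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (assumed definition)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def distance_matrix(series):
    """Return (names, matrix) where matrix[i][j] is the distance between series i and j."""
    names = list(series)
    matrix = [[edit_distance(series[p], series[q]) for q in names] for p in names]
    return names, matrix

# Hypothetical series; the integers stand in for chant identifiers.
series = {
    "S1": [1, 2, 3, 4, 5, 6],
    "S2": [1, 2, 3, 4, 5, 7],
    "S3": [1, 2, 9, 4, 5, 7],
    "S4": [20, 21, 22, 23, 24, 25],
}

names, matrix = distance_matrix(series)
for name, row in zip(names, matrix):
    print(name, " ".join(f"{d:2d}" for d in row))
```

The matrix is symmetric with zeros on the diagonal, so only the values on one side of the diagonal carry new information.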

4.3 Dendrogram Example

The representations within a similarity matrix become more understandable when they are drawn into a "phylogenetic tree" or "dendrogram," a "branching diagram representing a hierarchy of categories based on degree of similarity or number of shared characteristics." (s.v. "Dendrogram," Merriam-Webster Dictionary Online.)

The calculations for the similarity matrix shown above (4.2) can be transferred into a clearer visual representation through the use of the dendrogram shown here:

[Figure: 5 Sources - Dendrogram]

The liturgical occasion (that is, the Sunday) is listed in the first of the columns on the right-hand side. For this example, the chant series in the comparison are taken from the fourth Sunday of Advent ("A4"). The cursus (monastic "M" or secular "S") comes next, followed by the manuscript siglum, then an indication of the date of the source and a brief word on its provenance, the latter abbreviated to twelve characters owing to space restrictions.

This dendrogram shows that, for this group of five series, there is a fairly close-knit group in Blg1+Kas05+Bar1. Pad3 is the outsider in this group, while Pra2 takes an intermediary position. These conclusions are only valid within the group of these five series.
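The way clusters of this kind emerge from a distance matrix can be sketched with a simple agglomerative procedure. The example below uses average linkage, which is an assumption: the page does not state which clustering method the CANTUS tool applies. The labels A to E and the distances are hypothetical and are not the values for the five sources shown above.

```python
def average_linkage(names, dist):
    """Repeatedly merge the two closest clusters.

    Returns a list of (left_members, right_members, height) tuples,
    in the order the merges happen (closest pair first).
    """
    index = {name: i for i, name in enumerate(names)}
    clusters = [(name,) for name in names]

    def cluster_distance(c1, c2):
        # Average of all pairwise distances between the two clusters.
        total = sum(dist[index[p]][index[q]] for p in c1 for q in c2)
        return total / (len(c1) * len(c2))

    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical distance matrix: A, B, C close together, D intermediate, E an outsider.
names = ["A", "B", "C", "D", "E"]
dist = [
    [0, 1, 2, 5, 9],
    [1, 0, 2, 5, 9],
    [2, 2, 0, 4, 8],
    [5, 5, 4, 0, 7],
    [9, 9, 8, 7, 0],
]

merges = average_linkage(names, dist)
for left, right, height in merges:
    print("+".join(left), "joins", "+".join(right), "at", round(height, 2))
```

In this made-up data the tight group forms first at low heights, the intermediate series joins later, and the outsider joins last; as in the example above, the result only describes relationships within the set that was compared.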

4.4 Creating a Dendrogram - Step 1

At this point, you should be able to scroll down and see strings of CantusIDNumbers, that is, the data of the chant series which will form the basis of your dendrogram.

4.5 Creating a Dendrogram - Step 2

The dendrogram should appear in a separate window. Copy and paste the text into Notepad or a word processor, or view the dendrogram on-screen.

4.6 Reading A Dendrogram

When looking at the list of names down the right-hand side, be aware that two series appearing next to each other are not necessarily similar. Follow the connecting lines, rather than reading the list from top to bottom, to locate the clusters. Series within a cluster are more similar to each other than they are to series outside the cluster.

From the dendrogram alone we cannot determine exactly how similar (or dissimilar) one chant series is to another. We can only see how relatively close one series is to the cluster formed by the others in the comparison set.

The closer a vertical connecting line is to the series names (that is, the further to the right), the more similar the joined series are to one another. When a connecting line is higher in value (i.e., further to the left), the series are less similar to one another; the relative distance between them is greater.
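One way to think about the position of a connecting line is as the height at which two series first fall into the same cluster. The sketch below looks this up in a list of merges; the merge list, the labels A to E and the heights are all hypothetical, standing in for whatever a clustering step would produce.

```python
def join_height(merges, x, y):
    """Height of the first merge that places x and y in the same cluster.

    `merges` must be ordered from the first (lowest) merge to the last.
    """
    for left, right, height in merges:
        members = left + right
        if x in members and y in members:
            return height
    raise ValueError(f"{x!r} and {y!r} never join in this merge list")

# Hypothetical merge list: (left members, right members, height).
merges = [
    (("A",), ("B",), 1.0),
    (("C",), ("A", "B"), 2.0),
    (("D",), ("C", "A", "B"), 14 / 3),
    (("E",), ("D", "C", "A", "B"), 8.25),
]

print(join_height(merges, "A", "B"))  # joined early: a line close to the names
print(join_height(merges, "A", "E"))  # joined last: a line far to the left
```

The height reflects how a series relates to a whole cluster, not to one particular member, so it should not be read as the exact distance between two individual series.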

If more precise information is desired, return to the Responsory Series: Advent and Lent search page to obtain the specific CantusIDNumbers and orderings of the chants in each of the series in question.

4.7 Saving/Exporting Dendrograms

  1. Dendrograms from this programme can be blocked/highlighted, copied (CTRL-C) and saved within Notepad or a word-processing programme. Use a mono-spaced font (Courier, Letter Gothic, etc.) and adjust the font size and paper layout accordingly.
  2. Dendrograms can also be captured as images on-screen; however, the screen size is a limitation in this method.

4.8 Research Involving Dendrograms: Some Comments

CANTUS dendrograms display the results of the comparison of sets of chant series against one another. In this database, they provide a visual representation of the relationships among manuscript sources of medieval chant.

It should be noted that these dendrograms are not full family trees reaching back along ancestral lines to a single parent (such as might be an "Ur-form"), at least for the purposes of these chant series comparisons. In the sciences, there are methods to programme software employing similarity matrices and dendrograms which propose a first generation or a "parent cluster," but our main focus for this website database has been to identify groupings of manuscript sources based on the similarity inherent in their usages and orderings of chants in series.

It is hoped that interested researchers will seize the opportunities available in this dendrogram-creation tool to identify contexts for the sources on which they are currently working, to strengthen arguments for known, suspected or disputed provenance, to place unknown sources within an appropriate liturgical tradition, to trace the usages of particular chants, to gather evidence in the formulation of sweeping theories of chant transmission and local customs in medieval Europe, or to forge new paths of research into the complicated web of medieval, ecclesiastical chant.


Last update of this page = 10 August 2009
Contains software or other intellectual property copyright © 2007-2009, Debra Lacoste and Gerard Stafleu.