- Budget: EUR 120,000
- Duration: 1.1.2009 – 31.12.2012
- Funding: University of Helsinki Research Funds
- Project leader: Dr. Teemu Roos
- Keywords: stemmatology, textual criticism, phylogenetics
Algorithmic Methods in Stemmatology (STAM)
Research Project at HIIT
Given a collection of imperfect
copies of a textual document, the aim of stemmatology is to
reconstruct the history of the text, indicating for each variant the
source text from which it was copied. The project develops theory and
methods for computer-assisted stemmatology, and evaluates the accuracy
of such methods on simulated and real data sets.
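As a rough illustration of how variants can be compared computationally, the sketch below computes the normalized compression distance (NCD), one of the measures examined in the project's publications (Merivuori & Roos, 2009). It is a minimal sketch, not the project's implementation: the choice of zlib as compressor, the function names, and the toy texts are all assumptions made for this example.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two texts.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed size.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical toy "manuscripts", invented for illustration only.
original = (
    "In the beginning the scribe copied the holy legend of the bishop "
    "with great care, and the brothers of the monastery preserved the "
    "book for many generations."
)
close_copy = (
    "In the beginning the scribe copied the holy legend of the bishop "
    "with great care, and the sisters of the convent preserved the "
    "book for many generations."
)
unrelated = (
    "Quarterly rainfall figures for the northern harbour district were "
    "revised upward twice, yet shipping volumes kept falling throughout "
    "that unusually dry autumn."
)

for name, text in [("close_copy", close_copy), ("unrelated", unrelated)]:
    print(f"NCD(original, {name}) = {ncd(original, text):.3f}")
```

A near copy should score lower than an unrelated text. Pairwise distances of this kind can serve as input to distance-based tree reconstruction, though short texts and the choice of compressor both affect the values.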
Stemmatology lies at the intersection of several
scientific disciplines. On the one hand, it is associated with the
humanities, where texts are used as sources; on the other
hand, with mathematics, statistics, and computer science; and finally,
with evolutionary biology and cladistics, the study of evolution and
speciation. The aim of traditional stemmatology, or textual
criticism, has been to infer the original content of a textual
source based on a number of different versions. Modern
computer-assisted stemmatology has proven to be a powerful
tool not only for studying the alteration of texts but also for giving
insight into how the texts have been distributed geographically.
In doing so, stemmatology addresses several central questions
in historical, philological, and theological research.
Figure: Solving ancestral states by dynamic programming. Photo: Tuula Roos.
Figure: Computer-based visualization of multiple texts; see (Merivuori & Roos, 2009).
Our objective is to develop reliable methods and tools for the study
of the origins, variation, and distribution of texts. An easy-to-use
method available on the internet, based on a sound methodology, would
significantly benefit a large group of scholars in a variety of
humanistic disciplines. In computer science, applications include,
e.g., the study of computer viruses and chain letters. Advances in
methods for textual scholarship also contribute to cladistics and
evolutionary biology.
The project is associated with two other projects
focusing on related topics: the project Suomen keskiajan kirjallinen
kulttuuri ("Literary culture of medieval Finland", 2007-2010), led by
Prof. Tuomas Heikkilä at the Department of History, and a
Science Workshop on Stemmatology (2009-2010), funded by the
Finnish Cultural Foundation and led by Prof. Petri Myllymäki,
Prof. Heikkilä, and Dr. Roos.
The work is carried out mainly within the
Cosco group at the Helsinki Institute
for Information Technology HIIT.
People
- Teemu Roos, PhD
- Senior researcher (project leader)
- Petri Myllymäki, PhD
- Professor, Department of Computer Science
- Tuomas Heikkilä, PhD
- Professor, Department of History
- Simo Linkola
- Research assistant
- Yuan Zou, MSc
- PhD student
Past students
- Anupam Arohi, MSc
- Toni Merivuori, MSc
Publications
- T. Roos and T. Heikkilä, (2009). Evaluating
methods for computer-assisted stemmatology using artificial benchmark
data sets, Literary and Linguistic Computing 24:4, pp. 417–433.
- T. Merivuori, (2009). Normalisoitu
kompressioetäisyys: katsaus sovelluksiin ("Normalized
compression distance: A review on applications"), Master's Thesis,
Department of Computer Science, University of Helsinki.
- T. Merivuori and T. Roos, (2009). Some observations on the applicability
of normalized compression distance to stemmatology, in
Proc. 2nd Workshop on Information Theoretic Methods in Science and
Engineering (WITMSE-09), Tampere, Finland, August 17–19.
- P.-H. Lai, T. Roos, and J. O'Sullivan, (2010). MDL hierarchical
clustering for stemmatology, IEEE International Symposium on
Information Theory (ISIT-10), Austin, Texas, June 13–18.
- Y. Zou, (2010). Structural
EM methods in phylogenetics and stemmatology,
Master's Thesis, Department of Computer Science, University of Helsinki.
- A. Arohi, (2011). Structural EM: An Algorithmic Method in
Stemmatology, Master's Thesis, Department of Computer Science,
University of Helsinki.
- T. Roos and Y. Zou, (2011). Analysis of Textual Variation by
Latent Tree Structures, in Proc. IEEE International
Conference on Data Mining.
Events
May 18, 2009, Helsinki.
"Darwin – banaanikärpänen –
stemmatologia" ("Darwin, fruit fly, stemmatology"). Colloquium
organized by the VARIANTTI network on textual criticism (in Finnish).
Speakers: Tuomas Heikkilä
and Teemu Roos.
May 29, 2009, Bern. Tuomas Heikkilä speaks about
experiments with artificial manuscript traditions in a one-day
symposium organized by H.F. Windram, C.J. Howe, and M. Stolz.
June 17, 2009, Tikkurila. Teemu Roos gives an introduction
to computer-assisted stemmatology at the VARIENG Spring Excursion.
Place: Finnish Science Center, Heureka.
August 1, 2009. Yuan Zou joins STAM as a full-time
employee.
August 19, 2009, Tampere. Toni Merivuori and Teemu Roos
present a paper on stemmatology at the 2nd WITMSE workshop.
August 27, 2009, Helsinki. Final seminar of HIIT summer
interns. Anu Sulander and Anupam Arohi present results of
summer internship in STAM. Place: Kumpula Campus.
October 28, 2009, Helsinki. Open House at the Department of
Computer Science. Demonstration by STAM project.
Place: Kumpula Campus.
January 28–30, 2010, Helsinki. We organized
a stemmatology workshop in Helsinki.
April 2, 2010, New Haven. Teemu Roos gives a talk on
algorithms of stemmatology at Yale University.
»» Yale Probabilistic Networks Group
April 22, 2010, St. Louis. Teemu Roos gives a talk
on algorithms of stemmatology at Washington University in
St. Louis.
»» WUSTL
June 21–23, 2010, Uppsala. The second workshop in our
series. Attendance is by invitation only.
»» SCAS
November 21–24, 2010, Pisa. The third workshop in
our series. Attendance by invitation.
December 17, 2010, Helsinki. Yuan Zou graduates (MSc) with a Master's
thesis on stemmatology.
March 22–25, 2011, Cambridge, UK. Fourth workshop in our series.
Attendance by invitation.
August 15, 2011. Anupam Arohi graduates with an MSc thesis
on stemmatology.
October 5–8, 2011, Rome, Italy. Fifth workshop in our
series.
Attendance by invitation.
December 11–14, 2011, Vancouver, Canada.
Teemu Roos and Yuan Zou present a paper on a new stemmatological
method, Semstem, at the ICDM conference.
March 23, 2012, University of East Anglia, UK.
Teemu Roos speaks at the Computational Biology Seminar at the University
of East Anglia.
November 22–24, 2012, University of Bern, Switzerland.
Teemu Roos speaks at the Phylomemetic and Phylogenetic Approaches in
the Humanities Workshop.