This article presents the results of a digital editing project devoted to Leo Tolstoy’s Collected Works, published as an open-access resource. The main source is the ninety-volume critical edition of Tolstoy’s texts. The article describes the development of a metadata structure for the texts, which were divided into three categories: literary texts, diaries, and letters. The metadata markers derive from the apparatus of the critical edition, and the mark-up itself makes it possible to trace Tolstoy’s evolution as a writer. The critical edition’s index is another important source of data; it was digitised as part of the project and developed into a specialised web service. This data supports, in particular, the construction of a network of references between the people and texts that were essential to Tolstoy. The structure of these references can in turn serve as a basis for social network analysis, a method popular in the contemporary humanities. The final part of the article discusses the ‘Textograph’, a technical platform for digitising manuscripts, which was developed for Tolstoy’s manuscripts but can also be used to work on the manuscripts of other writers.
This paper presents a solution for mining biographical information from the commentaries on Leo Tolstoy's letters. It is implemented as part of the Tolstoy Digital Project, a semantically marked-up web publication of the 90-volume complete collection of Leo Tolstoy's works. The extracted biographical information will be used to create an open database of all the persons who were in some way connected with Tolstoy or his works. The paper also accounts for various subtleties of the commentary apparatus and pays special attention to specific difficulties of biographical information extraction, such as defining the boundaries of expressions that denote a profession and handling the non-standardized syntactic constructions used for kinship relations.