Visualization of a Russian newspaper corpus by means of reference graphs
In this paper we present preliminary results on text corpus visualization by means of so-called reference graphs. The nodes of such a graph stand for key words or phrases extracted from the texts, and the edges represent the reference relation: node A refers to node B if key word or phrase B is more likely to co-occur with key word or phrase A than to occur on its own. Since reference graphs are directed graphs, we are able to use graph-theoretic algorithms for further analysis of the text corpus. The visualization technique is tested on our own Web-based corpus of Russian-language newspapers.
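The reference relation can be illustrated with a minimal sketch. The abstract does not specify how the likelihoods are estimated, so the version below rests on an assumption: co-occurrence is counted at the document level, and A refers to B when B appears together with A in more documents than it appears without A. The function name `build_reference_graph` and the toy key phrases are hypothetical and serve illustration only.

```python
from itertools import permutations


def build_reference_graph(docs):
    """Build a directed reference graph from per-document key phrase sets.

    docs: list of sets, each holding the key phrases found in one document.
    Returns a dict mapping each key phrase to the set of phrases it refers to.
    Assumption (not from the paper): A refers to B when the number of
    documents containing both A and B exceeds the number containing B
    without A.
    """
    doc_count = {}   # phrase -> number of documents containing it
    pair_count = {}  # (a, b) -> number of documents containing both
    for keys in docs:
        for k in keys:
            doc_count[k] = doc_count.get(k, 0) + 1
        for a, b in permutations(keys, 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    graph = {k: set() for k in doc_count}
    for (a, b), together in pair_count.items():
        without_a = doc_count[b] - together  # documents with b but not a
        if together > without_a:
            graph[a].add(b)  # directed edge a -> b: "a refers to b"
    return graph


# Toy corpus: each set lists the key phrases extracted from one document.
docs = [
    {"election", "parliament"},
    {"election", "parliament", "budget"},
    {"election"},
    {"election", "budget"},
    {"budget"},
]
graph = build_reference_graph(docs)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

In this toy corpus "election" refers to both "parliament" and "budget", while neither refers back, which shows why the resulting graph is directed. The adjacency dictionary can be passed to any standard directed-graph algorithm for further analysis.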