Automating dictionary construction from a corpus of out-of-vocabulary word forms
A vast number of documents on the Web have duplicates, which makes it a challenge to develop efficient methods for computing clusters of similar documents. In this paper we use an approach that treats (closed) sets of attributes with large support (large extent) as clusters of similar documents. The method is tested in a series of computer experiments on large public collections of web documents and compared with other established methods and software, such as biclustering, on the same datasets. The practical efficiency of different algorithms for computing frequent closed sets of attributes is also compared.
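The idea of taking closed attribute sets with large support as clusters can be illustrated with a minimal Python sketch. It is not the paper's algorithm or one of the algorithms it compares: it is a naive enumeration that only suits toy inputs. The function name `closed_sets_with_support`, the document ids and the attribute values are all hypothetical, and the attributes are assumed to be some set-valued description of each document (for example, shingles or keywords).

```python
def closed_sets_with_support(docs, min_support):
    """Return {closed attribute set: set of documents sharing it}.

    docs: dict mapping a document id to a set of attributes.
    min_support: minimum number of documents a cluster must contain.
    Naive illustration only; the number of closed sets can grow exponentially.
    """
    # Every closed attribute set is an intersection of the attribute sets
    # of some group of documents, so we collect all such intersections.
    intents = set()
    for attrs in docs.values():
        attrs = frozenset(attrs)
        intents |= {attrs} | {attrs & other for other in intents}

    clusters = {}
    for intent in intents:
        if not intent:
            continue  # the empty attribute set describes no similarity
        # Extent: all documents that contain every attribute of the intent.
        extent = frozenset(d for d, a in docs.items() if intent <= a)
        if len(extent) >= min_support:
            clusters[intent] = extent
    return clusters


# Hypothetical toy collection: d1 and d2 are exact duplicates,
# d3 is a near-duplicate, d4 is unrelated.
docs = {
    "d1": {"a", "b", "c"},
    "d2": {"a", "b", "c"},
    "d3": {"a", "b", "d"},
    "d4": {"x", "y"},
}
clusters = closed_sets_with_support(docs, min_support=2)
for intent, extent in clusters.items():
    print(sorted(intent), "->", sorted(extent))
```

On this toy input the sketch keeps two clusters: the attribute set {a, b, c} shared by d1 and d2, and the smaller set {a, b} shared by d1, d2 and d3. Raising `min_support` to 3 would leave only the second.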
The aim of this work is to present the possibilities of wavelet analysis, a method new to the humanities for analyzing the dynamics of numerical series. Unlike traditional methods, it does not impose strict constraints on the mathematical characteristics of the series, and it also makes it possible to detect non-obvious processes and regularities.