Determining centroids to improve the accuracy of ordinal-invariant pattern clustering
This work continues research on methods for analyzing patterns in
parallel coordinates whose results do not depend on the order of the
input data. The basic operations on objects of ordinal-invariant pattern clusters are
described. It is proved that the centroid of an ordinal-invariant pattern cluster
belongs to the original cluster, which makes it possible to estimate the intracluster
object-centroid distances in the multidimensional feature space. Examples of
revealing the structural similarity of objects in parallel coordinates are given. The
main differences between pattern analysis and cluster analysis
are noted. The methodology for determining the centroid of an ordinal-invariant
pattern cluster is described. An algorithm for combining groups of objects based on
their structural similarity, on the one hand, and minimizing intracluster distances,
on the other, is proposed, which makes it possible to improve the accuracy of the
final results and partially solve the problem of finding similar objects in the presence
of errors in the original data. The proposed algorithm uses intracluster
object-centroid distances and satisfies the following conditions: endogenous
determination of the number and composition of the desired groups of objects
under study; relatively low computational complexity; and independence of the initial
partition from the order of the input data. The operation of the proposed algorithm
is demonstrated on classical data sets. Test results are presented and
show an increase in clustering accuracy.
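
The claim that the centroid stays inside its own cluster can be illustrated with a minimal sketch. It assumes, since the abstract does not spell out the encoding, that an object's pattern is the tuple of signs of all pairwise feature comparisons and that the centroid is the feature-wise mean. The function names `ordinal_code`, `pattern_clusters`, `centroid` and `distances_to_centroid` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def ordinal_code(obj):
    """Assumed encoding: sign of every pairwise feature comparison."""
    return tuple(
        (obj[i] > obj[j]) - (obj[i] < obj[j])
        for i, j in combinations(range(len(obj)), 2)
    )


def pattern_clusters(objects):
    """Group objects by code; the grouping ignores input order."""
    groups = defaultdict(list)
    for obj in objects:
        groups[ordinal_code(obj)].append(obj)
    return dict(groups)


def centroid(members):
    """Feature-wise arithmetic mean of the cluster members."""
    n = len(members)
    return tuple(sum(column) / n for column in zip(*members))


def distances_to_centroid(members):
    """Euclidean object-centroid distances within one cluster."""
    c = centroid(members)
    return [sum((a - b) ** 2 for a, b in zip(m, c)) ** 0.5 for m in members]


# Illustrative data only, not from the paper.
data = [(1, 3, 2), (2, 6, 4), (5, 1, 3), (10, 2, 6)]
clusters = pattern_clusters(data)
for code, members in clusters.items():
    # Averaging preserves each pairwise ordering shared by all members,
    # so the centroid receives the same code as the cluster.
    print(code, ordinal_code(centroid(members)) == code,
          distances_to_centroid(members))
```

Under this assumed encoding, every inequality between two features that holds for all members of a cluster also holds for their means, which is why the centroid's code matches the cluster's code. The object-centroid distances computed here are the quantities the proposed merging algorithm is described as minimizing.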