A-Wardpβ: Effective hierarchical clustering using the Minkowski metric and a fast k-means initialisation
This article analyses the coherence of financial recommendations on the securities of Russian companies. The study is based on approximately 4000 recommendations and forecasts issued by 23 investment banks for around forty securities on the Russian stock market over 2012-2014. The predictive history of each investment bank is treated as a body of evidence in the framework of evidence theory. The coherence of recommendations is evaluated with a conflict measure between bodies of evidence, which is defined on subsets of the set of all evidence. The study of coherence is thereby reduced to an analysis of the values of the conflict measure. This analysis is performed with game-theoretic methods (Shapley index, interaction index), network analysis methods (centralities), fuzzy relation methods, and hierarchical clustering methods.
Recently, a three-stage version of K-Means has been introduced in which not only clusters and their centers, but also feature weights, are adjusted to minimize the summed p-th power of the Minkowski p-distances between entities and the centroids of their clusters. The value of the Minkowski exponent p appears to be instrumental in the ability of the method to recover clusters hidden in data. This paper addresses the problem of finding the best p for a Minkowski metric-based version of K-Means in each of two settings: semi-supervised and unsupervised. It presents experimental evidence that the solutions found with the proposed approaches are sufficiently close to the optimum.
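The criterion described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `minkowski_criterion`, the use of one weight per feature (rather than per cluster), and the toy data are all assumptions made for illustration.

```python
def minkowski_criterion(data, labels, centroids, weights, p):
    """Sum of weighted p-th powers of Minkowski p-distances to assigned centroids.

    Assumption: one weight per feature, entering the criterion as weight ** p.
    """
    total = 0.0
    for row, k in zip(data, labels):
        for x, c, w in zip(row, centroids[k], weights):
            total += (w ** p) * abs(x - c) ** p
    return total


# Hypothetical toy data: three entities, two features, two clusters.
data = [[0.0, 0.0], [2.0, 0.0], [10.0, 4.0]]
labels = [0, 0, 1]
centroids = [[1.0, 0.0], [10.0, 4.0]]
weights = [0.5, 0.5]

value_p2 = minkowski_criterion(data, labels, centroids, weights, p=2)
value_p1 = minkowski_criterion(data, labels, centroids, weights, p=1)
```

Evaluating the same partition at several values of p, as in the last two lines, shows how the exponent changes the contribution of each deviation to the total.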
This paper takes another step towards overcoming a drawback of K-Means, its lack of defense against noisy features, by using feature weights in the criterion. The Weighted K-Means method by Huang et al. is extended to the corresponding Minkowski metric for measuring distances. Under the Minkowski metric, the feature weights become intuitively appealing feature rescaling factors in a conventional K-Means criterion. To see how this can be used in addressing another issue of K-Means, the initial setting, a method for initializing K-Means with anomalous clusters is adapted. The Minkowski metric-based method is experimentally validated on datasets from the UCI Machine Learning Repository and on generated sets of Gaussian clusters, both as they are and with additional uniform random noise features, and appears to be competitive with other K-Means-based feature weighting algorithms.
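The rescaling interpretation can be made explicit with a short worked identity. The notation is assumed rather than taken from the abstract: y_{iv} is the value of feature v for entity i, c_{kv} the centroid of cluster S_k, and w_v a non-negative weight for feature v that enters the criterion raised to the power p.

```latex
\[
\sum_{k=1}^{K}\sum_{i\in S_k}\sum_{v=1}^{V} w_v^{p}\,\lvert y_{iv}-c_{kv}\rvert^{p}
\;=\;
\sum_{k=1}^{K}\sum_{i\in S_k}\sum_{v=1}^{V} \lvert w_v y_{iv}-w_v c_{kv}\rvert^{p},
\qquad w_v \ge 0 .
\]
```

Under these assumptions the weighted criterion on the original data coincides with the unweighted Minkowski criterion on data and centroids rescaled feature by feature by w_v.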
Over the last few decades, performance-based funding models for universities have been introduced, prompting universities to build and implement different strategies in order to compete and remain viable in changing circumstances. In turn, national governments are focused on providing universities with more opportunities to run efficient programmes that advance higher education. This paper includes a detailed review of various taxonomies for structuring universities. More importantly, it develops a typology of higher education institutions that is relevant for the Russian context. The Ward method is used to cluster universities on the basis of their distinctions in terms of the availability of resources, education, and research and development. This typology is verified by assessing the universities' efficiency scores obtained from a modified Data Envelopment Analysis that incorporates universities' heterogeneity. Finally, the paper gives a decision tree for classifying universities bearing in mind their diversity. It might be expanded to a broader set of inputs and outputs, namely external project-based research funding modes and cooperation between universities and industry to pursue the development of innovation. The results can be used for shaping targeted policies aimed at particular university groups.
The Minkowski weighted K-means (MWK-means) is a recently developed clustering algorithm capable of computing feature weights. The cluster-specific weights in MWK-means follow the intuitive idea that a feature with low variance should have a greater weight than a feature with high variance. The final clustering found by this algorithm depends on the selection of the Minkowski distance exponent. This paper explores the possibility of using the central Minkowski partition in the ensemble of all Minkowski partitions for selecting an optimal value of the Minkowski exponent. The central Minkowski partition also appears to be a good consensus partition. Furthermore, we discovered some striking correlations between the Minkowski profile, defined as a mapping of the Minkowski exponent values into the average similarity values of the optimal Minkowski partitions, and the Adjusted Rand Index vectors resulting from the comparison of the obtained partitions to the ground truth. Our findings were confirmed by a series of computational experiments involving synthetic Gaussian clusters and real-world data.
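The Minkowski profile and the central partition can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: the Adjusted Rand Index is assumed as the similarity between partitions, the central partition is assumed to be the one with the highest average similarity to the others, and the partitions below are hypothetical stand-ins for the clusterings found at each exponent.

```python
from math import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand Index between two label lists of equal length (n >= 2)."""
    n = len(a)
    pair_counts, a_counts, b_counts = {}, {}, {}
    for x, y in zip(a, b):
        pair_counts[(x, y)] = pair_counts.get((x, y), 0) + 1
        a_counts[x] = a_counts.get(x, 0) + 1
        b_counts[y] = b_counts.get(y, 0) + 1
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


def minkowski_profile(partitions):
    """Map each exponent p to the mean ARI of its partition with all the others."""
    profile = {}
    for p, labels in partitions.items():
        others = [q for q in partitions if q != p]
        total = sum(adjusted_rand_index(labels, partitions[q]) for q in others)
        profile[p] = total / len(others)
    return profile


# Hypothetical partitions of six entities, keyed by Minkowski exponent.
partitions = {
    1.5: [0, 0, 0, 1, 1, 1],
    2.0: [0, 0, 1, 1, 1, 1],
    3.0: [0, 1, 1, 1, 1, 1],
}
profile = minkowski_profile(partitions)
central_p = max(profile, key=profile.get)  # exponent of the central partition
```

In this toy input the partition at p = 2.0 has the highest average similarity to the other two, so it is selected as central. With real data, each entry of `partitions` would come from running the clustering at that exponent.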
The problem of controlling a nonlinear system subject to uncontrollable disturbances is treated as a differential game. Optimal controls are synthesised by transforming the nonlinear equation of the original system into a differential equation with state-dependent parameters. A quadratic performance functional allows the synthesis conditions to be stated as the problem of solving a Riccati equation. The Riccati equation with state-dependent parameters is solved in symbolic form using algebraic methods, which generalises a number of previously published theoretical results and yields fairly constructive solutions for a number of control problem statements.
The article starts from the observation that the growing demand for master data management systems has not yet produced a commonly accepted methodology for their design and development. It offers two mathematical models that allow a designer of master data management systems to formally describe the system before development and to verify its quality using measurements unique to such systems.