Fuzzy classification and fast rejection rules in the structure-property problem
A new approach to the analysis of the molecule–descriptor matrix in the structure–property problem, based on the fuzzy cluster structure of the training sample, is developed. Methods for constructing fast prediction rejection rules and for searching for outliers in a training sample are described. To that end, a special space of easily computed descriptors is introduced. Optimization of the classifying function with respect to the parameters of fuzzy classification is considered. Prognostic models with a high quality of prediction, based on this approach, are proposed. A comparison of the models is performed, which shows the efficiency of the described methods.
3D-QSAR and molecular docking were applied to predict the inhibitory activity of 196 compounds towards poly(ADP-ribose) polymerase-1 (PARP). The proportion of experimentally active ligands was higher among compounds ranked well by both methods (57%) than among compounds scored as inactive by at least one method (40% for docking-active, QSAR-inactive compounds).
This paper presents a clustering algorithm, namely MFWK-Means, which is a novel extension of K-Means clustering to the case of fuzzy clusters and weighted features. First, the Weighted K-Means criterion utilizing the Minkowski metric is adopted to solve the problem of feature selection for high-dimensional data. Then, a further extension to the case of fuzzy clustering is presented to group datasets with natural fuzziness of cluster boundaries. Also, we adopt an intelligent version of K-Means, using Mirkin’s Anomalous Pattern method for initialization. Our new Minkowski metric Fuzzy Weighted K-Means (MFWK-Means) is experimentally validated on both benchmark datasets and synthetic datasets. MFWK-Means is shown to be competitive and more stable against noise in comparison with a variety of K-Means-based methods. Moreover, in most situations it reaches the highest clustering accuracy over wider intervals of the Minkowski exponent.
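To make the combination of feature weights, a Minkowski exponent, and fuzzy memberships more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes a dissimilarity of the form sum over features of w^p * |x - c|^p and the standard fuzzy c-means membership update with fuzzifier m. The function names, the toy data, and the fixed weights are all hypothetical, and the center update, weight update, and Anomalous Pattern initialization are omitted.

```python
import numpy as np

def weighted_minkowski(X, centers, weights, p):
    """Dissimilarity matrix of shape (n_points, n_clusters).

    Entry (i, k) is sum_v weights[k, v]**p * |X[i, v] - centers[k, v]|**p.
    This form is an assumption made for illustration.
    """
    diff = np.abs(X[:, None, :] - centers[None, :, :])  # (n, k, v)
    return np.sum((weights[None, :, :] ** p) * diff ** p, axis=2)

def fuzzy_memberships(dist, m=2.0, eps=1e-12):
    """Standard fuzzy c-means membership update for fuzzifier m > 1.

    eps avoids division by zero when a point coincides with a center.
    """
    inv = (dist + eps) ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

# Hypothetical toy data: two points near the first center, one at the second.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
weights = np.full((2, 2), 0.5)  # per-cluster feature weights, each row sums to 1

D = weighted_minkowski(X, centers, weights, p=2.0)
U = fuzzy_memberships(D, m=2.0)
print(np.round(U, 3))
```

Each row of `U` sums to one, and a point receives most of its membership from the nearest center. In a full algorithm these two steps would alternate with center and weight updates until the criterion stops decreasing.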
A method for solving the structure–property problem is proposed and developed. It is based on an adaptive choice of the description of molecules and on the automatic selection of the feature space in accordance with the characteristics of the training sample. The problem of combinatorial explosion is solved using the Group Method of Data Handling. Clustering of the objects in the training set is used to improve the predictive ability of the model.
This proceedings publication is a compilation of selected contributions from the “Third International Conference on the Dynamics of Information Systems”, which took place at the University of Florida, Gainesville, February 16–18, 2011. The purpose of this conference was to bring together scientists and engineers from industry, government, and academia in order to exchange new discoveries and results in a broad range of topics relevant to the theory and practice of dynamics of information systems. Dynamics of Information Systems: Mathematical Foundation presents state-of-the-art research and is intended for graduate students and researchers interested in some of the most recent discoveries in information theory and dynamical systems. Scientists in other disciplines may also benefit from the applications of new developments to their own area of study.