Discovering structural alerts for mutagenicity using stable emerging molecular patterns
This study introduces a novel method that automatically extracts potential structural alerts from a data set of molecules. These triggering structures can then be used for knowledge discovery and classification. The structural alerts are computed by a workflow that integrates a graph mining tool guided by two measures, growth rate and stability. Growth rate is a well-established measure of contrast between classes. The extracted patterns correspond to formal concepts, and the most robust of them, named stable emerging patterns (SEPs), are identified by their stability, a new notion originating from formal concept analysis. All of these elements are explained in the paper from a computational point of view. The method was applied to a molecular data set on mutagenicity. The experimental results demonstrate its efficiency: it automatically outputs a manageable number of structural patterns that are strongly related to mutagenicity. Moreover, some of the resulting structures correspond to already known structural alerts. Finally, an in-depth chemical analysis relying on these structures demonstrates how the method can initiate promising processes of chemical knowledge discovery. © 2015 American Chemical Society.
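
The two measures named in the abstract, growth rate and stability, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each molecule is reduced to a set of fragment labels (the paper mines molecular graphs), it uses the textbook definitions (growth rate as the ratio of a pattern's supports in the two classes, stability as the fraction of subsets of a concept's extent that share the concept's intent), and all function names and toy data are hypothetical.

```python
from itertools import combinations
from math import inf


def support(pattern, dataset):
    """Fraction of molecules in `dataset` containing every fragment of `pattern`."""
    if not dataset:
        return 0.0
    return sum(pattern <= mol for mol in dataset) / len(dataset)


def growth_rate(pattern, positives, negatives):
    """Ratio of the pattern's support in the positive class to the negative class."""
    sp = support(pattern, positives)
    sn = support(pattern, negatives)
    if sn == 0:
        # Usual convention: infinite contrast if the pattern occurs only in positives.
        return inf if sp > 0 else 0.0
    return sp / sn


def stability(pattern, dataset):
    """Fraction of subsets of the pattern's extent whose shared fragments equal
    the intent of the whole extent. Brute force, exponential in the extent
    size, so only suitable for toy data."""
    mols = [frozenset(m) for m in dataset]
    extent = [m for m in mols if pattern <= m]
    if not extent:
        return 0.0
    universe = frozenset().union(*mols)       # intent of the empty subset
    intent = frozenset.intersection(*extent)  # closure of the pattern
    hits = 0
    for k in range(len(extent) + 1):
        for subset in combinations(extent, k):
            common = frozenset.intersection(*subset) if subset else universe
            hits += common == intent
    return hits / 2 ** len(extent)


# Hypothetical toy data: fragment sets for mutagenic and non-mutagenic molecules.
mutagens = [
    {"nitro", "aromatic"},
    {"nitro", "aromatic", "amine"},
    {"nitro", "halogen"},
    {"aromatic", "amine"},
]
non_mutagens = [{"aromatic"}, {"halogen"}, {"nitro", "aromatic"}, {"amine"}]

print(growth_rate({"nitro"}, mutagens, non_mutagens))  # 3.0
print(stability({"nitro"}, mutagens))                  # 0.375
```

In this toy example the pattern `{"nitro"}` is three times as frequent among the mutagens as among the non-mutagens, and three of the eight subsets of its extent still have exactly `{"nitro"}` in common. A pattern with high stability depends little on any individual molecule, which is the sense in which stable patterns are called robust.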