Recognition of DNA Secondary Structures as Nucleosome Barriers with Deep Learning Methods
Over the past few years, genome research using machine learning and deep learning techniques has become increasingly popular, giving researchers sophisticated tools for data analysis. Recognition of DNA secondary structure patterns and of genomic functional elements is still poorly investigated, although research in this area has the potential to contribute greatly to medicine and pharmacology. This study explores machine learning and deep learning methods that have proven successful in natural language processing and applies them to the task of DNA sequence recognition. Two deep learning models based on convolutional neural network (CNN) and long short-term memory (LSTM) architectures were developed. Each model was tested on multiple classification tasks for recognizing DNA sequences that contain quadruplexes with a potential function as nucleosome barriers. Additionally, model interpretation analysis was performed by extracting significant CNN filters and transforming them into DNA motifs.
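
To make the interpretation step concrete, the following minimal sketch shows one common way a convolutional filter can be turned into a DNA motif: the filter is slid along one-hot encoded sequences, the best-scoring window of each sequence is collected, and the windows are summarized as a position frequency matrix. This is an illustration under stated assumptions, not the procedure used in this study. The function names, the toy filter weights, and the toy sequences are all hypothetical, and a trained model would supply learned weights instead.

```python
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T); unknown bases become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def filter_activations(seq, weights):
    """Slide a filter (width x 4 weights) along seq and return one score per window."""
    encoded = one_hot(seq)
    width = len(weights)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = sum(
            weights[i][j] * encoded[start + i][j]
            for i in range(width)
            for j in range(4)
        )
        scores.append(score)
    return scores

def filter_to_motif(seqs, weights):
    """Count bases at each position of every sequence's best-scoring window."""
    width = len(weights)
    counts = [[0] * 4 for _ in range(width)]
    for seq in seqs:
        scores = filter_activations(seq, weights)
        if not scores:  # sequence shorter than the filter
            continue
        best = max(range(len(scores)), key=scores.__getitem__)
        window = seq[best:best + width].upper()
        for i, base in enumerate(window):
            if base in BASES:
                counts[i][BASES.index(base)] += 1
    return counts

def consensus(counts):
    """Pick the most frequent base at each motif position."""
    return "".join(BASES[row.index(max(row))] for row in counts)

# Hypothetical toy filter that rewards G at each of three positions,
# loosely resembling a single G-run of a quadruplex-forming sequence.
toy_filter = [[0.0, 0.0, 1.0, 0.0]] * 3
toy_seqs = ["ATGGGTA", "CCGGGAT", "TTAGGGC"]

pfm = filter_to_motif(toy_seqs, toy_filter)
motif = consensus(pfm)
print(pfm)
print(motif)
```

In practice the position frequency matrix would usually be normalized to probabilities and compared against known motif databases; that step is omitted here to keep the sketch short.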