Proceedings 2018 IEEE International Conference on Bioinformatics and Biomedicine
The IEEE BIBM 2018 promises to provide great scientific quality and to have a broad impact, with world-renowned scientists as keynote and invited speakers, contributed talks at a highly competitive acceptance rate, special issue publications in high-caliber scientific journals, and broad participation of the research communities serving on the Program Committee and the organizing committees for workshops, tutorials, and posters. The scientific program highlights five themes to provide breadth, depth, and synergy for research collaboration: (1) genomics and molecular structure, function, and evolution; (2) computational systems biology; (3) medical informatics and translational bioinformatics; (4) cross-cutting computational methods and bioinformatics infrastructures; and (5) healthcare informatics, which includes approximately 20 topics.
Non-B DNA structures have great potential to form and to influence various genomic processes, including transcription. One of the mechanisms of transcription regulation is nucleosome positioning. Even though only B-DNA can be wrapped around a nucleosome, non-B DNA structures can compete with a nucleosome for a genomic location. Here we used permanganate/S1 nuclease footprinting data on non-B DNA structures, such as Z-DNA, H-DNA, G-quadruplexes and stress-induced duplex destabilization (SIDD) sites, together with MNase-seq data on nucleosome positioning in the mouse genome. We found three types of nucleosome positioning patterns around non-B DNA structures: a structure flanked by nucleosomes on both sides, a structure flanked on one side, or a structure located in a nucleosome-free region. Machine learning models based on the random forest and XGBoost algorithms were constructed to recognize DNA regions of 1 kb length containing a particular pattern of nucleosome positioning for four types of DNA structures (Z-DNA, H-DNA, G-quadruplexes and SIDD sites), using di- and tri-nucleotide statistics as features. The best performance (94% accuracy) was reached for G-quadruplexes, while for the other types of structures the accuracy was under 70%. We conclude that 1 kb regions containing G-quadruplexes have distinct compositional properties, which points to preferential genomic locations of such patterns and requires further investigation. Gene ontology analysis revealed that the genes intersecting with the discovered patterns are enriched in channel and transmembrane activity, transcription factor binding and receptor binding. The direction for further research is to study the distribution of the discovered patterns in different tissues in order to identify well-positioned and dynamic nucleosomes and to reveal genes regulated via DNA structures and nucleosome positioning.
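As an illustration of the kind of compositional features mentioned in the abstract above, the following minimal sketch computes relative di- and tri-nucleotide frequencies for a DNA sequence. The function names (`kmer_frequencies`, `composition_features`), the overlapping-window counting, and the handling of ambiguous bases are assumptions made for this example; the abstract does not specify how the statistics were computed.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_frequencies(seq, k):
    """Return relative frequencies of all 4**k k-mers in lexicographic order.

    Assumption: overlapping windows are counted, and windows containing
    symbols other than A, C, G, T (for example N) are skipped.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        # Sequence shorter than k or made only of ambiguous symbols.
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def composition_features(seq):
    """Concatenate 16 dinucleotide and 64 trinucleotide frequencies."""
    return kmer_frequencies(seq, 2) + kmer_frequencies(seq, 3)
```

A feature vector of this kind (80 values per region) could then be passed to any standard classifier, such as a random forest; the choice of classifier settings is outside the scope of this sketch.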
Networks represent a convenient model for many scientific and technological problems. From power grids to biological processes and functions, from financial networks to chemical compounds, representing case studies as graphs makes it possible to highlight both topological and qualitative characteristics. In this work, we are interested in supervised classification models for data in the form of networks. Given two or more classes whose members are networks, we want to build a mathematical model to classify them. We focus on networks with labeled nodes and weighted edges. We define distances between networks and build a classification model. We provide empirical results on datasets of biological interest, with details on graphical model selection.
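To make the idea of distance-based network classification concrete, the sketch below shows one possible setup. It is not the method of the paper, which the abstract does not describe in detail. It assumes undirected networks stored as dictionaries mapping a pair of node labels to an edge weight, uses a Euclidean distance over edge weights (a missing edge counts as weight 0), and classifies with a 1-nearest-neighbor rule. All names (`edge`, `network_distance`, `classify_1nn`) are hypothetical.

```python
import math


def edge(u, v):
    """Canonical key for an undirected edge between node labels u and v."""
    return tuple(sorted((u, v)))


def network_distance(g1, g2):
    """Euclidean distance between two edge-weight maps.

    Assumption: node labels are shared across networks, so edges can be
    matched by their label pair; an edge absent from a network has weight 0.
    """
    edges = set(g1) | set(g2)
    return math.sqrt(sum((g1.get(e, 0.0) - g2.get(e, 0.0)) ** 2 for e in edges))


def classify_1nn(query, training):
    """Return the class of the training network closest to the query.

    training is a list of (network, class_label) pairs.
    """
    nearest = min(training, key=lambda item: network_distance(query, item[0]))
    return nearest[1]


# Toy usage with two labeled training networks.
healthy = {edge("A", "B"): 1.0, edge("B", "C"): 0.2}
disease = {edge("A", "B"): 0.1, edge("B", "C"): 1.5}
sample = {edge("B", "A"): 0.9, edge("B", "C"): 0.3}
predicted = classify_1nn(sample, [(healthy, "healthy"), (disease, "disease")])
```

Because nodes are labeled, edges can be compared directly without solving a graph-matching problem, which keeps the distance cheap to compute; other distances would fit the same classification scheme.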