Kalium 2.0, a comprehensive database of polypeptide ligands of potassium channels
Potassium channels are the most diverse group of ion channels in humans. They play vital roles in numerous physiological processes, and their malfunction gives rise to a range of pathologies. In addition to small molecules, several hundred polypeptide ligands bind to potassium channels, the majority of which have been isolated from animal venoms. Until recently, only scorpion toxins received focused attention, being systematically assembled in the manually curated Kalium database, but there is a diversity of well-characterized potassium channel ligands originating from other sources. To address this issue, here we present the updated and improved Kalium 2.0, which covers virtually all known polypeptide ligands of potassium channels and reviews all available pharmacological data. In addition to this expansion, we have introduced several new features to the database, including posttranslational modification annotation, indication of ligand mode of action, BLAST search, and data export.
High performance querying and ad-hoc querying are commonly viewed as mutually exclusive goals in massively parallel processing databases. At one extreme, a database can be set up to provide the results of a single known query so that the use of available resources is maximized and response time minimized, but at the cost of all other queries being suboptimally executed. At the other extreme, when no query is known in advance, the database must provide the information without such optimization, normally resulting in inefficient execution of all queries. This paper introduces a novel technique, highly normalized Big Data using Anchor modeling, that provides a very efficient way to store information and utilize resources, thereby providing ad-hoc querying with high performance for the first time in massively parallel processing databases. A case study of how this approach has been used for a Data Warehouse at Avito over two years, with estimates for and results of real data experiments carried out in HP Vertica, an MPP RDBMS, is also presented.
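The core idea of a highly normalized, anchor-style layout can be illustrated with a minimal sketch. It is not the Avito schema or the paper's implementation: the entity, attribute names, and the `query` helper are all hypothetical, and plain Python dictionaries stand in for narrow database tables. The point it shows is that each attribute lives in its own table keyed by the anchor identity, so an ad-hoc query touches only the attributes it asks for.

```python
# Hypothetical example data; none of these names come from the paper.
# Anchor: only the identities of "user" entities.
user_anchor = {1, 2, 3}

# One narrow table per attribute, mapping anchor id -> value.
user_city = {1: "Moscow", 2: "Kazan", 3: "Moscow"}
user_signup_year = {1: 2013, 2: 2014}  # user 3 has no row here (no NULL is stored)


def query(anchor, **attributes):
    """Left-join the anchor with only the attribute tables that were requested."""
    return [
        {"id": key, **{name: table.get(key) for name, table in attributes.items()}}
        for key in sorted(anchor)
    ]


# An ad-hoc query on city never reads the signup-year table.
rows = query(user_anchor, city=user_city)
moscow_ids = [row["id"] for row in rows if row["city"] == "Moscow"]
```

In this toy layout, adding a new attribute means adding one more dictionary, and existing queries are unaffected because they never reference it.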
The Radio Observatory of the Lebedev Physical Institute in Pushchino has one of the most sensitive radio telescopes at 110 MHz, the BSA (Big Scanning Antenna). In 2012 the BSA started multi-beam observations using 96 beams in declination from -8 to +42 degrees, in 6 and 32 frequency bands at 109-112 MHz, with time sampling of 0.1 s and 0.0125 s. The data stream with 32 bands and 0.0125 s time sampling produces 87.5 gigabytes per day. The data obtained can be used for short- and long-term monitoring of various classes of radio sources (including radio transients), the Earth's ionosphere, and interplanetary and interstellar plasma. A database has been constructed to facilitate access to the large amount of observational data (see http://astro.prao.ru/cgi/out_img.cgi). We discuss algorithms for the detection and identification of different classes of transients using the database. In particular, we found 83096 events that could be associated with pulsars, scintillation sources, and fast radio transients. These events form a homogeneous sample suitable for statistical analysis.
This paper describes and analyses optimization approaches that make possible the exact calculation of millions of hierarchical count distinct measures over hundreds of billions of data rows. The described approach evolved over several years, in parallel with the growth of tasks at a fast-growing internet company, and was finally implemented as the PEAPM (Pipelined Exact Accumulation for Paralleled Measures) algorithm. The current version of the algorithm outputs exact values (not estimates), works in a single thread, runs in minutes on general commodity hardware, and requires RAM equal to twice the size of the required measures.
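To make the term "hierarchical count distinct measure" concrete, here is a naive baseline sketch. It is not the PEAPM algorithm, which the abstract does not specify; it simply keeps one in-memory set per hierarchy prefix. The function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def hierarchical_count_distinct(rows):
    """Exact distinct counts for every prefix of a hierarchy path.

    rows: iterable of (path, value), where path is a tuple such as (region, city).
    Returns {prefix: number of distinct values seen under that prefix}.
    """
    seen = defaultdict(set)
    for path, value in rows:
        # A value counts toward the full path and toward every ancestor level.
        for depth in range(len(path) + 1):
            seen[path[:depth]].add(value)
    return {prefix: len(values) for prefix, values in seen.items()}


# Hypothetical sample: user "u1" appears in two cities of the same region.
rows = [
    (("EU", "Paris"), "u1"),
    (("EU", "Paris"), "u2"),
    (("EU", "Berlin"), "u1"),
    (("US", "Boston"), "u3"),
]
counts = hierarchical_count_distinct(rows)
```

Here the "EU" measure is 2 even though its two cities have 2 and 1 distinct users: distinct counts cannot be summed up the hierarchy, which is why every level must be computed exactly and why a set-per-measure baseline like this one becomes memory-hungry at scale.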
High performance querying and ad-hoc querying are commonly viewed as mutually exclusive goals in massively parallel processing databases. Furthermore, there is a contradiction between ease of extending the data model and ease of analysis. The modern 'Data Lake' approach promises extreme ease of adding new data to a data model; however, it is prone to eventually becoming a Data Swamp: an unstructured, ungoverned, out-of-control Data Lake where, due to a lack of process, standards, and governance, data is hard to find, hard to use, and consumed out of context. This paper introduces a novel technique, highly normalized Big Data using Anchor modeling, that provides a very efficient way to store information and utilize resources, thereby providing ad-hoc querying with high performance for the first time in massively parallel processing databases. This technique is almost as convenient for expanding the data model as a Data Lake, while being internally protected from turning into a Data Swamp. A case study of how this approach has been used for a Data Warehouse at Avito over a three-year period, with estimates for and results of real data experiments carried out in HP Vertica, an MPP RDBMS, is also presented. This paper is an extension of the work presented at The 34th International Conference on Conceptual Modeling (ER 2015) (Golov and Rönnbäck 2015); it is complemented with numerical results on key operating areas of a highly normalized big data warehouse, collected over several (1-3) years of commercial operation. The limitations imposed by using a single MPP database cluster are also described, and a cluster fragmentation approach is proposed.
One of the key advances in genome assembly that has led to a significant improvement in contig lengths has been improved algorithms for utilization of paired reads (mate-pairs). While in most assemblers, mate-pair information is used in a post-processing step, the recently proposed Paired de Bruijn Graph (PDBG) approach incorporates the mate-pair information directly in the assembly graph structure. However, the PDBG approach faces difficulties when the variation in the insert sizes is high. To address this problem, we first transform mate-pairs into edge-pair histograms that allow one to better estimate the distance between edges in the assembly graph that represent regions linked by multiple mate-pairs. Further, we combine the ideas of mate-pair transformation and PDBGs to construct new data structures for genome assembly: pathsets and pathset graphs.
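The transformation of mate-pairs into edge-pair histograms can be sketched in a heavily simplified form. This is an illustration under stated assumptions, not the paper's construction: graph edges are approximated by plain k-mers, each mate-pair is given as a hypothetical triple `(left_read, right_read, d)` where `d` is the nominal distance between the read starts, and the function names are invented for this example.

```python
from collections import Counter, defaultdict


def kmers(read, k):
    """All k-mers of a read, in order of their start offset."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def edge_pair_histograms(mate_pairs, k):
    """Map each (left k-mer, right k-mer) pair to a histogram of implied distances.

    A k-mer at offset i in the left read and one at offset j in the right read
    are separated by d + j - i when the read starts are d apart.
    """
    hist = defaultdict(Counter)
    for left, right, d in mate_pairs:
        for i, a in enumerate(kmers(left, k)):
            for j, b in enumerate(kmers(right, k)):
                hist[(a, b)][d + j - i] += 1
    return hist


# Hypothetical mate-pairs with slightly varying insert sizes.
mate_pairs = [("ACG", "TGC", 4), ("ACG", "TGC", 5), ("ACG", "TGC", 4)]
hist = edge_pair_histograms(mate_pairs, k=3)
best_distance, support = hist[("ACG", "TGC")].most_common(1)[0]
```

Pooling many mate-pairs per edge pair in this way gives a distribution of distances rather than a single noisy estimate, which is the property that helps when insert-size variation is high.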
Papers about natural protection territories
Many environmental stimuli present a quasi-rhythmic structure at different timescales that the brain needs to decompose and integrate. Cortical oscillations have been proposed as instruments of sensory de-multiplexing, i.e., the parallel processing of different frequency streams in sensory signals. Yet their causal role in such a process has never been demonstrated. Here, we used a neural microcircuit model to address whether coupled theta–gamma oscillations, as observed in human auditory cortex, could underpin the multiscale sensory analysis of speech. We show that, in continuous speech, theta oscillations can flexibly track the syllabic rhythm and temporally organize the phoneme-level response of gamma neurons into a code that enables syllable identification. The tracking of slow speech fluctuations by theta oscillations, and its coupling to gamma-spiking activity both appeared as critical features for accurate speech encoding. These results demonstrate that cortical oscillations can be a key instrument of speech de-multiplexing, parsing, and encoding.
Hypoxia of trophoblast cells is an important regulator of normal development of the placenta. However, some pathological states associated with hypoxia, e.g. preeclampsia, impair the functions of placental cells. An oxyquinoline derivative inhibits HIF-prolyl hydroxylase, thereby stabilizing the HIF-1 transcription complex and thus modeling the cell response to hypoxia. In human choriocarcinoma cells BeWo b30 (a trophoblast model), oxyquinoline increased the expression of core hypoxia response genes, along with up-regulation of the NOS3, PDK1, and BNIP3 genes and down-regulation of the PPARGC1B gene. These changes in the expression profile attest to activation of metabolic cell reprogramming mechanisms aimed at reducing oxygen consumption by enabling the switch from aerobic to anaerobic glucose metabolism and the corresponding decrease in the number of mitochondria. The possibility of practical use of the therapeutic properties of oxyquinoline derivatives is discussed.