Efficient indexing of peptides for database search using Tide

F. L. Acquaye; A. Kertesz-Farkas; Stafford Noble W.

doi:10.1021/acs.jproteome.2c00617

Journal of Proteome Research. 2023. Vol. 22. No. 2. P. 577-584.

Acquaye F. L., Kertesz-Farkas A., Stafford Noble W.

The first step in the analysis of protein tandem mass spectrometry data typically involves searching the observed spectra against a protein database. During database search, the search engine must digest the proteins in the database into peptides, subject to digestion rules that are under user control. The choice of these digestion parameters, as well as the selection of post-translational modifications (PTMs), can dramatically affect the size of the search space and hence the statistical power of the search. The Tide search engine separates the creation of the peptide index from the database search step, thereby saving time by allowing a peptide index to be reused in multiple searches. Here we describe an improved implementation of the indexing component of Tide that consumes around four times fewer resources (CPU and RAM) than the previous version and can generate arbitrarily large peptide databases, limited only by the amount of available disk space. We use this improved implementation to explore the relationship between database size and the parameters controlling digestion and PTMs, as well as between database size and statistical power. Our results can help guide practitioners in the proper selection of these important parameters.
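To illustrate how digestion parameters change the size of the search space, here is a minimal sketch of an in-silico tryptic digest. It is not Tide's implementation. The function name `tryptic_digest`, its parameters, and the toy protein sequence are hypothetical, and the cleavage rule (cut after K or R unless followed by P) is a common simplification assumed here rather than taken from the abstract.

```python
import re


def tryptic_digest(protein, missed_cleavages=0, min_len=6, max_len=50):
    """Return the set of peptides from a simplified in-silico trypsin digest.

    Hypothetical helper for illustration only; real search engines apply
    more rules (modifications, semi-specific cleavage, mass limits).
    """
    # Assumed rule: cleave after K or R unless the next residue is P.
    # The zero-width split needs Python 3.7+; drop any empty trailing piece.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", protein) if f]

    peptides = set()
    for i in range(len(fragments)):
        # Join the fragment with up to `missed_cleavages` following fragments.
        last = min(i + missed_cleavages + 1, len(fragments))
        for j in range(i, last):
            peptide = "".join(fragments[i:j + 1])
            if min_len <= len(peptide) <= max_len:
                peptides.add(peptide)
    return peptides


if __name__ == "__main__":
    toy_protein = "MAAAKGGGGRCCCKPDDDRLLLL"  # made-up sequence, not real data
    for mc in range(3):
        n = len(tryptic_digest(toy_protein, missed_cleavages=mc, min_len=1))
        print(f"missed cleavages = {mc}: {n} peptides")
```

Even on this toy sequence, allowing more missed cleavages increases the number of candidate peptides, which is the kind of growth in database size that the abstract relates to digestion settings.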

Research target: Computer Science; Medical and Health Sciences

Language: English

Keywords: applied data analysis; tandem mass spectrometry

Publication based on the results of:

End-to-end learning for spectrometry data annotation (2023)

The Crux toolkit for analysis of bottom-up tandem mass spectrometry proteomics data

Kertesz-Farkas A., Acquaye F. L., Kishankumar Bhimani et al., Journal of Proteome Research 2023 Vol. 22 No. 2 P. 561-569

The Crux tandem mass spectrometry data analysis toolkit provides a collection of algorithms for analyzing bottom-up proteomics tandem mass spectrometry data. Many publications have described various individual components of Crux, but a comprehensive summary has not been published since 2014. The goal of this work is to summarize the functionality of Crux, focusing on developments ...

Added: December 2, 2022

Real-time low latency estimation of brain rhythms with deep neural networks

Ilia Semenkov, Nikita Fedosov, Makarov I. et al., Journal of Neural Engineering 2023 Vol. 20 No. 5 Article 056008

Objective. Neurofeedback and brain-computer interfacing technology open the exciting opportunity for establishing interactive closed-loop real-time communication with the human brain. This requires interpreting the brain's rhythmic activity and generating timely feedback to the brain. A lower delay between neuronal events and the appropriate feedback increases the efficacy of such interaction. Novel, more efficient approaches capable of tracking brain ...

Added: September 9, 2023

2023 IEEE Ural-Siberian Conference on Computational Technologies in Cognitive Science, Genomics and Biomedicine (CSGB), 28-30 Sept. 2023

IEEE, 2023

The conference will provide an interdisciplinary platform for discussing the problems of cognitive science, genomics, bioinformatics and bioengineering, data processing and computer modeling in medicine & biology by scientists of various profiles, including neurobiologists, geneticists, pharmacologists, researchers of computational and experimental biomedicine. ...

Added: December 6, 2023

Biomarkers of atrial cardiopathy in patients with different pathogenetic subtypes of ischemic stroke

Mekhryakov S. A., Kulesh A. A., Syromyatnikova L. I. et al., Neurology, Neuropsychiatry, Psychosomatics 2020 Vol. 12 No. 6 P. 33-41

Studies of the biomarkers of atrial cardiopathy seem to be promising for identifying patients with cryptogenic stroke (CS), in which an intensive search for atrial fibrillation is indicated. Nevertheless, the diagnostic value of these markers and their threshold values require clarification. Objective: to present the characteristics of echocardiographic markers for atrial cardiopathy and the serum concentration ...

Added: December 14, 2020

Structural Model of a Training Computer Program for Improving Professional Skills of a Student in a Role of a District Polyclinic Physician

Bulatov S., Kharisova E., Lavrenov R. et al., Journal of Robotics, Networking and Artificial Life 2021 Vol. 8 No. 2 P. 122-126

An important feature of medical education in the context of the COVID-19 pandemic is the migration of classes into an online or mixed format, which requires simulation-based teaching methods with elements of robotics and artificial intelligence. We analyzed computer programs for maintaining medical records of patients that are employed by various polyclinics of Kazan ...

Added: November 9, 2021

Cortical and autonomic responses during staged Taoist meditation: Two distinct meditation strategies

Volodina M., Smetanin N., Lebedev M. et al., Plos One 2021 Vol. 16 No. 12 Article e0260626

Meditation is a consciousness state associated with specific physiological and neural correlates. Numerous investigations of these correlates reported controversial results which prevented a consistent depiction of the underlying neurophysiological processes. Here we investigated the dynamics of multiple neurophysiological indicators during a staged meditation session. We measured the physiological changes at rest and during the guided Taoist meditation in experienced meditators ...

Added: January 24, 2022

Proceedings of the first Workshop on Data Analysis in Medicine (WDAM-2017)

EasyChair, 2018

This volume contains proceedings of the first Workshop on Data Analysis in Medicine held in May 2017 at the National Research University Higher School of Economics, Moscow. The volume contains one invited paper by Dr. Svetla Boytcheva, 6 regular contributions and 2 project proposals, carefully selected and reviewed by at least two reviewers from the ...

Added: June 8, 2018

Heart rate response to cognitive load as a marker of depression and increased anxiety

Evgeniia I. Alshanskaia, Natalia A. Zhozhikashvili, Irina S. Polikanova et al., Frontiers in Psychiatry 2024 Vol. 15 Article 1355846

Introduction: Understanding the interplay between cardiovascular parameters, cognitive stress induced by increasing load, and mental well-being is vital for the development of integrated health strategies today. By monitoring physiological signals like electrocardiogram (ECG) and photoplethysmogram (PPG) in real time, researchers can discover how cognitive tasks influence both cardiovascular and mental health. Cardiac biomarkers resulting from ...

Added: July 1, 2024

Machine-Learning for electro-magnetic showers reconstruction in emulsion cloud chambers

V.Belavin, A.Filatov, A.Ustyuzhanin et al., Journal of Physics: Conference Series 2018 Vol. 1085 No. 4 P. 042025-1-042025-6

Traces of electro-magnetic showers in neutrino experiments may be considered as signals of dark-matter particles. For example, the SHiP experiment is going to use emulsion film detectors similar to the ones designed for the OPERA experiment for dark matter search. The goal of this research is to develop an algorithm that can identify traces of electro-magnetic ...

Added: December 8, 2017

Age classification in forensic medicine using machine learning techniques

Zolotenkova G. V., Rogachev A., Pigolkin Yu. I. et al., Modern Technologies in Medicine 2022 Vol. 14 No. 1 P. 15-24

The aim of the study was to assess the capabilities of determining age (age group) at death using classification techniques based on histomorphometric characteristics of osseous and cartilaginous tissue aging. Materials and Methods. The study material was a database containing the findings of morphometric studies of osseous and cartilaginous tissue histologic specimens from 294 categorized male corpses ...

Added: May 25, 2022

VGsim: scalable viral genealogy simulator for global pandemic

Shchur V., Spirin V., Pokrovskiy V. et al., Cold Spring Harbor Laboratory. Series 005140 "Medrxiv". 2021.

As an effort to help contain the COVID-19 pandemic, large numbers of SARS-CoV-2 genomes have been sequenced from all continents. More than one million viral sequences are publicly available as of April 2021. Many studies estimate viral genealogies from these sequences, as these can provide valuable information about the spread of the pandemic across time ...

Added: April 27, 2021

The rise and spread of the SARS-CoV-2 AY.122 lineage in Russia

Klink G., Safina K., Nabieva E. et al., Virus Evolution 2022 Vol. 8 No. 1 Article veac017

Delta has outcompeted most preexisting variants of SARS-CoV-2, becoming the globally predominant lineage by mid-2021. Its subsequent evolution has led to the emergence of multiple sublineages, most of which are well-mixed between countries. By contrast, here we show that nearly the entire Delta epidemic in Russia has probably descended from a single import event, or ...

Added: June 4, 2022

12th International Symposium on Computer Science in Sport. Book of Abstracts

Moscow : State Public Institution of the City of Moscow "Center for Sports Innovative Technologies and National Team Training" of the Moscow Department of Physical Culture and Sport, 2019

The 12th International Symposium of Computer Science in Sports (IACSS 2019) took place July 8-10, 2019 at the Marchuk Institute of Numerical Mathematics of the Russian Academy of Sciences and the Moscow Center of Advanced Sports Technologies (MCAST), both situated in Moscow, Russia. The symposium continued a tradition of conferences starting in 1997 at Cologne, Germany, ...

Added: January 14, 2020

Advances in Intelligent Systems and Computing

Switzerland : Springer, 2019

The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent ...

Added: January 9, 2019

Optimal vaccine allocation during the mumps outbreak in two SIR centres

Chernov A., Kelbert M., Shemendyuk A., Mathematical Medicine and Biology 2020 Vol. 37 No. 3 P. 303-312

The aim of this work is to investigate the optimal vaccine sharing between two SIR centres in the presence of migration fluxes of susceptibles and infected individuals during the mumps outbreak. Optimality of the vaccine allocation means the minimisation of the total number of lost working days during the whole period of epidemic outbreak $[0,t_f]$, ...

Added: May 3, 2019

Obesity and individual performance: the case of eSports

Parshakov P., Iuliia Naidenova, Assanskiy A. et al., International Journal of Obesity 2022 Vol. 46 P. 1518-1526

Background/Objectives: The study considers the problem of the inclusion of people with obesity in the context of the growing role of computer-based work. Negative stereotypes about people with obesity still hold even when they are irrelevant in tasks that require little physical activity. Subjects/Methods: Using data from the realm of competitive video gaming (eSports) and image recognition-based metric ...

Added: May 31, 2022

Priority 2030, Volume 4 of the report. "Global landscape of research and advanced development in the field of human enhancement".

Alshanskaia Sokolova E. I., Martynova O., Ivanov I. et al., Egitas Electronic Publishing House, 2022

Health has always been a significant form of human capital, and managing it wisely has always yielded tangible returns. In the modern world, with its pace, tasks, and challenges, most people have less and less time left to preserve, maintain, and especially strengthen their health. But a solution is taking shape in response to this emerging demand, dictated by society itself, namely by its digital reflection. The digitization of public and ...

Added: May 15, 2023

Studies in Big Data

Switzerland : Springer, 2019

Data management and analysis is one of the fastest growing and most challenging areas of research and development in both academia and industry. Numerous types of applications and services have been studied and re-examined in this field resulting in this edited volume which includes chapters on effective approaches for dealing with the inherent complexity within ...

Added: December 29, 2019

Applied Data Analysis in Energy Monitoring System

Kychkin A.V., Mikriukov G. P., Problems of the Regional Energetics 2016 Vol. 2 No. 31 P. 84-92

The organization of a software and hardware system is presented as an example of building energy monitoring for multi-sectional lighting and climate control / air conditioning needs. The system's key feature is applied analysis of office energy data, which enables localized recognition of the operating mode of each type of hardware. It is based on a general energy consumption profile with following energy consumption ...

Added: November 21, 2017

Scientific and Educational Encyclopedic Portal "Znaniya" (Knowledge)

Autonomous Non-Profit Organization "National Scientific and Educational Center 'Great Russian Encyclopedia'" (ANO BRE), 2022

The main objectives of the Portal are to create and keep up to date a national electronic knowledge base in Russian that consolidates information about the world and is intended for a wide range of users, as well as to popularize science, transfer scientific results into education, foster a culture of data use, and support interdisciplinary scientific communication. The Portal's most important social function is to make truthful scientific information accessible and to build user trust in it. Portal content: the core of the content consists of supplemented and expanded articles of the "Great ...

Added: September 12, 2022

An ontology-based approach to the analysis of the acid-base state of patients at operative measures

Tianxing M., Lushnov M., Ignatov D. I. et al., PeerJ Computer Science 2021 No. 7 Article e777

Researchers working in various domains are focusing on extracting information from data sets by data mining techniques. However, data mining is a complicated task involving multiple complex processes, which makes it unfriendly to researchers without a computing background. Lacking experience, they cannot design suitable workflows that lead to satisfactory results. This article proposes ...

Added: December 14, 2021

SARS-CoV-2 Omicron Outbreak in a Dormitory in Saint-Petersburg, Russia

Bazykin G., Danilenko D., Komissarov A. et al., Research Square Company. Series ResearchSquare "Research Square". 2022.

Added: January 13, 2022

Proceedings of 11th Moscow Conference on Computational Molecular Biology MCCMB'23

IITP RAS, 2023

This volume presents abstracts of the works of participants of the 11th Moscow Conference on Computational Molecular Biology MCCMB'23. The works address current issues in the analysis of amino acid and nucleotide sequences, biopolymer structures, molecular evolution, high-throughput sequencing methods, systems biology, and bioalgorithms. ...

Added: November 30, 2023

Genomic epidemiology of the early stages of SARS-CoV-2 outbreak in Russia

Komissarov A. B., Safina K. R., Garushyants S. K. et al., Cold Spring Harbor Laboratory. Series 005140 "Medrxiv". 2020.

The ongoing pandemic of SARS-CoV-2 presents novel challenges and opportunities for the use of phylogenetics to understand and control its spread. Here, we analyze the emergence of SARS-CoV-2 in Russia in March and April 2020. Combining phylogeographic analysis with travel history data, we estimate that the sampled viral diversity has originated from 67 closely timed ...

Added: July 17, 2020