
Improved False Discovery Rate Estimation Procedure for Shotgun Proteomics

Journal of Proteome Research. 2015. Vol. 14. No. 8. P. 3148–3161.
Kertesz-Farkas A., Keich U., Noble W.

Interpreting the potentially vast number of hypotheses generated by a shotgun proteomics experiment requires a valid and accurate procedure for assigning statistical confidence estimates to the identified tandem mass spectra. Despite the crucial role such procedures play in most high-throughput proteomics experiments, the scientific literature has not reached a consensus about the best confidence estimation methodology. In this work, we evaluate, using theoretical and empirical analysis, four previously proposed protocols for estimating the false discovery rate (FDR) associated with a set of identified tandem mass spectra: two variants of the target-decoy competition protocol (TDC) of Elias and Gygi and two variants of the separate target-decoy search protocol of Käll et al. Our analysis reveals significant biases in the two separate target-decoy search protocols. Moreover, the one TDC protocol that provides an unbiased FDR estimate among the target peptide-spectrum matches (PSMs) does so at the cost of forfeiting a random subset of high-scoring spectrum identifications. We therefore propose the mix-max procedure to provide unbiased, accurate FDR estimates in the presence of well-calibrated scores. The method avoids the biases associated with the two separate target-decoy search protocols and also avoids the propensity for target-decoy competition to discard a random subset of high-scoring target identifications.
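To make the target-decoy competition idea concrete, the following is a minimal sketch of one common way to estimate the FDR among target matches: for each spectrum, the better of its target and decoy scores survives, and the number of surviving decoys above a threshold serves as an estimate of the number of incorrect targets above it. This is not the mix-max procedure proposed in the paper and is not taken from the authors' code. The function name `tdc_fdr`, its arguments, and the deterministic tie-breaking in favour of the target are assumptions made purely for illustration.

```python
from typing import Sequence


def tdc_fdr(
    target_scores: Sequence[float],
    decoy_scores: Sequence[float],
    threshold: float,
) -> float:
    """Illustrative target-decoy competition (TDC) FDR estimate.

    target_scores[i] and decoy_scores[i] are the best target and best decoy
    match scores for spectrum i (higher is better). Hypothetical helper,
    not the paper's mix-max estimator.
    """
    if len(target_scores) != len(decoy_scores):
        raise ValueError("one target and one decoy score per spectrum expected")

    targets_above = 0
    decoys_above = 0
    for t, d in zip(target_scores, decoy_scores):
        # Competition: only the better-scoring match per spectrum survives.
        # Ties go to the target here for determinism (a simplification).
        if t >= d:
            if t >= threshold:
                targets_above += 1
        elif d >= threshold:
            decoys_above += 1

    if targets_above == 0:
        return 0.0
    # Surviving decoys estimate the number of incorrect surviving targets.
    return min(1.0, decoys_above / targets_above)


if __name__ == "__main__":
    targets = [9.0, 8.0, 3.0, 7.0, 2.0]
    decoys = [1.0, 2.0, 6.0, 4.0, 5.0]
    print(tdc_fdr(targets, decoys, threshold=5.0))
```

Note that the spectra whose decoy wins the competition are lost entirely, even if their target score was high; this is the forfeited subset of identifications that the abstract refers to.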

Priority areas: IT and mathematics
Language: English
Full text
Keywords: mass spectrometry, spectrum identification, false discovery rate
Similar publications
Growth in noncommutative algebras and entropy in derived categories
Piontkovski D., / Series arXiv "math". 2026.
A noncommutative projective variety is defined, following Artin and Zhang, by a graded coherent algebra 𝐴. The category of coherent sheaves is then the quotient qgr(𝐴) of the category of finitely presented graded modules by the subcategory of torsion modules. We consider the categorical and polynomial entropies of the Serre twist, that is, of the ...
Added: June 23, 2026
Multilinear nilalgebras and the Jacobian theorem
Piontkovski D., / Series arXiv "math". 2025.
If a symmetric multilinear algebra is weakly nil, then it is Engel. This result may be regarded as an infinite-dimensional analogue of the well-known Jacobian theorem, which states that if a polynomial mapping has a polynomial inverse, then its Jacobian matrix is invertible. This refines a theorem of Gerstenhaber and partially answers a question posed ...
Added: June 23, 2026
ML-based Fast Simulation of FARICH Responses
Shipilov F., Barnyakov A., Ivanov A. et al., / Series Physics "arxiv.org". 2026.
A fast simulation of the detector response is a vital task in high-energy physics (HEP). Traditional Monte-Carlo methods form the backbone of modern particle physics simulation software but are computationally expensive. We present a machine-learning-based approach to fast simulation of the Focusing Aerogel Ring Imaging Cherenkov (FARICH) detector response. Given a particle track and momentum, ...
Added: May 19, 2026
Natural hazard database from Internet publications: text mining with a large language model
Derkacheva A., Sakirkina M., Kraev G. et al., /. 2026.
Comprehensive data on natural hazards and their consequences are crucial for effective risk assessment, adaptation planning, and emergency response. However, many countries face challenges with fragmented, inconsistent, and inaccessible data, particularly regarding local-scale events. To address this data gap in Russia, we developed an end-to-end processing pipeline that scrapes news from various online sources, ...
Added: April 28, 2026
Algorithmic overlaps as thermodynamic variables: from local to cluster Monte Carlo dynamics in critical phenomena
Pilé I., Deng Y., Shchur L., / Series arXiv "math". 2026. No. 2604.10254.
We investigate the spatial overlap of successive spin configurations in Markov chain Monte Carlo simulations using the local Metropolis algorithm and the Svendsen-Wang and Wolff cluster algorithms. We examine the dynamics of these algorithms for two models in different universality classes: the Ising model and the Potts model with three components. The overlap of two ...
Added: April 20, 2026
Using predefined vector systems to speed up neural network multimillion class classification
Gabdullin N., Androsov I., / Series Computer Science "arxiv.org". 2026.
Label prediction in neural networks (NNs) has O(n) complexity proportional to the number of classes. This holds true for classification using fully connected layers and cosine similarity with some set of class prototypes. In this paper we show that if NN latent space (LS) geometry is known and possesses specific properties, label prediction complexity can ...
Added: April 2, 2026
Blood Plasma Lipid Alterations Differentiating Psychotic and Affective Disorder Patients
Petrova D., Biomolecules 2025
Added: January 18, 2026
Iterative Ricci-Foster Curvature Flow with GMM-Based Edge Pruning: A Novel Approach to Community Detection
Sorokin K., Beketov M., Onuchin A. et al., / arxiv.org. Series cs.SI "Social and Information Networks". 2025.
Community detection in complex networks is a fundamental problem, open to new approaches in various scientific settings. We introduce a novel community detection method, based on Ricci flow on graphs. Our technique iteratively updates edge weights (their metric lengths) according to their (combinatorial) Foster version of Ricci curvature computed from effective resistance distance between the ...
Added: January 15, 2026
Implementing Transport Coding in OMNeT++ for Message Delay Reduction
Petrovanov I., Sergeev A., / Series Computer Science "arxiv.org". 2025. No. 2512.18332.
Transport coding reduces message delay in packet-switched networks by introducing controlled redundancy at the transport layer:  original packets are encoded into  coded packets, and the message is reconstructed after the first  successful deliveries, effectively shifting latency from the maximum packet delay to the -th order statistic. We present a concise, reproducible discrete-event implementation of transport coding in OMNeT++, including ...
Added: December 24, 2025
Hessian-based lightweight neural network for brain vessel segmentation on a minimal training dataset
Menshikov I. A., Bernadotte A. K., Elvimov N. S., / Series arXiv "Statistical mechanics". 2025.
Accurate segmentation of blood vessels in brain magnetic resonance angiography (MRA) is essential for successful surgical procedures, such as aneurysm repair or bypass surgery. Currently, annotation is primarily performed through manual segmentation or classical methods, such as the Frangi filter, which often lack sufficient accuracy. Neural networks have emerged as powerful tools for medical image ...
Added: December 1, 2025
An effective stock market trading algorithm: a retrospective analysis based on S&P-500 data.
Rubchinskiy A., Chubarova D., / Series WP7 "Mathematical methods of decision analysis in economics, business and politics". 2025. No. WP7/2025/01.
The article examines one of the most famous examples of socio-economic systems, characterized by significant uncertainty – the S&P-500 stock market, where shares of 500 largest US companies are traded. No assumptions are made about the probabilistic characteristics of the stock market. A flexible algorithm for daily trading has been developed, based on both known fixed data ...
Added: November 9, 2025
The Sphingolipid Asset Is Altered in the Nigrostriatal System of Mice Models of Parkinson’s Disease
Blokhin V., Shupik M., Gutner U. et al., Biomolecules 2022 Vol. 12 No. 1 Article 93
Parkinson’s disease (PD) is a neurodegenerative disease incurable due to late diagnosis and treatment. Therefore, one of the priorities of neurology is to study the mechanisms of PD pathogenesis at the preclinical and early clinical stages. Given the important role of sphingolipids in the pathogenesis of neurodegenerative diseases, we aimed to analyze the gene expression ...
Added: March 4, 2024
Reliability of maximum spanning tree identification in correlation-based market networks
V. A. Kalyagin, A. P. Koldanov, P. A. Koldanov, Physica A: Statistical Mechanics and its Applications 2022 Vol. 599 Article 127482
The maximum spanning tree is a popular tool in market network analysis. A large number of publications are devoted to the maximum spanning tree calculation and its interpretation for particular stock markets. Usually one uses market data to calculate Pearson correlations between stock returns and construct a complete weighted graph, where weights of edges are given by ...
Added: May 18, 2022
Bias in False Discovery Rate Estimation in Mass-Spectrometry-Based Peptide Identification
Danilova Yulia, Voronkova Anastasia, Sulimov Pavel et al., Journal of Proteome Research 2019 Vol. 18 No. 5 P. 2354–2358
Accurate target-decoy-based false discovery rate (FDR) control of peptide identification from tandem mass-spectrometry data relies on an important but often neglected assumption that incorrect spectrum annotations are equally likely to receive either target or decoy peptides. Here we argue that this assumption is often violated in practice, even by popular methods. Preference can be given ...
Added: October 6, 2021
Annotation of tandem mass spectrometry data using stochastic neural networks in shotgun proteomics
Sulimov P., Voronkova A. V., Kertész-Farkas A., Bioinformatics 2020 Vol. 36 No. 12 P. 3781–3787
Motivation: The ability of score functions to discriminate correct from incorrect peptide-spectrum matches in database-searching-based spectrum identification is hindered by many superfluous peaks belonging to unexpected fragmentation ions or by missing peaks of anticipated fragmentation ions. Results: Here, we present a new method, called BoltzMatch, to learn score functions using particular stochastic neural networks, called restricted ...
Added: August 31, 2020
Tailor: A Nonparametric and Rapid Score Calibration Method for Database Search-Based Peptide Identification in Shotgun Proteomics
Sulimov P., Kertesz-Farkas A., Journal of Proteome Research 2020 No. 19(4) P. 1481–1490
Peptide-spectrum-match (PSM) scores used in database searching are calibrated to spectrum- or spectrum-peptide-specific null distributions. Some calibration methods rely on specific assumptions and use analytical models (e.g., binomial distributions), whereas other methods utilize exact empirical null distributions. The former may be inaccurate because of unjustified assumptions, while the latter are accurate, albeit computationally exhaustive. Here, ...
Added: June 29, 2020
ColocML: machine learning quantifies co-localization between mass spectrometry images
Ovchinnikova K., Lachlan S., Rakhlin A. et al., Bioinformatics 2020 P. 1–10
Motivation: Imaging mass spectrometry (imaging MS) is a prominent technique for capturing distributions of molecules in tissue sections. Various computational methods for imaging MS rely on quantifying spatial correlations between ion images, referred to as co-localization. However, no comprehensive evaluation of co-localization measures has ever been performed; this leads to arbitrary choices and hinders method development. Results: We ...
Added: March 15, 2020
A novel trityl/acridine derivatization agent for analysis of thiols by (matrix-assisted)(nanowire-assisted)laser desorption/ionization and electrospray ionization mass spectrometry
Vladimir A. Korshun, Analytical Methods 2017 Vol. 9 No. 45 P. 6335–6340
The derivatization reagent was prepared in situ by the reaction of tris(2,6-dimethoxyphenyl)methylium hexafluorophosphate with N-(2-aminoethyl)maleimide and used for the modification of a number of low molecular weight thiols. The adducts were analyzed by (MA)(NA)LDI MS and ESI MS. All registered mass spectra ((MA)(NA)LDI, ESI) revealed intense peaks of the cations of the derivatization products. The increment of the derivatization agent ...
Added: November 8, 2019
Post-translational modifications of FDA-approved plasma biomarkers in glioblastoma samples
Petushkova N., Zgoda V., Pyatnitskiy M. et al., Plos One 2017 Vol. 12 No. 5 P. 0177427-1–0177427-21
Liquid chromatography-tandem mass spectrometry was used to analyze plasma proteins of volunteers (control) and patients with glioblastoma multiforme (GBM). A database search was pre-set with a variable post-translational modification (PTM): phosphorylation, acetylation or ubiquitination. There were no significant differences between the control and the GBM groups regarding the number of protein identifications, sequence coverage or number of PTMs. However, in GBM plasma, we unambiguously observed a decreased fraction of post-translationally modified peptides ...
Added: March 14, 2018
Threonine versus isothreonine in synthetic peptides analyzed by high-resolution liquid chromatography/tandem mass spectrometry
Kuznetsova K., Trufanov P., Moysa A. et al., Rapid Communications in Mass Spectrometry 2016 Vol. 30 No. 11 P. 1323–1331
One of the problems in proteogenomic research aimed at identification of variant peptides is the presence of peptides with amino acid isomers of different origin in the analyzed samples. Among the most challenging examples are peptides with threonine and isothreonine (homoserine) in their sequences. Indeed, the latter residue may appear in vitro as a methionine substitution during sample preparation for shotgun proteome analysis. Yet, this substitution of ...
Added: March 14, 2018
FDR-controlled metabolite annotation for high-resolution imaging mass spectrometry
Palmer A., Phapale P., Chernyavsky I. et al., Nature Methods 2017 No. 14 P. 57–60
High-mass-resolution imaging mass spectrometry promises to localize hundreds of metabolites in tissues, cell cultures, and agar plates with cellular resolution, but it is hampered by the lack of bioinformatics tools for automated metabolite identification. We report pySM, a framework for false discovery rate (FDR)-controlled metabolite annotation at the level of the molecular sum formula, for ...
Added: February 7, 2017
Database searching in mass spectrometry based proteomics
Kertesz-Farkas A., Myers M. P., Current Bioinformatics 2012 Vol. 7 No. 2 P. 221–230
Bottom-up proteomics (mass spectrometry analysis of peptides obtained by proteolysis and separated by liquid chromatography, LC-MS/MS) is one of the most frequently used techniques for identifying and characterizing proteins in biological samples. A key element of the analysis is database searching, in which the mass spectra of the peptides are compared with a database of theoretically ...
Added: November 18, 2015
Crux: rapid open source protein tandem mass spectrometry analysis
Kertesz-Farkas A., Grant C. E., Howbert J. J. et al., Journal of Proteome Research 2014 Vol. 13 No. 10 P. 4488–4491
Efficiently and accurately analyzing big protein tandem mass spectrometry data sets requires robust software that incorporates state-of-the-art computational, machine learning, and statistical methods. The Crux mass spectrometry analysis software toolkit (http://cruxtoolkit.sourceforge.net) is an open source project that aims to provide users with a cross-platform suite of analysis tools for interpreting protein mass spectrometry data. ...
Added: November 18, 2015
Precursor mass dependent filtering of mass spectra for proteomics analysis
Kertesz-Farkas A., Myers M. P., Protein and peptide letters 2014 Vol. 21 No. 8 P. 858–863
Identification and elimination of noise peaks in mass spectra from large proteomics data streams simultaneously improves the accuracy of peptide identification and significantly decreases the size of the data. There are a number of peak filtering strategies that can achieve this goal. Here we present a simple algorithm wherein the number of highest intensity peaks ...
Added: November 18, 2015