Database searching in mass spectrometry based proteomics
Current Bioinformatics. 2012. Vol. 7. No. 2. P. 221–230.
Kertesz-Farkas A., Myers M. P.
Bottom-up proteomics (mass spectrometry analysis of peptides obtained by proteolysis and separated by liquid chromatography, LC-MS/MS) is one of the most frequently used techniques for identifying and characterizing proteins in biological samples. A key element of the analysis is database searching, in which the mass spectra of the peptides are compared with a database of theoretically computed (or experimental) peptide spectra. Here we discuss the main computational approaches to spectrum database searching and the statistical analysis of the results.
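As an illustration of the comparison step described in the abstract, the following is a minimal sketch rather than any method from the article. It assumes singly charged b- and y-ions only, a small table of standard monoisotopic residue masses, and a plain shared-peak count as the score. The function names (`theoretical_by_ions`, `shared_peak_count`, `search`) and the tolerance values are hypothetical choices made for this example.

```python
# Monoisotopic residue masses in Da (standard values; only a subset of amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111,
}
WATER = 18.01056   # mass of H2O added to the residue sum of an intact peptide
PROTON = 1.00728   # mass of the charge-carrying proton


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def theoretical_by_ions(peptide):
    """Singly charged b- and y-ion m/z values (assumption: no other ion types)."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)           # b-ion: N-terminal prefix
        ions.append(sum(masses[i:]) + WATER + PROTON)   # y-ion: C-terminal suffix
    return ions


def shared_peak_count(observed_mz, peptide, tol=0.02):
    """Number of theoretical ions with an observed peak within tol (in m/z units)."""
    theoretical = theoretical_by_ions(peptide)
    return sum(any(abs(o - t) <= tol for o in observed_mz) for t in theoretical)


def search(observed_mz, precursor_mass, candidates, precursor_tol=0.05):
    """Return (peptide, score) for the best candidate, or None if none fits."""
    best = None
    for peptide in candidates:
        # Only score candidates whose mass agrees with the measured precursor mass.
        if abs(peptide_mass(peptide) - precursor_mass) > precursor_tol:
            continue
        score = shared_peak_count(observed_mz, peptide)
        if best is None or score > best[1]:
            best = (peptide, score)
    return best
```

Calling `search(theoretical_by_ions("GASPK"), peptide_mass("GASPK"), ["VTLER", "SAGPK", "GASPK"])` scores only the two candidates that pass the precursor mass filter and ranks `GASPK` above its isomer `SAGPK`. Real search engines use richer scoring functions and follow the search with statistical assessment of the matches, which this sketch omits.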
Petrova D., Biomolecules 2025
Added: January 18, 2026
Kertesz-Farkas A., Acquaye F. L., Ostapenko V. et al., Journal of Proteome Research 2025 Vol. 24 No. 9 P. 4831–4837
Over the past 30 years, software for searching tandem mass spectrometry data against a protein database has improved dramatically in speed and statistical power. However, existing tools can still struggle to analyze truly massive data sets when either the number of spectra or the number of proteins being analyzed grows too large. Here, we describe ...
Added: September 11, 2025
Blokhin V., Shupik M., Gutner U. et al., Biomolecules 2022 Vol. 12 No. 1 Article 93
Parkinson’s disease (PD) is a neurodegenerative disease incurable due to late diagnosis and treatment. Therefore, one of the priorities of neurology is to study the mechanisms of PD pathogenesis at the preclinical and early clinical stages. Given the important role of sphingolipids in the pathogenesis of neurodegenerative diseases, we aimed to analyze the gene expression ...
Added: March 4, 2024
Kazakova E., Solovyeva E., Levitsky L. et al., Proteomics 2023 Vol. 53 No. 5 Article 2200275
Omics technologies focus on uncovering the complex nature of molecular mechanisms in cells and organisms, including biomarkers and drug targets discovery. Aiming at these tasks, we see that information extracted from omics data is still underused. In particular, characteristics of differentially regulated molecules can be combined in a single score to quantify the signaling pathway ...
Added: September 14, 2023
Kertesz-Farkas A., Acquaye F. L., Bhimani K. et al., Journal of Proteome Research 2023 Vol. 22 No. 2 P. 561–569
The Crux tandem mass spectrometry data analysis toolkit provides a collection of algorithms for analyzing bottom-up proteomics tandem mass spectrometry data. Many publications have described various individual components of Crux, but a comprehensive summary has not been published since 2014. The goal of this work is to summarize the functionality of Crux, focusing on developments ...
Added: December 2, 2022
Sabatier P., Beusch C., Maltseva D. et al., Nature Communications 2021 Vol. 12 Article 6558
Detailed characterization of cell type transitions is essential for cell biology in general and particularly for the development of stem cell-based therapies in regenerative medicine. To systematically study such transitions, we introduce a method that simultaneously measures protein expression and thermal stability changes in cells and provide the web-based visualization tool ProteoTracker. We apply our ...
Added: November 12, 2021
Danilova Y., Voronkova A., Sulimov P. et al., Journal of Proteome Research 2019 Vol. 18 No. 5 P. 2354–2358
Accurate target-decoy-based false discovery rate (FDR) control of peptide identification from tandem mass-spectrometry data relies on an important but often neglected assumption that incorrect spectrum annotations are equally likely to receive either target or decoy peptides. Here we argue that this assumption is often violated in practice, even by popular methods. Preference can be given ...
Added: October 6, 2021
Sulimov P., Kertesz-Farkas A., Journal of Proteome Research 2020 Vol. 19 No. 4 P. 1481–1490
Peptide-spectrum-match (PSM) scores used in database searching are calibrated to spectrum- or spectrum-peptide-specific null distributions. Some calibration methods rely on specific assumptions and use analytical models (e.g., binomial distributions), whereas other methods utilize exact empirical null distributions. The former may be inaccurate because of unjustified assumptions, while the latter are accurate, albeit computationally exhaustive. Here, ...
Added: June 29, 2020
Ovchinnikova K., Lachlan S., Rakhlin A. et al., Bioinformatics 2020 P. 1–10
Motivation
Imaging mass spectrometry (imaging MS) is a prominent technique for capturing distributions of molecules in tissue sections. Various computational methods for imaging MS rely on quantifying spatial correlations between ion images, referred to as co-localization. However, no comprehensive evaluation of co-localization measures has ever been performed; this leads to arbitrary choices and hinders method development.
Results
We ...
Added: March 15, 2020
Korshun V. A., Analytical Methods 2017 Vol. 9 No. 45 P. 6335–6340
The derivatization reagent was prepared in situ by the reaction of tris(2,6-dimethoxyphenyl)methylium hexafluorophosphate with N-(2-aminoethyl)maleimide and used for the modification of a number of low molecular weight thiols. The adducts were analyzed by (MA)(NA) LDI MS and ESI MS. All registered mass spectra ((MA)(NA)LDI, ESI) revealed intense peaks of the cations of the derivatization products. The increment of the derivatization agent ...
Added: November 8, 2019
Kopylov A., Ponomarenko E., Ilgisonis E. et al., Journal of Proteome Research 2019 Vol. 18 No. 1 P. 120–129
This work continues the series of the quantitative measurements of the proteins encoded by different chromosomes in the blood plasma of a healthy person. Selected Reaction Monitoring with Stable Isotope-labeled peptide Standards (SRM SIS) and a gene-centric approach, which is the basis for the implementation of the international Chromosome-centric Human Proteome Project (C-HPP), were applied ...
Added: October 7, 2019
Kuznetsova K., Ivanov M., Pyatnitskiy M. et al., Biochemistry. Biokhimiia 2019 Vol. 84 No. 1 P. 71–78
The brain proteome of Drosophila melanogaster was characterized by liquid chromatography/high-resolution mass spectrometry and compared to the earlier characterized Drosophila whole-body and head proteomes. Raw data for all the proteomes were processed in a similar manner. Approximately 4000 proteins were identified in the brain proteome that represented, as expected, the subsets of the head and ...
Added: October 7, 2019
Petushkova N., Zgoda V., Pyatnitskiy M. et al., Plos One 2017 Vol. 12 No. 5 P. 0177427-1–0177427-21
Liquid chromatography-tandem mass spectrometry was used to analyze plasma proteins of volunteers (control) and patients with glioblastoma multiforme (GBM). A database search was pre-set with a variable post-translational modification (PTM): phosphorylation, acetylation or ubiquitination. There were no significant differences between the control and the GBM groups regarding the number of protein identifications, sequence coverage or number of PTMs. However, in GBM plasma, we unambiguously observed a decreased fraction in post-translationally modified peptides ...
Added: March 14, 2018
Ponomarenko E., Poverennaya E., Ilgisonis E. et al., International Journal of Analytical Chemistry 2016 P. 1–6
This work discusses bioinformatics and experimental approaches to explore the human proteome, a constellation of proteins expressed in different tissues and organs. As the human proteome is not a static entity, it seems necessary to estimate the number of different protein species (proteoforms) and measure the number of copies of the same protein in a ...
Added: March 14, 2018
Kliuchnikova A., Samokhina N., Ilina I. et al., Proteomics 2016 Vol. 16 No. 13 P. 1938–1946
Twenty-nine human aqueous humor samples from patients with eye diseases such as cataract and glaucoma with and without pseudoexfoliation syndrome were characterized by LC-high resolution MS analysis. In total, 269 protein groups were identified with 1% false discovery rate including 32 groups that were not reported previously for this biological fluid. Since the samples were analyzed individually, but not pooled, 36 proteins were identified ...
Added: March 14, 2018
Kuznetsova K., Trufanov P., Moysa A. et al., Rapid Communications in Mass Spectrometry 2016 Vol. 30 No. 11 P. 1323–1331
One of the problems in proteogenomic research aimed at identification of variant peptides is the presence of peptides with amino acid isomers of different origin in the analyzed samples. Among the most challenging examples are peptides with threonine and isothreonine (homoserine) in their sequences. Indeed, the latter residue may appear in vitro as a methionine substitution during sample preparation for shotgun proteome analysis. Yet, this substitution of ...
Added: March 14, 2018
Palmer A., Phapale P., Chernyavsky I. et al., Nature Methods 2017 Vol. 14 P. 57–60
High-mass-resolution imaging mass spectrometry promises to localize hundreds of metabolites in tissues, cell cultures, and agar plates with cellular resolution, but it is hampered by the lack of bioinformatics tools for automated metabolite identification. We report pySM, a framework for false discovery rate (FDR)-controlled metabolite annotation at the level of the molecular sum formula, for ...
Added: February 7, 2017
Kertesz-Farkas A., Grant C. E., Howbert J. J. et al., Journal of Proteome Research 2014 Vol. 13 No. 10 P. 4488–4491
Efficiently and accurately analyzing big protein tandem mass spectrometry data sets requires robust software that incorporates state-of-the-art computational, machine learning, and statistical methods. The Crux mass spectrometry analysis software toolkit (http://cruxtoolkit.sourceforge.net) is an open source project that aims to provide users with a cross-platform suite of analysis tools for interpreting protein mass spectrometry data. ...
Added: November 18, 2015
Kertesz-Farkas A., Myers M. P., Protein and peptide letters 2014 Vol. 21 No. 8 P. 858–863
Identification and elimination of noise peaks in mass spectra from large proteomics data streams simultaneously improves the accuracy of peptide identification and significantly decreases the size of the data. There are a number of peak filtering strategies that can achieve this goal. Here we present a simple algorithm wherein the number of highest intensity peaks ...
Added: November 18, 2015
Kertesz-Farkas A., Myers M. P., Current Bioinformatics 2012 Vol. 7 No. 2 P. 212–220
Mass spectrometry based proteomics analysis can produce many thousands of spectra in a single experiment, and much of this data, frequently greater than 50%, cannot be properly evaluated computationally. Therefore a number of strategies have been developed to aid the processing of mass spectra and typically focus on the identification and elimination of noise, which ...
Added: November 18, 2015