Empirical multi-dimensional space for scoring peptide spectrum matches in shotgun proteomics

Ivanov, Mark V.; Levitsky, Lev I.; Lobas, Anna A.; Panic, Tanja; Laskay, Ünige A.; Mitulovic, Goran; Schmid, Rainer; Pridatchenko, Marina L.; Tsybin, Yury O.; Gorshkov, Mikhail V.

doi:10.1021/pr401026y

2014

Abstract

Data-dependent tandem mass spectrometry (MS/MS) is one of the main techniques for protein identification in shotgun proteomics. In a typical LC MS/MS workflow, peptide product ion mass spectra (MS/MS spectra) are compared with those derived theoretically from a protein sequence database. Scoring of these matches results in peptide identifications. A set of peptide identifications is characterized by false discovery rate (FDR), which determines the fraction of false identifications in the set. The total number of peptides targeted for fragmentation is in the range of 10 000 to 20 000 for a several-hour LC MS/MS run. Typically, <50% of these MS/MS spectra result in peptide-spectrum matches (PSMs). A small fraction of PSMs pass the preset FDR level (commonly 1%) giving a list of identified proteins, yet a large number of correct PSMs corresponding to the peptides originally present in the sample are left behind in the "grey area" below the identity threshold. Following the numerous efforts to recover these correct PSMs, here we investigate the utility of a scoring scheme based on the multiple PSM descriptors available from the experimental data. These descriptors include retention time, deviation between experimental and theoretical mass, number of missed cleavages upon in-solution protein digestion, precursor ion fraction (PIF), PSM count per sequence, potential modifications, median fragment mass error, C-13 isotope mass difference, charge states, and number of PSMs per protein. The proposed scheme utilizes a set of metrics obtained for the corresponding distributions of each of the descriptors. We found that the proposed PSM scoring algorithm differentiates equally or more efficiently between correct and incorrect identifications compared with existing postsearch validation approaches.
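The abstract's notion of filtering PSMs at a preset FDR level (commonly 1%) can be illustrated with a minimal sketch. It assumes a target-decoy estimate of FDR (decoy count divided by target count above a score cutoff), which the abstract does not specify; the function name `filter_by_fdr` and the `(score, is_decoy)` tuple layout are hypothetical, and this is not the scoring scheme proposed by the authors.

```python
from typing import List, Tuple

# Hypothetical PSM representation: (score, is_decoy). Higher score = better match.
PSM = Tuple[float, bool]

def filter_by_fdr(psms: List[PSM], max_fdr: float = 0.01) -> List[PSM]:
    """Return target PSMs in the largest top-scoring set whose estimated FDR
    (decoys / targets, an assumed target-decoy estimate) is <= max_fdr."""
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best_cut = 0  # number of top-ranked PSMs to keep
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest cutoff that still satisfies the FDR limit.
        if targets > 0 and decoys / targets <= max_fdr:
            best_cut = i
    # Report only target PSMs; decoys serve solely for error estimation.
    return [p for p in ranked[:best_cut] if not p[1]]

example = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
strict = filter_by_fdr(example, max_fdr=0.0)
loose = filter_by_fdr(example, max_fdr=0.5)
```

With the strict limit, only the PSMs ranked above the first decoy survive; PSMs below it fall into the "grey area" the abstract describes, which is where additional descriptors could help separate correct from incorrect matches.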

Details

Title Empirical multi-dimensional space for scoring peptide spectrum matches in shotgun proteomics

Author(s) Ivanov, Mark V. ; Levitsky, Lev I. ; Lobas, Anna A. ; Panic, Tanja ; Laskay, Ünige A. ; Mitulovic, Goran ; Schmid, Rainer ; Pridatchenko, Marina L. ; Tsybin, Yury O. ; Gorshkov, Mikhail V.

Published in Journal of Proteome Research

Pagination 10

Pages 140227054632005

Date 2014

Publisher Washington, Amer Chemical Soc

ISSN 1535-3907

Keywords

proteomics; tandem mass spectrometry; peptide identification; false discovery rate; peptide-spectrum matches

DOI https://doi.org/10.1021/pr401026y

Laboratories LSMB

Record Appears in Scientific production and competences > SB - School of Basic Sciences > SB Archives > LSMB - Biomolecular Mass Spectrometry Laboratory
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Published

Record creation date 2014-03-10
