iProphet: Multi-level integrative analysis of shotgun proteomic data improves peptide and protein identification rates and error estimates

Title: iProphet: Multi-level integrative analysis of shotgun proteomic data improves peptide and protein identification rates and error estimates
Publication Type: Journal Article
Year of Publication: 2011
Authors: Shteynberg D, Deutsch EW, Lam H, Eng JK, Sun Z, Tasman N, Mendoza L, Moritz RL, Aebersold R, Nesvizhskii AI
Journal: Molecular & cellular proteomics : MCP
Date Published: Aug 29
PMID: 21876204
Abstract: The combination of tandem mass spectrometry (MS/MS) and sequence database searching is the method of choice for the identification of peptides and the mapping of proteomes. Over the last several years, the volume of data generated in proteomic studies has increased dramatically, which challenges the computational approaches previously developed for these data. Furthermore, a multitude of search engines have been developed that identify different, overlapping subsets of the sample peptides from a particular set of MS/MS spectra. We present iProphet, the new addition to the widely used open-source suite of proteomic data analysis tools Trans-Proteomics Pipeline (TPP). Applied in tandem with PeptideProphet, it provides more accurate representation of the multi-level nature of shotgun proteomic data. iProphet combines the evidence from multiple identifications of the same peptide sequences across different spectra, experiments, precursor ion charge states, and modified states. It also allows accurate and effective integration of the results from multiple database search engines applied to the same data. The use of iProphet in the TPP increases the number of correctly identified peptides at a constant false discovery rate (FDR) as compared to both PeptideProphet and another state-of-the-art tool Percolator. As the main outcome, iProphet permits the calculation of accurate posterior probabilities and FDR estimates at the level of sequence identical peptide identifications, which in turn leads to more accurate probability estimates at the protein level. Fully integrated with the TPP, it supports all commonly used MS instruments, search engines and computer platforms.
The performance of iProphet is demonstrated on two publicly available datasets: data from a human whole cell lysate proteome profiling experiment representative of typical proteomic datasets, and from a set of Streptococcus pyogenes experiments more representative of organism-specific composite datasets.
Short Title: Mol Cell Proteomics
Alternate Journal: Mol Cell Proteomics
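
The abstract refers to FDR estimates derived from posterior probabilities at the level of sequence-identical peptides. As a rough illustration of that idea only, the sketch below computes a model-based FDR as the mean of (1 - p) over accepted identifications, after collapsing spectrum-level matches to one entry per peptide sequence. The function names, the example sequences and probabilities, and the choice to keep the maximum probability per sequence are assumptions made for illustration; this is not iProphet's implementation, which adjusts probabilities with additional models before any such step.

```python
def best_per_peptide(psms):
    """Collapse (sequence, probability) pairs to one entry per sequence.

    Assumption: keep the highest probability seen for each sequence.
    """
    best = {}
    for sequence, prob in psms:
        best[sequence] = max(prob, best.get(sequence, 0.0))
    return best


def fdr_at_threshold(probabilities, threshold):
    """Estimate the FDR among identifications with probability >= threshold.

    Assumes the probabilities are well calibrated, so the expected number of
    false identifications is the sum of (1 - p) over the accepted set.
    """
    accepted = [p for p in probabilities if p >= threshold]
    if not accepted:
        return 0.0  # nothing accepted, so no false discoveries to report
    return sum(1.0 - p for p in accepted) / len(accepted)


if __name__ == "__main__":
    # Hypothetical spectrum-level matches: (peptide sequence, probability).
    psms = [("PEPTIDEK", 0.60), ("PEPTIDEK", 0.99), ("SAMPLER", 0.95), ("ELVISK", 0.40)]
    peptides = best_per_peptide(psms)
    print(fdr_at_threshold(peptides.values(), 0.9))
```

Raising the threshold shrinks the accepted set and lowers the estimated FDR, which is the trade-off a fixed-FDR comparison between tools holds constant.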
