Journal Article FZJ-2016-00614

Scientific Workflow Optimization for Improved Peptide and Protein Identification


2015
BioMed Central London

BMC Bioinformatics 16, 284 (2015) [10.1186/s12859-015-0714-x]


Please use a persistent id in citations:   doi:

Abstract: Background: Peptide-spectrum matching is a common step in most data processing workflows for mass spectrometry-based proteomics. Many algorithms and software packages, both free and commercial, have been developed to address this task. However, these algorithms typically require the user to select instrument- and sample-dependent parameters, such as mass measurement error tolerances and number of missed enzymatic cleavages. In order to select the best algorithm and parameter set for a particular dataset, in-depth knowledge about the data as well as the algorithms themselves is needed. Most researchers therefore tend to use default parameters, which are not necessarily optimal.

Results: We have applied a new optimization framework for the Taverna scientific workflow management system (http://ms-utils.org/Taverna_Optimization.pdf) to find the best combination of parameters for a given scientific workflow to perform peptide-spectrum matching. The optimizations themselves are non-trivial, as demonstrated by several phenomena that can be observed when allowing for larger mass measurement errors in sequence database searches. On-the-fly parameter optimization embedded in scientific workflow management systems enables experts and non-experts alike to extract the maximum amount of information from the data. The same workflows could be used for exploring the parameter space and comparing algorithms, not only for peptide-spectrum matching, but also for other tasks, such as retention time prediction.

Conclusion: Using the optimization framework, we were able to learn about how the data was acquired as well as the explored algorithms. We observed a phenomenon identifying many ammonia-loss b-ion spectra as peptides with N-terminal pyroglutamate and a large precursor mass measurement error. These insights could only be gained with the extension of the common range for the mass measurement error tolerance parameters explored by the optimization framework.

Contributing Institute(s):
  1. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 511 - Computational Science and Mathematical Methods (POF3-511)
  2. 512 - Data-Intensive Science and Federated Computing (POF3-512)

Appears in the scientific report 2015
Database coverage:
Medline ; Creative Commons Attribution CC BY 4.0 ; DOAJ ; OpenAccess ; BIOSIS Previews ; IF < 5 ; JCR ; NCBI Molecular Biology Database ; SCOPUS ; Science Citation Index Expanded ; Thomson Reuters Master Journal List ; Web of Science Core Collection

The record appears in these collections:
Document types > Articles > Journal articles
Workflow collections > Public records
Institute collections > JSC
Publications database
Open Access

 Record created 2016-01-18, last modified 2021-01-29

