TY  - JOUR
AU  - Holl, Sonja
AU  - Mohammed, Yassene
AU  - Zimmermann, Olav
AU  - Palmblad, Magnus
TI  - Scientific Workflow Optimization for Improved Peptide and Protein Identification
JO  - BMC Bioinformatics
VL  - 16
SN  - 1471-2105
CY  - London
PB  - BioMed Central
M1  - FZJ-2016-00614
SP  - 284
PY  - 2015
AB  - Background: Peptide-spectrum matching is a common step in most data processing workflows for mass spectrometry-based proteomics. Many algorithms and software packages, both free and commercial, have been developed to address this task. However, these algorithms typically require the user to select instrument- and sample-dependent parameters, such as mass measurement error tolerances and number of missed enzymatic cleavages. In order to select the best algorithm and parameter set for a particular dataset, in-depth knowledge about the data as well as the algorithms themselves is needed. Most researchers therefore tend to use default parameters, which are not necessarily optimal. Results: We have applied a new optimization framework for the Taverna scientific workflow management system (http://ms-utils.org/Taverna_Optimization.pdf) to find the best combination of parameters for a given scientific workflow to perform peptide-spectrum matching. The optimizations themselves are non-trivial, as demonstrated by several phenomena that can be observed when allowing for larger mass measurement errors in sequence database searches. On-the-fly parameter optimization embedded in scientific workflow management systems enables experts and non-experts alike to extract the maximum amount of information from the data. The same workflows could be used for exploring the parameter space and comparing algorithms, not only for peptide-spectrum matching, but also for other tasks, such as retention time prediction. Conclusion: Using the optimization framework, we were able to learn about how the data was acquired as well as the explored algorithms. We observed a phenomenon identifying many ammonia-loss b-ion spectra as peptides with N-terminal pyroglutamate and a large precursor mass measurement error. These insights could only be gained with the extension of the common range for the mass measurement error tolerance parameters explored by the optimization framework.
LB  - PUB:(DE-HGF)16
UR  - <Go to ISI:>//WOS:000360426000008
C6  - pmid:26335531
DO  - 10.1186/s12859-015-0714-x
UR  - https://juser.fz-juelich.de/record/280903
ER  -