Scientific Workflow Optimization for Improved Peptide and Protein Identification
Journal Article | FZJ-2016-00614
2015
BioMed Central
London
Please use a persistent id in citations: http://hdl.handle.net/2128/9723 doi:10.1186/s12859-015-0714-x
Abstract:

Background: Peptide-spectrum matching is a common step in most data processing workflows for mass spectrometry-based proteomics. Many algorithms and software packages, both free and commercial, have been developed to address this task. However, these algorithms typically require the user to select instrument- and sample-dependent parameters, such as mass measurement error tolerances and number of missed enzymatic cleavages. In order to select the best algorithm and parameter set for a particular dataset, in-depth knowledge about the data as well as the algorithms themselves is needed. Most researchers therefore tend to use default parameters, which are not necessarily optimal.

Results: We have applied a new optimization framework for the Taverna scientific workflow management system (http://ms-utils.org/Taverna_Optimization.pdf) to find the best combination of parameters for a given scientific workflow to perform peptide-spectrum matching. The optimizations themselves are non-trivial, as demonstrated by several phenomena that can be observed when allowing for larger mass measurement errors in sequence database searches. On-the-fly parameter optimization embedded in scientific workflow management systems enables experts and non-experts alike to extract the maximum amount of information from the data. The same workflows could be used for exploring the parameter space and comparing algorithms, not only for peptide-spectrum matching, but also for other tasks, such as retention time prediction.

Conclusion: Using the optimization framework, we were able to learn about how the data was acquired as well as the explored algorithms. We observed a phenomenon identifying many ammonia-loss b-ion spectra as peptides with N-terminal pyroglutamate and a large precursor mass measurement error. These insights could only be gained with the extension of the common range for the mass measurement error tolerance parameters explored by the optimization framework.