Contribution to a conference proceedings/Contribution to a book FZJ-2017-04094

Seamless HPC Integration of Data-Intensive KNIME Workflows via UNICORE

2017
Springer International Publishing Cham
ISBN: 978-3-319-58942-8 (print), 978-3-319-58943-5 (electronic)

Euro-Par 2016: Parallel Processing Workshops / Desprez, Frederic (Editor) ; Cham : Springer International Publishing, 2017, Chapter 39 ; ISSN: 0302-9743 (print), 1611-3349 (electronic) ; ISBN: 978-3-319-58942-8 (print), 978-3-319-58943-5 (electronic) ; doi:10.1007/978-3-319-58943-5
European Conference on Parallel Processing, Euro-Par, Grenoble, France, 24 Aug 2016 - 26 Aug 2016
Cham : Springer International Publishing, Lecture Notes in Computer Science 10104, 480 - 491 [10.1007/978-3-319-58943-5_39]

Please use a persistent id in citations: doi:10.1007/978-3-319-58943-5_39

Abstract: Biological research is increasingly dependent on analyzing vast amounts of microscopy datasets. Technologies such as Fiji/ImageJ2 and KNIME support knowledge extraction from biological data by providing a large set of configurable algorithms and an intuitive pipeline creation and execution interface. The increasing complexity of required analysis pipelines and the growing amounts of data to be processed nurture the desire to run existing pipelines on HPC (High Performance Computing) systems. Here, we propose a solution to this challenge by presenting a new HPC integration method for KNIME (Konstanz Information Miner) using the UNICORE middleware (Uniform Interface to Computing Resources) and its automated data processing feature. We designed the integration to be efficient in processing large data workloads on the server side. On the client side it is seamless and lightweight to only minimally increase the complexity for the users. We describe our novel approach and evaluate it using an image processing pipeline that could previously not be executed on an HPC system. The evaluation includes a performance study of the induced overhead of the submission process and of the integrated image processing pipeline based on a large amount of data. This demonstrates how our solution enables scientists to transparently benefit from vast HPC resources without the need to migrate existing algorithms and pipelines.

Contributing Institute(s):
  1. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 512 - Data-Intensive Science and Federated Computing (POF3-512)

Appears in the scientific report 2017
Database coverage:
Nationallizenz ; SCOPUS
 Record created 2017-06-13, last modified 2021-01-29