Seamless HPC Integration of Data-Intensive KNIME Workflows via UNICORE

Grunzke, Richard; Jug, Florian; Schuller, Bernd; Jäkel, Rene; Myers, Gene; Nagel, Wolfgang E.
doi:10.1007/978-3-319-58943-5_39
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@INPROCEEDINGS{Grunzke:834093,
      author       = {Grunzke, Richard and Jug, Florian and Schuller, Bernd and
                      Jäkel, Rene and Myers, Gene and Nagel, Wolfgang E.},
      title        = {{S}eamless {HPC} {I}ntegration of {D}ata-{I}ntensive
                      {KNIME} {W}orkflows via {UNICORE}},
      volume       = {10104},
      address      = {Cham},
      publisher    = {Springer International Publishing},
      reportid     = {FZJ-2017-04094},
      isbn         = {978-3-319-58942-8},
      series       = {Lecture Notes in Computer Science},
      pages        = {480--491},
      year         = {2017},
      comment      = {Euro-Par 2016: Parallel Processing Workshops / Desprez,
                      Frederic (Editor) ; Cham : Springer International
                      Publishing, 2017, Chapter 39 ; ISSN: 0302-9743=1611-3349 ;
                      ISBN: 978-3-319-58942-8=978-3-319-58943-5 ;
                      doi:10.1007/978-3-319-58943-5},
      booktitle    = {Euro-Par 2016: Parallel Processing Workshops},
      editor       = {Desprez, Frederic},
      abstract     = {Biological research is increasingly dependent on analyzing
                      vast amounts of microscopy datasets. Technologies such as
                      Fiji/ImageJ2 and KNIME support knowledge extraction from
                      biological data by providing a large set of configurable
                      algorithms and an intuitive pipeline creation and execution
                      interface. The increasing complexity of required analysis
                      pipelines and the growing amounts of data to be processed
                      nurture the desire to run existing pipelines on HPC (High
                      Performance Computing) systems. Here, we propose a solution
                      to this challenge by presenting a new HPC integration method
                      for KNIME (Konstanz Information Miner) using the UNICORE
                      middleware (Uniform Interface to Computing Resources) and
                      its automated data processing feature. We designed the
                      integration to be efficient in processing large data
                      workloads on the server side. On the client side it is
                      seamless and lightweight to only minimally increase the
                      complexity for the users. We describe our novel approach and
                      evaluate it using an image processing pipeline that could
                      previously not be executed on an HPC system. The evaluation
                      includes a performance study of the induced overhead of the
                      submission process and of the integrated image processing
                      pipeline based on a large amount of data. This demonstrates
                      how our solution enables scientists to transparently benefit
                      from vast HPC resources without the need to migrate existing
                      algorithms and pipelines.},
      month        = {Aug},
      date         = {2016-08-24},
      organization = {European Conference on Parallel Processing, Grenoble
                      (France), 24 Aug 2016 - 26 Aug 2016},
      cin          = {JSC},
      ddc          = {004},
      cid          = {I:(DE-Juel1)JSC-20090406},
      pnm          = {512 - Data-Intensive Science and Federated Computing
                      (POF3-512)},
      pid          = {G:(DE-HGF)POF3-512},
      typ          = {PUB:(DE-HGF)8 / PUB:(DE-HGF)7},
      UT           = {WOS:000529303100039},
      doi          = {10.1007/978-3-319-58943-5_39},
      url          = {https://juser.fz-juelich.de/record/834093},
}