Journal Article FZJ-2026-00834

Polytope: an algorithm for efficient feature extraction on hypercubes


2025
SpringerOpen Heidelberg [u.a.]

Journal of Big Data 12(1), 243 [10.1186/s40537-025-01306-3]



Abstract: Data extraction algorithms on data hypercubes, or datacubes, are traditionally only capable of cutting boxes of data along the datacube axes. For many use cases, however, this returns much more data than users actually need, leading to unnecessary consumption of I/O resources. In this paper, we propose an alternative feature extraction technique, which computes the indices of exactly those data points contained within user-requested shapes. This enables data storage systems to read and return only the bytes of the datacube that are useful to user applications. Our main algorithm is based on high-dimensional computational geometry concepts and operates by successively reducing polytopes down to the points contained within them. We analyse this algorithm in detail before providing results about its performance and scalability. In particular, we show it is possible to achieve data reductions of up to 99% using this algorithm instead of current state-of-practice data extraction methods, such as meteorological field extractions from ECMWF’s FDB data store, where feature shapes are extracted a posteriori as a post-processing step. As we discuss later on, this novel extraction method will considerably help scale access to large, petabyte-size data hypercubes in a variety of scientific fields.
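
The idea of reducing a shape to the grid points it contains can be illustrated with a minimal two-dimensional sketch. This is not the paper's algorithm or its implementation: the function names (`slice_convex_polygon`, `points_in_polygon`, `bounding_box_count`), the restriction to convex polygons, and the unit-spaced integer grid are all assumptions made here for illustration. The polygon is first cut into one-dimensional segments along the first axis, and each segment is then reduced to the integer indices lying on it.

```python
import math


def slice_convex_polygon(vertices, x):
    """Intersect a convex polygon with the vertical line at x.

    Returns the (y_min, y_max) extent of the resulting 1D segment,
    or None if the line misses the polygon.
    """
    ys = []
    n = len(vertices)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        if x0 == x1:
            # Vertical edge: contributes both endpoints if it lies on the line.
            if x0 == x:
                ys.extend([y0, y1])
        elif min(x0, x1) <= x <= max(x0, x1):
            # Linear interpolation along the edge to find the crossing point.
            t = (x - x0) / (x1 - x0)
            ys.append(y0 + t * (y1 - y0))
    return (min(ys), max(ys)) if ys else None


def points_in_polygon(vertices):
    """List integer grid indices inside a convex polygon, boundary included.

    Hypothetical helper: reduces the 2D shape to 1D segments, then to points.
    Floating-point rounding may drop points that sit exactly on a slanted edge.
    """
    xs = [v[0] for v in vertices]
    points = []
    for x in range(math.ceil(min(xs)), math.floor(max(xs)) + 1):
        segment = slice_convex_polygon(vertices, x)
        if segment is None:
            continue
        y_lo, y_hi = segment
        for y in range(math.ceil(y_lo), math.floor(y_hi) + 1):
            points.append((x, y))
    return points


def bounding_box_count(vertices):
    """Number of grid indices an axis-aligned box around the shape would read."""
    xs = [v[0] for v in vertices]
    ys = [v[1] for v in vertices]
    nx = math.floor(max(xs)) - math.ceil(min(xs)) + 1
    ny = math.floor(max(ys)) - math.ceil(min(ys)) + 1
    return nx * ny


triangle = [(0, 0), (4, 0), (0, 4)]
inside = points_in_polygon(triangle)
print(len(inside), "of", bounding_box_count(triangle), "box points are needed")
```

For this small triangle the sketch selects 15 of the 25 indices that a bounding-box extraction would read. The figure only illustrates the mechanism and says nothing about the reductions reported in the paper.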


Contributing Institute(s):
  1. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)
  2. Earth System Data Exploration (ESDE)

Database coverage:
Medline ; Creative Commons Attribution CC BY 4.0 ; DOAJ ; OpenAccess ; Article Processing Charges ; Clarivate Analytics Master Journal List ; Current Contents - Engineering, Computing and Technology ; DOAJ Seal ; Essential Science Indicators ; Fees ; IF >= 5 ; JCR ; SCOPUS ; Science Citation Index Expanded ; Web of Science Core Collection

The record appears in these collections:
Document types > Articles > Journal Article
Workflow collections > Public records
Institute Collections > JSC
Publications database
Open Access

 Record created 2026-01-22, last modified 2026-02-03

