001     1052199
005     20260203123451.0
024 7 _ |a 10.1186/s40537-025-01306-3
|2 doi
024 7 _ |a 10.34734/FZJ-2026-00834
|2 datacite_doi
037 _ _ |a FZJ-2026-00834
082 _ _ |a 004
100 1 _ |a Leuridan, Mathilde
|0 P:(DE-Juel1)207974
|b 0
|e Corresponding author
245 _ _ |a Polytope: an algorithm for efficient feature extraction on hypercubes
260 _ _ |a Heidelberg [u.a.]
|c 2025
|b SpringerOpen
336 7 _ |a article
|2 DRIVER
336 7 _ |a Output Types/Journal article
|2 DataCite
336 7 _ |a Journal Article
|b journal
|m journal
|0 PUB:(DE-HGF)16
|s 1769067995_24863
|2 PUB:(DE-HGF)
336 7 _ |a ARTICLE
|2 BibTeX
336 7 _ |a JOURNAL_ARTICLE
|2 ORCID
336 7 _ |a Journal Article
|0 0
|2 EndNote
520 _ _ |a Data extraction algorithms on data hypercubes, or datacubes, are traditionally only capable of cutting boxes of data along the datacube axes. For many use cases, however, this returns much more data than users actually need, leading to an unnecessary consumption of I/O resources. In this paper, we propose an alternative feature extraction technique, which carefully computes the indices of data points contained within user-requested shapes. This enables data storage systems to only read and return bytes useful to user applications from the datacube. Our main algorithm is based on high-dimensional computational geometry concepts and operates by successively reducing polytopes down to the points contained within them. We analyse this algorithm in detail before providing results about its performance and scalability. In particular, we show it is possible to achieve data reductions of up to 99% using this algorithm instead of current state-of-practice data extraction methods, such as meteorological field extractions from ECMWF’s FDB data store, where feature shapes are extracted a posteriori as a post-processing step. As we discuss later on, this novel extraction method will considerably help scale access to large petabyte-size data hypercubes in a variety of scientific fields.
536 _ _ |a 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)
|0 G:(DE-HGF)POF4-5111
|c POF4-511
|f POF IV
|x 0
536 _ _ |a Earth System Data Exploration (ESDE)
|0 G:(DE-Juel-1)ESDE
|c ESDE
|x 1
588 _ _ |a Dataset connected to DataCite
700 1 _ |a Hawkes, James
|0 P:(DE-HGF)0
|b 1
700 1 _ |a Smart, Simon
|0 P:(DE-HGF)0
|b 2
700 1 _ |a Danovaro, Emanuele
|0 P:(DE-HGF)0
|b 3
700 1 _ |a Schultz, Martin
|0 P:(DE-Juel1)6952
|b 4
|u fzj
700 1 _ |a Quintino, Tiago
|0 P:(DE-HGF)0
|b 5
773 _ _ |a 10.1186/s40537-025-01306-3
|g Vol. 12, no. 1, p. 243
|0 PERI:(DE-600)2780218-8
|n 1
|p 243
|t Journal of Big Data
|v 12
|y 2025
|x 2196-1115
856 4 _ |u https://juser.fz-juelich.de/record/1052199/files/s40537-025-01306-3.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1052199
|p openaire
|p open_access
|p VDB
|p driver
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 4
|6 P:(DE-Juel1)6952
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-511
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Enabling Computational- & Data-Intensive Science and Engineering
|9 G:(DE-HGF)POF4-5111
|x 0
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0160
|2 StatID
|b Essential Science Indicators
|d 2025-01-06
915 _ _ |a Creative Commons Attribution CC BY 4.0
|0 LIC:(DE-HGF)CCBY4
|2 HGFVOC
915 _ _ |a WoS
|0 StatID:(DE-HGF)0113
|2 StatID
|b Science Citation Index Expanded
|d 2025-01-06
915 _ _ |a Fees
|0 StatID:(DE-HGF)0700
|2 StatID
|d 2025-01-06
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
915 _ _ |a Article Processing Charges
|0 StatID:(DE-HGF)0561
|2 StatID
|d 2025-01-06
915 _ _ |a JCR
|0 StatID:(DE-HGF)0100
|2 StatID
|b J BIG DATA-GER : 2022
|d 2025-11-06
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0200
|2 StatID
|b SCOPUS
|d 2025-11-06
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0300
|2 StatID
|b Medline
|d 2025-11-06
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0501
|2 StatID
|b DOAJ Seal
|d 2025-08-21T14:03:27Z
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0500
|2 StatID
|b DOAJ
|d 2025-08-21T14:03:27Z
915 _ _ |a Peer Review
|0 StatID:(DE-HGF)0030
|2 StatID
|b DOAJ : Anonymous peer review
|d 2025-08-21T14:03:27Z
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0199
|2 StatID
|b Clarivate Analytics Master Journal List
|d 2025-11-06
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1160
|2 StatID
|b Current Contents - Engineering, Computing and Technology
|d 2025-11-06
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0150
|2 StatID
|b Web of Science Core Collection
|d 2025-11-06
915 _ _ |a IF >= 5
|0 StatID:(DE-HGF)9905
|2 StatID
|b J BIG DATA-GER : 2022
|d 2025-11-06
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a journal
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts

