001052199 001__ 1052199
001052199 005__ 20260203123451.0
001052199 0247_ $$2doi$$a10.1186/s40537-025-01306-3
001052199 0247_ $$2datacite_doi$$a10.34734/FZJ-2026-00834
001052199 037__ $$aFZJ-2026-00834
001052199 082__ $$a004
001052199 1001_ $$0P:(DE-Juel1)207974$$aLeuridan, Mathilde$$b0$$eCorresponding author
001052199 245__ $$aPolytope: an algorithm for efficient feature extraction on hypercubes
001052199 260__ $$aHeidelberg [u.a.]$$bSpringerOpen$$c2025
001052199 3367_ $$2DRIVER$$aarticle
001052199 3367_ $$2DataCite$$aOutput Types/Journal article
001052199 3367_ $$0PUB:(DE-HGF)16$$2PUB:(DE-HGF)$$aJournal Article$$bjournal$$mjournal$$s1769067995_24863
001052199 3367_ $$2BibTeX$$aARTICLE
001052199 3367_ $$2ORCID$$aJOURNAL_ARTICLE
001052199 3367_ $$00$$2EndNote$$aJournal Article
001052199 520__ $$aData extraction algorithms on data hypercubes, or datacubes, are traditionally only capable of cutting boxes of data along the datacube axes. For many use cases, however, this returns much more data than users actually need, leading to an unnecessary consumption of I/O resources. In this paper, we propose an alternative feature extraction technique, which carefully computes the indices of data points contained within user-requested shapes. This enables data storage systems to only read and return bytes useful to user applications from the datacube. Our main algorithm is based on high-dimensional computational geometry concepts and operates by successively reducing polytopes down to the points contained within them. We analyse this algorithm in detail before providing results about its performance and scalability. In particular, we show it is possible to achieve data reductions of up to 99% using this algorithm instead of current state-of-practice data extraction methods, such as meteorological field extractions from ECMWF’s FDB data store, where feature shapes are extracted a posteriori as a post-processing step. As we discuss later on, this novel extraction method will considerably help scale access to large, petabyte-size data hypercubes in a variety of scientific fields.
001052199 536__ $$0G:(DE-HGF)POF4-5111$$a5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
001052199 536__ $$0G:(DE-Juel-1)ESDE$$aEarth System Data Exploration (ESDE)$$cESDE$$x1
001052199 588__ $$aDataset connected to DataCite
001052199 7001_ $$0P:(DE-HGF)0$$aHawkes, James$$b1
001052199 7001_ $$0P:(DE-HGF)0$$aSmart, Simon$$b2
001052199 7001_ $$0P:(DE-HGF)0$$aDanovaro, Emanuele$$b3
001052199 7001_ $$0P:(DE-Juel1)6952$$aSchultz, Martin$$b4$$ufzj
001052199 7001_ $$0P:(DE-HGF)0$$aQuintino, Tiago$$b5
001052199 773__ $$0PERI:(DE-600)2780218-8$$a10.1186/s40537-025-01306-3$$gVol. 12, no. 1, p. 243$$n1$$p243$$tJournal of Big Data$$v12$$x2196-1115$$y2025
001052199 8564_ $$uhttps://juser.fz-juelich.de/record/1052199/files/s40537-025-01306-3.pdf$$yOpenAccess
001052199 909CO $$ooai:juser.fz-juelich.de:1052199$$pdnbdelivery$$pdriver$$pVDB$$popen_access$$popenaire
001052199 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)6952$$aForschungszentrum Jülich$$b4$$kFZJ
001052199 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5111$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
001052199 915__ $$0StatID:(DE-HGF)0160$$2StatID$$aDBCoverage$$bEssential Science Indicators$$d2025-01-06
001052199 915__ $$0LIC:(DE-HGF)CCBY4$$2HGFVOC$$aCreative Commons Attribution CC BY 4.0
001052199 915__ $$0StatID:(DE-HGF)0113$$2StatID$$aWoS$$bScience Citation Index Expanded$$d2025-01-06
001052199 915__ $$0StatID:(DE-HGF)0700$$2StatID$$aFees$$d2025-01-06
001052199 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
001052199 915__ $$0StatID:(DE-HGF)0561$$2StatID$$aArticle Processing Charges$$d2025-01-06
001052199 915__ $$0StatID:(DE-HGF)0100$$2StatID$$aJCR$$bJ BIG DATA-GER : 2022$$d2025-11-06
001052199 915__ $$0StatID:(DE-HGF)0200$$2StatID$$aDBCoverage$$bSCOPUS$$d2025-11-06
001052199 915__ $$0StatID:(DE-HGF)0300$$2StatID$$aDBCoverage$$bMedline$$d2025-11-06
001052199 915__ $$0StatID:(DE-HGF)0501$$2StatID$$aDBCoverage$$bDOAJ Seal$$d2025-08-21T14:03:27Z
001052199 915__ $$0StatID:(DE-HGF)0500$$2StatID$$aDBCoverage$$bDOAJ$$d2025-08-21T14:03:27Z
001052199 915__ $$0StatID:(DE-HGF)0030$$2StatID$$aPeer Review$$bDOAJ : Anonymous peer review$$d2025-08-21T14:03:27Z
001052199 915__ $$0StatID:(DE-HGF)0199$$2StatID$$aDBCoverage$$bClarivate Analytics Master Journal List$$d2025-11-06
001052199 915__ $$0StatID:(DE-HGF)1160$$2StatID$$aDBCoverage$$bCurrent Contents - Engineering, Computing and Technology$$d2025-11-06
001052199 915__ $$0StatID:(DE-HGF)0150$$2StatID$$aDBCoverage$$bWeb of Science Core Collection$$d2025-11-06
001052199 915__ $$0StatID:(DE-HGF)9905$$2StatID$$aIF >= 5$$bJ BIG DATA-GER : 2022$$d2025-11-06
001052199 920__ $$lyes
001052199 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001052199 980__ $$ajournal
001052199 980__ $$aVDB
001052199 980__ $$aUNRESTRICTED
001052199 980__ $$aI:(DE-Juel1)JSC-20090406
001052199 9801_ $$aFullTexts