% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@ARTICLE{Manzano:1027280,
      author       = {Manzano, C. and Miskolczi, A. and Stiele, H. and Vybornov,
                      V. and Fieseler, T. and Pfalzner, S.},
      title        = {{L}earning from the present for the future: {T}he {J}ülich
                      {LOFAR} {L}ong-term {A}rchive},
      journal      = {Astronomy and Computing},
      volume       = {48},
      issn         = {2213-1337},
      address      = {Amsterdam [et al.]},
      publisher    = {Elsevier},
      reportid     = {FZJ-2024-03724},
      pages        = {100835},
      year         = {2024},
      abstract     = {The Forschungszentrum Jülich has been hosting the German
                      part of the LOFAR archive since 2013. It is Germany’s most
                      extensive radio astronomy archive, currently storing nearly
                      22 petabytes (PB) of data. Future radio telescopes are
                      expected to require a dramatic increase in long-term data
                      storage. Here, we take stock of the current data management
                      of the Jülich LOFAR Data Archive and describe the ingestion,
                      the storage system, the export to the long-term archive, and
                      the request chain. We analysed the data availability over
                      the last 10 years and examined the underlying data access
                      patterns as well as the energy consumption of the process.
                      We identified hardware-related limiting factors, such as
                      network bandwidth and cache pool availability and
                      performance, and software aspects, e.g. workflow adjustment
                      and parameter tuning, as the main data storage bottlenecks.
                      By contrast, the main challenge in providing archived data
                      to users lies in retrieving the data from the tape archive
                      and staging them. Building on this analysis, we suggest how
                      to avoid or mitigate these problems and define the
                      requirements for future, even more extensive long-term data
                      archives.},
      cin          = {JSC},
      ddc          = {520},
      cid          = {I:(DE-Juel1)JSC-20090406},
      pnm          = {5111 - Domain-Specific Simulation $\&$ Data Life Cycle Labs
                      (SDLs) and Research Groups (POF4-511) / 5121 -
                      Supercomputing $\&$ Big Data Facilities (POF4-512) / DFG
                      project 460248186 - PUNCH4NFDI - Teilchen, Universum, Kerne
                      und Hadronen für die NFDI (460248186) / Big Bang to Big
                      Data - B3D [NRW-Cluster für datenintensive Radioastronomie]
                      (PROFILNRW-2020-038B)},
      pid          = {G:(DE-HGF)POF4-5111 / G:(DE-HGF)POF4-5121 /
                      G:(GEPRIS)460248186 / G:(DE-Juel-1)PROFILNRW-2020-038B},
      typ          = {PUB:(DE-HGF)16},
      UT           = {WOS:001250562200001},
      doi          = {10.1016/j.ascom.2024.100835},
      url          = {https://juser.fz-juelich.de/record/1027280},
}
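
% A minimal usage sketch for citing the entry above with the biber backend
% recommended in the header.  The file name "juser_1027280.bib" is only
% illustrative; save this file under any name and adjust \addbibresource
% accordingly.
%
%   \documentclass{article}
%   \usepackage[backend=biber]{biblatex}
%   \addbibresource{juser_1027280.bib}  % this .bib file, illustrative name
%   \begin{document}
%   The Jülich LOFAR long-term archive is described in \cite{Manzano:1027280}.
%   \printbibliography
%   \end{document}
%
% Compile with pdflatex (or lualatex/xelatex), then biber, then pdflatex again
% so the citation and bibliography resolve.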