Graph Machine Learning for Improved Imputation of Missing Tropospheric Ozone Data

Betancourt, Clara; Li, Cathy W. Y.; Kleinert, Felix; Schultz, Martin G.
doi:10.1021/acs.est.3c05104
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@ARTICLE{Betancourt:1014687,
      author       = {Betancourt, Clara and Li, Cathy W. Y. and Kleinert, Felix
                      and Schultz, Martin G.},
      title        = {{G}raph {M}achine {L}earning for {I}mproved {I}mputation of
                      {M}issing {T}ropospheric {O}zone {D}ata},
      journal      = {Environmental Science \& Technology},
      volume       = {57},
      issn         = {0013-936X},
      address      = {Columbus, Ohio},
      publisher    = {American Chemical Society},
      reportid     = {FZJ-2023-03392},
      pages        = {18246--18258},
      year         = {2023},
      abstract     = {Gaps in the measurement series of atmospheric pollutants
                      can impede the reliable assessment of their impacts and
                      trends. We propose a new method for missing data imputation
                      of the air pollutant tropospheric ozone by using the graph
                      machine learning algorithm “correct and smooth”. This
                      algorithm uses auxiliary data that characterize the
                      measurement location and, in addition, ozone observations at
                      neighboring sites to improve the imputations of simple
                      statistical and machine learning models. We apply our method
                      to data from 278 stations of the year 2011 of the German
                      Environment Agency (Umweltbundesamt – UBA) monitoring
                      network. The preliminary version of these data exhibits
                      three gap patterns: shorter gaps in the range of hours,
                      longer gaps of up to several months in length, and gaps
                      occurring at multiple stations at once. For short gaps of up
                      to 5 h, linear interpolation is most accurate. Longer gaps
                      at single stations are most effectively imputed by a random
                      forest in connection with the correct and smooth algorithm. For longer
                      gaps at multiple stations, the correct and smooth algorithm
                      improved the random forest despite a lack of data in the
                      neighborhood of the missing values. We therefore suggest a
                      hybrid of linear interpolation and graph machine learning
                      for the imputation of tropospheric ozone time series.},
      cin          = {JSC},
      ddc          = {333.7},
      cid          = {I:(DE-Juel1)JSC-20090406},
      pnm          = {5111 - Domain-Specific Simulation \& Data Life Cycle Labs
                      (SDLs) and Research Groups (POF4-511) / IntelliAQ -
                      Artificial Intelligence for Air Quality (787576)},
      pid          = {G:(DE-HGF)POF4-5111 / G:(EU-Grant)787576},
      typ          = {PUB:(DE-HGF)16},
      pubmed       = {37661931},
      UT           = {WOS:001061743500001},
      doi          = {10.1021/acs.est.3c05104},
      url          = {https://juser.fz-juelich.de/record/1014687},
}