000864041 001__ 864041
000864041 005__ 20230712163306.0
000864041 0247_ $$2Handle$$a2128/22561
000864041 037__ $$aFZJ-2019-03957
000864041 041__ $$aEnglish
000864041 1001_ $$0P:(DE-Juel1)165903$$aKaffashzadeh, Najmeh$$b0$$eCorresponding author
000864041 1112_ $$aEuropean Geosciences Union (EGU)$$cVienna$$d2019-04-07 - 2019-04-12$$wAustria
000864041 245__ $$aA Novel Concept for Automated Quality Control of Atmospheric Time Series
000864041 260__ $$c2019
000864041 3367_ $$033$$2EndNote$$aConference Paper
000864041 3367_ $$2BibTeX$$aINPROCEEDINGS
000864041 3367_ $$2DRIVER$$aconferenceObject
000864041 3367_ $$2ORCID$$aCONFERENCE_POSTER
000864041 3367_ $$2DataCite$$aOutput Types/Conference Poster
000864041 3367_ $$0PUB:(DE-HGF)24$$2PUB:(DE-HGF)$$aPoster$$bposter$$mposter$$s1565254873_22949$$xAfter Call
000864041 520__ $$aMeasurements of atmospheric physical and chemical parameters are essential for atmospheric model evaluation, trend analysis, climate prediction, and other applications. Particularly when the time series from various measurement instruments or data providers are merged together, assessing the quality of the data presents a major challenge and often relies on subjective screening. The quality of the time series can be affected by several error types, such as random error, systematic error due to calibration errors, and gross error from malfunctioning instruments, or data processing errors, such as mistyped values and improper date-time formats. Some of these errors may have a considerable impact on the statistical analysis of the time series. Thus, identifying the quality of the data, i.e. quality control (QC), is an essential step for any data analysis. Here, we present a software package for the automated QC of the atmospheric time series based on the use of several algorithms that are in use at various environmental agencies and research initiatives. The tool can either be embedded in automated workflows to process real-time data or be applied to a second-level analysis of archived multi-year data. Several statistical tests are grouped in categories with increasing complexity. Any number of tests can be defined and run sequentially. The set of statistical tests and any user arguments can easily be configured with variable-specific control files in the JSON format. This allows for easy integration into an automated workflow software and distributed data processing services. For expressing the quality of a measured data series, we introduced a probability concept which assigns each value a likelihood of being "good" data. Here, "good" is interpreted in a statistical sense as belonging to an expected probability distribution. Some of the tests influence not only the probability of a single point but may also impact on the probability of its neighboring points. We tested the software with multi-annual hourly ozone and temperature data from the database of the Tropospheric Ozone Assessment Report (TOAR). Preliminary results indicate that the concept works well and is able to deal with a large and heterogeneous dataset such as the global collection of ozone data in the TOAR database.
000864041 536__ $$0G:(DE-HGF)POF3-512$$a512 - Data-Intensive Science and Federated Computing (POF3-512)$$cPOF3-512$$fPOF III$$x0
000864041 536__ $$0G:(EU-Grant)787576$$aIntelliAQ - Artificial Intelligence for Air Quality (787576)$$c787576$$fERC-2017-ADG$$x1
000864041 536__ $$0G:(DE-Juel-1)ESDE$$aEarth System Data Exploration (ESDE)$$cESDE$$x2
000864041 7001_ $$0P:(DE-Juel1)16212$$aSchröder, Sabine$$b1
000864041 7001_ $$0P:(DE-Juel1)6952$$aSchultz, Martin$$b2$$ufzj
000864041 8564_ $$uhttps://juser.fz-juelich.de/record/864041/files/Najmeh_EGU_2019_lv.pdf$$yOpenAccess
000864041 8564_ $$uhttps://juser.fz-juelich.de/record/864041/files/Najmeh_EGU_2019_lv.pdf?subformat=pdfa$$xpdfa$$yOpenAccess
000864041 909CO $$ooai:juser.fz-juelich.de:864041$$pec_fundedresources$$pdriver$$pVDB$$popen_access$$popenaire
000864041 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
000864041 9141_ $$y2019
000864041 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)165903$$aForschungszentrum Jülich$$b0$$kFZJ
000864041 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)16212$$aForschungszentrum Jülich$$b1$$kFZJ
000864041 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)6952$$aForschungszentrum Jülich$$b2$$kFZJ
000864041 9131_ $$0G:(DE-HGF)POF3-512$$1G:(DE-HGF)POF3-510$$2G:(DE-HGF)POF3-500$$3G:(DE-HGF)POF3$$4G:(DE-HGF)POF$$aDE-HGF$$bKey Technologies$$lSupercomputing & Big Data$$vData-Intensive Science and Federated Computing$$x0
000864041 920__ $$lyes
000864041 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
000864041 980__ $$aposter
000864041 980__ $$aVDB
000864041 980__ $$aUNRESTRICTED
000864041 980__ $$aI:(DE-Juel1)JSC-20090406
000864041 980__ $$aOPENSCIENCE
000864041 9801_ $$aFullTexts