000864070 001__ 864070
000864070 005__ 20230127125336.0
000864070 0247_ $$2Handle$$a2128/24952
000864070 037__ $$aFZJ-2019-03979
000864070 041__ $$aEnglish
000864070 1001_ $$0P:(DE-Juel1)165903$$aKaffashzadeh, Najmeh$$b0$$eCorresponding author$$ufzj
000864070 245__ $$aA New Tool for Automated Quality Control of Environmental Data in Open Web Services
000864070 260__ $$c2019
000864070 3367_ $$0PUB:(DE-HGF)25$$2PUB:(DE-HGF)$$aPreprint$$bpreprint$$mpreprint$$s1564042965_25756
000864070 3367_ $$2ORCID$$aWORKING_PAPER
000864070 3367_ $$028$$2EndNote$$aElectronic Article
000864070 3367_ $$2DRIVER$$apreprint
000864070 3367_ $$2BibTeX$$aARTICLE
000864070 3367_ $$2DataCite$$aOutput Types/Working Paper
000864070 520__ $$aWe report on the development of a new software tool (auto-qc) for automated quality control (QC) of environmental time series data. Novel features of this tool include a flexible Python software architecture, which makes it easy for users to configure the sequence of tests as well as their statistical parameters, and a statistical concept to assign each value a probability of being a correct value. There are many occasions when it is necessary to inspect the quality of environmental datasets, from first quality checks during real-time sampling and data transmission to assessing the quality of long-term monitoring data from measurement stations. Erroneous data can have a substantial impact on the statistical data analysis and, for example, lead to wrong estimates of trends. Existing QC workflows largely rely on individual investigator knowledge and have often been constructed from practical considerations alone. Our tool aims to complement traditional data quality analyses and adds some insights into the nature of the individual tests that are being applied.
000864070 536__ $$0G:(DE-HGF)POF3-512$$a512 - Data-Intensive Science and Federated Computing (POF3-512)$$cPOF3-512$$fPOF III$$x0
000864070 536__ $$0G:(EU-Grant)787576$$aIntelliAQ - Artificial Intelligence for Air Quality (787576)$$c787576$$fERC-2017-ADG$$x1
000864070 536__ $$0G:(DE-Juel-1)ESDE$$aEarth System Data Exploration (ESDE)$$cESDE$$x2
000864070 7001_ $$0P:(DE-Juel1)176602$$aKleinert, Felix$$b1$$ufzj
000864070 7001_ $$0P:(DE-Juel1)6952$$aSchultz, Martin$$b2$$ufzj
000864070 8564_ $$uhttps://easychair.org/publications/preprint/cqRB
000864070 8564_ $$uhttps://juser.fz-juelich.de/record/864070/files/EasyChair-Preprint-1325.pdf$$yOpenAccess
000864070 8564_ $$uhttps://juser.fz-juelich.de/record/864070/files/EasyChair-Preprint-1325.pdf?subformat=pdfa$$xpdfa$$yOpenAccess
000864070 909CO $$ooai:juser.fz-juelich.de:864070$$pdnbdelivery$$pec_fundedresources$$pVDB$$pdriver$$popen_access$$popenaire
000864070 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)165903$$aForschungszentrum Jülich$$b0$$kFZJ
000864070 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)176602$$aForschungszentrum Jülich$$b1$$kFZJ
000864070 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)6952$$aForschungszentrum Jülich$$b2$$kFZJ
000864070 9131_ $$0G:(DE-HGF)POF3-512$$1G:(DE-HGF)POF3-510$$2G:(DE-HGF)POF3-500$$3G:(DE-HGF)POF3$$4G:(DE-HGF)POF$$aDE-HGF$$bKey Technologies$$lSupercomputing & Big Data$$vData-Intensive Science and Federated Computing$$x0
000864070 9141_ $$y2019
000864070 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
000864070 920__ $$lyes
000864070 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
000864070 980__ $$apreprint
000864070 980__ $$aVDB
000864070 980__ $$aUNRESTRICTED
000864070 980__ $$aI:(DE-Juel1)JSC-20090406
000864070 9801_ $$aFullTexts