Package DataQualityCheckEuracAlpEnv

Tools for data quality checks on the microclimate station network of the Institute of Alpine Environment - Eurac Research

Introduction

The R package DataQualityCheckEuracAlpEnv provides functions and examples to manage data quality for a network of automatic microclimatic stations. The stations collect data from many sensors and send data tables at regular intervals via GSM using LoggerNet, software developed by Campbell Scientific to manage data loggers. Raw data are affected by many problems caused by lost connections, manual preprocessing, or software updates. We need to check that the downloaded data are well formatted and to detect failures of the installed sensors as soon as possible. For these reasons we developed the DataQualityCheckEuracAlpEnv package, which contains useful functions and scripts for different purposes.

The network of stations managed by the Institute of Alpine Environment consists of 28 microclimatic stations used mainly for research purposes; the fields of study are ecology, hydrology and climate change impact.

The stations belong to 2 projects:

Goals

  1. Manage and check real-time data: collect them, detect possible bugs and outliers, and save them, where possible, as a regular time series usable by researchers. For this purpose we developed the script DQC_Hourly_Linux_v6.R, which runs as a cron job every hour. It is used for urgent problems. To decide when a notification is needed, this script is paired with DQC_Reset_Mail_Status.R.

  2. Manage and check pictures. The script DQC_Pics.R collects and organizes pictures coming from the stations into the storage, highlighting possible corruption to prevent bad-quality pictures from being published on the website.

  3. Analyze problems that occurred in the last period. This is done by the script DQC_Reports.R. It is a maintenance tool that gives an overview of the health status of the stations and sensors, detecting anomalies and exceptional events. This script runs automatically (cron job) every week.

  4. Analyze and fix historical data. This is used to check old data and old files, to detect structure changes and to highlight the typical problems of manual preprocessing. The script that does this is DQC.R, and it is used to prepare historical data.

  5. Clean the download data folder by moving data files with wrong file names into a subfolder. The script DQC_Move_Wrong_Files.R detects possible IP errors by analyzing the file name.
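
As an illustration of the first goal (detecting outliers and saving data as a regular time series), the following is a minimal sketch in base R. It is not the package's actual code: the function name regularize_and_flag, its arguments, the column names time and value, and the temperature limits are all hypothetical and chosen only for this example. It removes duplicated timestamps, aligns the records to a gap-free time grid, and flags missing and out-of-range values.

```r
# Hypothetical helper (not part of the package API): align logger records to
# a regular time grid and flag values outside a plausible physical range.
regularize_and_flag <- function(data, step_secs, lower, upper) {
  # Drop records with a duplicated timestamp, keeping the first one
  data <- data[!duplicated(data$time), ]
  data <- data[order(data$time), ]

  # Build the full, gap-free sequence of expected timestamps
  grid <- seq(from = min(data$time), to = max(data$time), by = step_secs)

  # Look up each expected timestamp; records that are absent become NA
  idx <- match(as.numeric(grid), as.numeric(data$time))
  out <- data.frame(time = grid, value = data$value[idx])

  # Flag gaps and values outside the assumed plausible range
  out$missing <- is.na(out$value)
  out$out_of_range <- !out$missing & (out$value < lower | out$value > upper)
  out
}

# Toy air-temperature series at 15-minute steps with one duplicated record,
# one missing record (00:30) and one implausible value (85 degrees C)
t0 <- as.POSIXct("2019-01-01 00:00:00", tz = "UTC")
raw <- data.frame(
  time  = t0 + c(0, 900, 900, 2700, 3600),
  value = c(1.2, 1.4, 1.4, 85.0, 1.1)
)

checked <- regularize_and_flag(raw, step_secs = 900, lower = -40, upper = 50)
print(checked)
```

In this sketch the flagged rows are kept rather than deleted, so that a later reporting step could still count gaps and suspicious values per sensor.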

For stability reasons the scripts run on a Linux virtual machine called HPCgeo01, prepared by the ICT. For the historical analysis, the script was structured for use on a Windows machine. We are developing a user-friendly interface to help the user configure the path structure and settings.

How to install

git clone https://gitlab.inf.unibz.it/Christian.Brida/dataqualitycheckeuracalpenv.git

Repository structure

Contributors & Contacts:



bridachristian/DataQualityCheckEuracAlpEnv documentation built on Oct. 27, 2019, 5:55 p.m.