kendallRandomPackage

The goal of this package is to make analyses of air pollution data taken from the GIOŚ website simple and manageable. Measurement data are scattered across many spreadsheet files, which makes them unsuitable for automated analyses by pollutant and year in R. This package provides basic tools for converting those datasets to an R-friendly format, along with a Shiny app for visualizations and GEV distribution fitting.

About the data

Data up to 2015 can be found on the GIOŚ website. Data from 2016 will be posted soon. The ZIP files contain spreadsheets, each with data on the pollutant given in the file name. 1g in the file name marks hourly data, 24g marks daily measurements. Different units are used for different pollutants, and they are not given in the spreadsheets. They can be checked on other GIOŚ pages. Alternatively, one can use standardized data.
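The naming convention above can be read off programmatically. The sketch below is only an illustration: parseGiosFileName is a hypothetical helper that is not part of the package, and the "<year>_<pollutant>_<1g|24g>" pattern is an assumption about how the downloaded files are named.

```r
# Hypothetical helper (not part of the package): split a file name into
# year, pollutant and time resolution.
# ASSUMPTION: files are named "<year>_<pollutant>_<1g|24g>.<extension>".
parseGiosFileName <- function(fileName) {
  base <- tools::file_path_sans_ext(basename(fileName))
  parts <- strsplit(base, "_", fixed = TRUE)[[1]]
  if (length(parts) != 3 || !parts[3] %in% c("1g", "24g")) {
    stop("Unexpected file name: ", fileName)
  }
  list(
    year = as.integer(parts[1]),
    pollutant = parts[2],
    # 1g marks hourly data, 24g marks daily measurements
    resolution = if (parts[3] == "1g") "hourly" else "daily"
  )
}

info <- parseGiosFileName("2015_PM10_24g.xlsx")
```

If the actual file names differ, only the splitting step needs to change.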

Some stations' names were changed. All the names can be found in the stationCodes vector, which is part of the package. Old and new names can be used interchangeably when calling the functions importGiosFromXLSX and importGiosFromCSV.

Some remarks

  1. Not all files contain data for all stations. If a station is not found in a given file, an empty tibble (a data frame) is returned.
  2. The spreadsheet format is problematic for many reasons. The two main problems are memory usage and missing observations. Importing csv files is at least 6 times faster. Also, when there are many missing observations for a given station, R can't guess the type of a column correctly, which causes errors. Thus, converting the files to csv format is advised. For example, this can be done automatically under Linux using the LibreOffice headless utility.
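The conversion mentioned in the second remark can be driven from R. This is a minimal sketch, not something shipped with the package: buildConvertArgs and convertToCsv are hypothetical helpers, and it assumes that the LibreOffice binary is available on the PATH as soffice. If the binary or the xlsx files are missing, nothing is run.

```r
# Hypothetical helper: arguments for LibreOffice's headless csv conversion.
buildConvertArgs <- function(xlsxFiles, outDir) {
  c("--headless", "--convert-to", "csv",
    "--outdir", shQuote(outDir), shQuote(xlsxFiles))
}

# Hypothetical helper: convert every xlsx file in inDir to csv.
# ASSUMPTION: the LibreOffice binary is called "soffice" on this system.
convertToCsv <- function(inDir, outDir = inDir, binary = "soffice") {
  xlsxFiles <- list.files(inDir, pattern = "\\.xlsx$", full.names = TRUE)
  # Do nothing when there is nothing to convert or LibreOffice is not found
  if (length(xlsxFiles) == 0 || Sys.which(binary) == "") {
    return(invisible(character(0)))
  }
  system2(binary, buildConvertArgs(xlsxFiles, outDir))
  # Paths where the converted files are expected to appear
  invisible(file.path(outDir, sub("\\.xlsx$", ".csv", basename(xlsxFiles))))
}
```

A call such as convertToCsv("path/to/unzipped/files") would then leave csv files next to the spreadsheets, ready for importGiosFromCSV.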

Required packages

The dplyr, tibble, lubridate, stringr and tidyr packages are required to use this package. If data are imported using importGiosFromXLSX, the readxl package must also be loaded. If importGiosFromCSV is used, the readr package is required.
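Before importing, it may help to check which of these packages are available. The snippet below is a small sketch using base R only; missingPackages is a hypothetical helper, and readr is added to the list on the assumption that the csv route is used.

```r
# Packages named above as always required
required <- c("dplyr", "tibble", "lubridate", "stringr", "tidyr")

# Hypothetical helper: return the packages that are not installed
missingPackages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# ASSUMPTION: csv import is used, so readr is checked as well
toInstall <- missingPackages(c(required, "readr"))
if (length(toInstall) > 0) {
  message("Please install: ", paste(toInstall, collapse = ", "))
}
```

For the spreadsheet route, "readxl" would take the place of "readr" in the call.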

Importing data

Examples for the importGiosFromXLSX and importGiosFromCSV functions can be found in the package documentation using the ? syntax.

Both of these functions search for raw files downloaded from the GIOŚ website, or csv versions of these files, under their default names or under names provided as arguments to these functions. Details can be found in the package documentation. For almost all purposes, only the arguments station, polutants, years and path must be provided.

Maxima over a given period can be calculated using the calculateMaxima function. Again, the details can be found using the ? syntax.
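To make the idea of maxima over a period concrete, here is a base R sketch on made-up hourly values. It does not use calculateMaxima and makes no claim about how that function works internally; the data frame and its column names are invented for illustration.

```r
# Toy hourly series covering three days (values are made up)
hourly <- data.frame(
  time = seq(as.POSIXct("2015-01-01 00:00", tz = "UTC"),
             by = "hour", length.out = 72),
  value = c(1:24, 24:1, rep(10, 24))
)

# Label every observation with its calendar day
hourly$day <- as.Date(hourly$time, tz = "UTC")

# One maximum per day: 24, 24 and 10 for this toy series
dailyMaxima <- aggregate(value ~ day, data = hourly, FUN = max)
```

Block maxima of this kind, taken over days, months or years, are the usual input for GEV distribution fitting.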



mstaniak/giosDownloader documentation built on Aug. 27, 2019, 6:33 a.m.