mc_read_data: Reading files with locality metadata

View source: R/read.R

mc_read_data R Documentation

Reading files with locality metadata

Description

This function takes two tables as parameters.

(i) files_table is a required parameter. It must contain paths pointing to raw csv logger files, the specification of the data format (logger type), and the locality name.

(ii) localities_table is optional. It contains the locality id and metadata such as longitude, latitude, and elevation.

Usage

mc_read_data(
  files_table,
  localities_table = NULL,
  clean = TRUE,
  silent = FALSE,
  user_data_formats = NULL
)

Arguments

files_table

path to csv file or data.frame object (see example) with 3 required columns and a few optional ones:

required columns:

  • path - path to files

  • locality_id - unique locality id

  • data_format - data format of the source file; see mc_data_formats, names(mc_data_formats)

optional columns:

  • serial_number - logger serial number. If NA, then myClim tries to detect the serial number from the file name (for TOMST) or the header (for HOBO).

  • logger_type - type of logger. This defines the attributes of the individual sensors of the logger (measurement heights and physical units). Important when combining data from multiple loggers at one locality. If not provided, myClim tries to detect logger_type from the source data file structure (applicable for HOBO, Dendro, Thermo and TMS), but automatic detection of TMS_L45 is not possible. Pre-defined logger types are: "Dendro", "HOBO", "Thermo", "TMS", "TMS_L45". Default sensor heights based on logger type are defined in the table mc_data_heights.

  • date_format - A character vector specifying the custom date format(s) for the lubridate::parse_date_time() function (e.g., "%d.%m.%Y %H:%M:%S"). Multiple formats can be defined either in the CSV or in the R data.frame using the @ character as separator (e.g., "%d.%m.%Y %H:%M:%S@%Y.%m.%d %H:%M:%S"). The first matching format is selected for parsing; multiple formats can be applied to a single file.

  • tz_offset - If source datetimes are not in UTC, the offset from UTC can be defined here in minutes. A value in this column has the highest priority. If NA, myClim tries to detect the time zone from the file. If the time zone cannot be detected, UTC is assumed. In the HOBO format the time zone offset can be defined in the header; in that case the function tries to detect the offset automatically. Ignored for the TOMST TMS data format (those data are always in UTC).

  • step - Time step of the microclimatic time-series in seconds. When provided, it is used in mc_prep_clean instead of automatic step detection. See details.
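As a rough illustration of the columns described above, a files_table can be built directly as a data.frame. The paths, locality ids and serial number below are hypothetical, and the data_format values are assumed to be valid keys in names(mc_data_formats); check them against your installed version of myClim.

```r
# Hypothetical files_table; paths, ids and serial numbers are made up.
files_table <- data.frame(
  path          = c("data/data_91184101_0.csv", "data/20024354.txt"),
  locality_id   = c("A1E01", "A2E02"),
  data_format   = c("TOMST", "HOBO"),   # assumed keys of mc_data_formats
  serial_number = c("91184101", NA),    # NA: left to automatic detection
  logger_type   = c("TMS", "HOBO"),
  # two candidate formats for the second file, separated by "@"
  date_format   = c(NA, "%d.%m.%Y %H:%M:%S@%Y.%m.%d %H:%M:%S"),
  tz_offset     = c(NA, 120),           # minutes from UTC
  step          = c(900, NA),           # seconds
  stringsAsFactors = FALSE
)

# How the "@" separator divides the string into individual formats
# (illustration only, not the package's internal code).
candidate_formats <- strsplit(files_table$date_format[2], "@", fixed = TRUE)[[1]]

# With real files in place, the table could then be passed on:
# my_data <- mc_read_data(files_table)
```

Only path, locality_id and data_format are required; the remaining columns can be dropped if automatic detection is sufficient.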

localities_table

path to csv file (e.g. "c:/user/localities.table.csv") or R data.frame; see example. The localities table is optional (default NULL). The locality_id is the only required column; other columns are optional. Columns whose names correspond to the myClim pre-defined locality metadata (elevation, lon_wgs84, lat_wgs84, tz_offset) are associated with those pre-defined metadata slots; other columns are written into metadata@user_data (see myClim-package).

required columns:

  • locality_id - unique locality id

optional columns:

  • elevation - elevation (in m)

  • lon_wgs84 - longitude (in decimal degrees)

  • lat_wgs84 - latitude (in decimal degrees)

  • tz_offset - locality time zone offset from UTC, applicable for converting time-series from UTC to local time.

  • ... - any other columns are imported to metadata@user_data
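A minimal sketch of a localities_table as a data.frame, assuming made-up locality ids and coordinates. The habitat column is a hypothetical extra column, included only to show which columns would fall outside the pre-defined metadata names listed above.

```r
# Hypothetical localities_table; all values are invented for illustration.
localities_table <- data.frame(
  locality_id = c("A1E01", "A2E02"),
  elevation   = c(255, 545),            # m
  lon_wgs84   = c(15.088, 15.092),      # decimal degrees
  lat_wgs84   = c(50.001, 50.012),      # decimal degrees
  tz_offset   = c(60, 60),              # minutes from UTC
  habitat     = c("forest", "meadow"),  # hypothetical user-defined column
  stringsAsFactors = FALSE
)

# Columns matching the pre-defined metadata names from the list above;
# anything else is expected to end up in metadata@user_data.
predefined <- c("locality_id", "elevation", "lon_wgs84", "lat_wgs84", "tz_offset")
user_columns <- setdiff(names(localities_table), predefined)

# With a matching files_table available:
# my_data <- mc_read_data(files_table, localities_table)
```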

clean

if TRUE, then mc_prep_clean is called automatically while reading (default TRUE)

silent

if TRUE, then no information is printed to the console (default FALSE)

user_data_formats

custom data formats; use when your logger files are not pre-defined in myClim. A named list of the form list(key=mc_DataFormat); see mc_DataFormat (default NULL)

If a custom data format is defined, its key can be used in the data_format parameter of mc_read_files() and mc_read_data(). The custom data format must be defined first, and then it can be used for reading.

Details

The input tables can be R data.frames or csv files. When files_table and localities_table are loaded from external CSV files, they must have a header and the column separator must be a comma ",". If you only need to assign loggers to the correct localities, files_table is enough. If you wish to provide additional metadata for the localities, you also need localities_table.
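One way to produce a CSV that meets these requirements (header row, comma separator) is base R's write.csv(). The sketch below uses a small hypothetical table and a temporary file; the data_format value is assumed to be a valid key of mc_data_formats.

```r
# Hypothetical one-row table, for illustration only.
files_df <- data.frame(
  path        = "data/data_91184101_0.csv",
  locality_id = "A1E01",
  data_format = "TOMST",   # assumed key of mc_data_formats
  stringsAsFactors = FALSE
)

# write.csv() always writes a header and uses "," as the separator;
# row.names = FALSE avoids an extra unnamed first column.
files_csv <- tempfile(fileext = ".csv")
write.csv(files_df, files_csv, row.names = FALSE)

# Read the file back to confirm the layout survived the round trip.
check <- read.csv(files_csv, stringsAsFactors = FALSE)

# The path to the CSV could then be passed instead of the data.frame:
# my_data <- mc_read_data(files_csv)
```

Note that write.csv2() uses a semicolon separator and would not satisfy the comma requirement stated above.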

By default, data are cleaned with the function mc_prep_clean; see its function description. mc_prep_clean detects gaps in time-series data, duplicated records, and records in the wrong order. Importantly, mc_prep_clean also applies the step parameter if provided. The step parameter can be used either instead of automatic step detection, which can sometimes fail, or to prune microclimatic data. For example, if you have a 15-minute time series but wish to keep only one record per hour (without aggregating), you can use the step parameter. However, if a step is provided and clean = FALSE, then the step is only stored in the metadata of the myClim object; the time-series data are not cleaned and the step is not applied.
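To make the pruning example concrete, the base R sketch below keeps only the records of a 15-minute series that fall exactly on a 3600-second step. It is a conceptual illustration of "one record per hour without aggregating" only; it is not the algorithm used by mc_prep_clean, which may handle rounding and irregular timestamps differently.

```r
# Two hours of a synthetic 15-minute series (values are invented).
times  <- seq(as.POSIXct("2024-01-01 00:00:00", tz = "UTC"),
              by = "15 min", length.out = 8)
values <- c(1.0, 1.2, 1.1, 0.9, 0.8, 0.7, 0.9, 1.0)

step <- 3600  # target step in seconds, as it would be given in files_table

# Keep records whose timestamp is a whole multiple of the step;
# the other records are dropped, not averaged.
keep   <- as.numeric(times) %% step == 0
pruned <- data.frame(datetime = times[keep], value = values[keep])
```

In this sketch pruned holds the 00:00 and 01:00 records with their original values, which is the difference between pruning and aggregating to hourly means.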

Value

myClim object in Raw-format; see myClim-package

See Also

mc_DataFormat

Examples

files_csv <- system.file("extdata", "files_table.csv", package = "myClim")
localities_csv <- system.file("extdata", "localities_table.csv", package = "myClim")
tomst_data <- mc_read_data(files_csv, localities_csv)

myClim documentation built on Oct. 21, 2024, 5:07 p.m.