check_data: Check a Data Frame Prior to Analysis


Description

Original name: sEddyProc_initialize

Usage

check_data(data, site_name, vars, timestamp = "timestamp", dts = 48,
  char_cols = character(0), lat = NA, long = NA, tz = NA)

Arguments

data

A data frame with at least three months of (half-)hourly site-level data.

site_name

A string with the site identifier.

vars

An atomic vector of strings with selected column names. Tip: select only the columns that are used in processing. Fewer columns = faster processing.

timestamp

A string indicating the column name with POSIX timestamp.

dts

An integer indicating the number of daily time steps (24 or 48).

char_cols

Names of columns that should not be checked for numeric type, e.g. season column.

lat

A number indicating the site latitude in decimal degrees (-90 to +90).

long

A number indicating the site longitude in decimal degrees (-180 to +180).

tz

An integer indicating the site time zone in offset from UTC, e.g. -5 for U.S. Eastern time.

Details

The timestamp must be provided in POSIX format; see also convert_time. For required properties of the time series, see check_times. Internally, the timestamp is shifted to the middle of the measurement period (minus 15 minutes for half-hourly data, minus 30 minutes for hourly data).
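The midpoint shift described above can be sketched in base R. This is only an illustration, not the package's internal code: the helper name shift_to_midpoint is hypothetical, and the sketch assumes that each timestamp marks the end of its measurement period.

```r
# Hypothetical half-hourly timestamps, assumed to mark the END of each period.
ts_end <- as.POSIXct("2019-01-01 00:30:00", tz = "UTC") + 1800 * (0:3)

# Hypothetical helper (not part of tidyflux): shift to the period midpoint.
shift_to_midpoint <- function(x, dts = 48) {
  stopifnot(inherits(x, "POSIXct"), dts %in% c(24, 48))
  period_secs <- 86400 / dts  # 1800 s for dts = 48, 3600 s for dts = 24
  x - period_secs / 2         # minus 15 minutes or minus 30 minutes
}

ts_mid <- shift_to_midpoint(ts_end, dts = 48)
format(ts_mid[1], "%H:%M")
```

Deriving the offset from dts keeps the hourly and half-hourly cases in one expression instead of hard-coding two constants.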

All other columns may only contain numeric data. Use NA to flag gaps, that is, missing data or low-quality data that should not be used in the processing. The columns are also checked for plausibility, and a warning is issued for values outside the expected range.
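As a rough sketch of what such checks might look like in base R, the example below tests column types and warns about out-of-range values. The data, the check_range helper, and the temperature limits are all assumptions made for illustration; the limits the package actually applies are not given here.

```r
# Hypothetical site data; NA marks a gap, "season" is a non-numeric column.
df <- data.frame(
  Tair = c(12.1, NA, 13.4, 150),
  NEE = c(-2.3, -1.8, NA, -2.0),
  season = c("winter", "winter", "winter", "winter"),
  stringsAsFactors = FALSE
)
char_cols <- "season"

# Every column not listed in char_cols is expected to be numeric.
num_cols <- setdiff(names(df), char_cols)
is_num <- vapply(df[num_cols], is.numeric, logical(1))
stopifnot(all(is_num))

# Assumed plausibility limits (illustrative values only).
limits <- list(Tair = c(-50, 60))

# Hypothetical helper: warn when values fall outside the given limits.
check_range <- function(data, limits) {
  for (v in names(limits)) {
    x <- data[[v]]
    n_bad <- sum(x < limits[[v]][1] | x > limits[[v]][2], na.rm = TRUE)
    if (n_bad > 0) {
      warning(sprintf("%s: %d value(s) outside plausible range", v, n_bad),
              call. = FALSE)
    }
  }
  invisible(data)
}

# Capture the warning text so it can be inspected.
res <- tryCatch(check_range(df, limits), warning = function(w) conditionMessage(w))
res
```

NA values are skipped by na.rm = TRUE, so gaps never count as implausible values.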

There are several attributes set within the data frame:

* site_name: String for the site ID.
* total_records: Number of data rows.
* daily_time_steps: Number of daily time steps (24 or 48).
* start_year
* start_date
* end_year
* end_date
* latitude
* longitude
* time_zone
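Attributes like these can be read back with base R's attr(). The sketch below attaches a few of them by hand to a made-up data frame, as a stand-in for what check_data is described as doing; the site ID and the data are hypothetical.

```r
# Hypothetical two days of half-hourly data.
df <- data.frame(
  timestamp = as.POSIXct("2019-01-01 00:30:00", tz = "UTC") + 1800 * (0:95),
  NEE = rnorm(96)
)

# Stand-in for the metadata described above (set manually for illustration).
attr(df, "site_name") <- "XX-Site"
attr(df, "total_records") <- nrow(df)
attr(df, "daily_time_steps") <- 48

# Read a single attribute, or list all attribute names.
attr(df, "total_records")
names(attributes(df))
```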


grahamstewart12/tidyflux documentation built on June 4, 2019, 7:44 a.m.