aggregate_df: Aggregating timeseries to various resolutions


View source: R/aggregate_df.R

Description

This function aggregates a data.frame that contains a column of class 'Date' or 'POSIXct' to a different time resolution by applying one or more functions to the value columns.

Usage

aggregate_df(
  df,
  value.cols = NULL,
  round = "date",
  fn = "mean",
  drop.duplicates = T,
  group.thresh = 0.8,
  datecol = NULL,
  na.action = "keep",
  max.gap = 3,
  timestep = NULL
)

Arguments

df

data.frame, A data.frame that should be aggregated

value.cols

string, one or more column names used to calculate the new values. If not supplied, all numeric columns are used.

round

string, one of 'hour', 'date', 'month' or 'season' specifying the time resolution of the output
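
The following base R sketch illustrates what reducing timestamps to each of these resolutions could look like. The helper name round_time is hypothetical, 'hour' is assumed to mean the hour of the day, and the season mapping (meteorological seasons) is an assumption; the package may implement any of these differently.

```r
# Hypothetical helper: reduce POSIXct timestamps to a coarser resolution.
round_time <- function(x, round = c("hour", "date", "month", "season")) {
  round <- match.arg(round)
  month <- as.integer(format(x, "%m"))
  switch(round,
    hour   = as.integer(format(x, "%H")),      # hour of the day (assumed)
    date   = as.Date(format(x, "%Y-%m-%d")),   # calendar date
    month  = month,                            # month number 1..12
    # Assumed meteorological seasons (Dec-Feb = winter, and so on).
    season = c("winter", "winter", "spring", "spring", "spring", "summer",
               "summer", "summer", "autumn", "autumn", "autumn", "winter")[month]
  )
}

x <- as.POSIXct(c("2020-01-15 10:30:00", "2020-07-04 23:05:00"), tz = "UTC")
round_time(x, "hour")
round_time(x, "date")
round_time(x, "season")
```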

fn

string, name of the function that is used to aggregate the value.cols. If you want to supply multiple functions, pass them as character vector.
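
As an illustration of passing several function names, the sketch below applies each name in a character vector per date using base R only. The toy data, the column names and the "temp_<fn>" naming of the output columns are assumptions made for this example, not the documented output format of aggregate_df.

```r
# Toy data: two observations on each of two days (illustrative only).
df <- data.frame(
  date = as.Date(c("2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02")),
  temp = c(1, 3, 10, 20)
)
fn <- c("mean", "max")

# Look up each function by name, aggregate per date, and label the result.
res <- lapply(fn, function(f) {
  out <- aggregate(temp ~ date, data = df, FUN = match.fun(f))
  names(out)[2] <- paste0("temp_", f)
  out
})

# Join the per-function results into one data.frame keyed by date.
res <- Reduce(function(a, b) merge(a, b, by = "date"), res)
res
```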

drop.duplicates

logical, Should duplicated entries in the raw data.frame be dropped? The group.cols and the rounded Datetime column will be used to check for duplicates

group.thresh

double, minimum size of each group relative to the largest group, given as a fraction (the default 0.8 corresponds to 80%). Groups smaller than this threshold are dropped.
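
The threshold rule can be sketched in base R as follows. The data are made up, and whether the package compares with >= or > at exactly the threshold is an assumption.

```r
# Three daily groups with 10, 9 and 5 observations (illustrative data).
df <- data.frame(
  date  = as.Date(rep(c("2020-01-01", "2020-01-02", "2020-01-03"),
                      times = c(10, 9, 5))),
  value = seq_len(24)
)
group.thresh <- 0.8

sizes <- table(df$date)                                    # observations per group
keep  <- names(sizes)[sizes >= group.thresh * max(sizes)]  # needs >= 8 rows here
df.kept <- df[as.character(df$date) %in% keep, ]

unique(df.kept$date)  # the 5-row group is dropped
```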

na.action

string, one of 'keep', 'ignore' or 'fill'. For 'fill', gaps of up to max.gap consecutive missing values are filled by linear interpolation using zoo::na.approx().

max.gap

integer, maximum number of consecutive NAs to fill with zoo::na.approx().
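
To show the effect of a gap limit without requiring zoo, here is a base R stand-in for gap-limited linear interpolation. The helper fill_gaps is hypothetical; zoo::na.approx() offers a comparable maxgap argument, and it is an assumption that max.gap is passed to it.

```r
# Hypothetical helper: linearly interpolate NAs, but leave runs of more
# than max.gap consecutive NAs untouched.
fill_gaps <- function(x, max.gap = 3) {
  idx <- seq_along(x)
  ok  <- !is.na(x)
  # Interpolate every interior gap first (leading/trailing NAs stay NA).
  filled <- approx(idx[ok], x[ok], xout = idx)$y
  # Find runs of NAs that are longer than max.gap and reset them to NA.
  runs     <- rle(is.na(x))
  too.long <- rep(runs$values & runs$lengths > max.gap, runs$lengths)
  filled[too.long] <- NA
  filled
}

x <- c(1, NA, NA, 4, NA, NA, NA, NA, 9)
fill_gaps(x, max.gap = 3)
# The gap of two NAs is filled (2, 3); the gap of four NAs is kept.
```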

group.cols

string, column names used to group df. This can be used to aggregate the data.frame on multiple levels: for example, specifying the date as group.cols and 'hour' as the round argument produces a data.frame with hourly values for every date. If not supplied, no grouping is performed.
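
The two-level case described above (date as grouping column, hour as resolution) can be sketched in base R like this. The column names and the half-hourly toy series are assumptions for illustration only.

```r
# 96 half-hourly timestamps = 2 full days (illustrative data).
ts <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC") +
  seq(0, by = 1800, length.out = 96)
df <- data.frame(datetime = ts, temp = seq_along(ts))

df$date <- as.Date(df$datetime)                    # grouping column
df$hour <- as.integer(format(df$datetime, "%H"))   # rounded time

# One mean value per combination of date and hour.
hourly <- aggregate(temp ~ date + hour, data = df, FUN = mean)
hourly <- hourly[order(hourly$date, hourly$hour), ]
head(hourly)
```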

datecol.name

string, name of the column used to extract the date, hour or month for aggregating the data.frame. If not supplied, the function tries to detect this column automatically, looking first for a column of class 'POSIXct' and then for a column of class 'Date'.
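
The detection order described above can be sketched as follows. The helper detect_datecol is hypothetical, and taking the first match when several columns qualify is an assumption.

```r
# Hypothetical helper: prefer a POSIXct column, fall back to a Date column.
detect_datecol <- function(df) {
  is.posix <- vapply(df, inherits, logical(1), what = "POSIXct")
  is.date  <- vapply(df, inherits, logical(1), what = "Date")
  if (any(is.posix)) return(names(df)[which(is.posix)[1]])
  if (any(is.date))  return(names(df)[which(is.date)[1]])
  stop("No column of class 'POSIXct' or 'Date' found")
}

df <- data.frame(
  id    = 1:2,
  day   = as.Date(c("2020-01-01", "2020-01-02")),
  stamp = as.POSIXct(c("2020-01-01 06:00:00", "2020-01-02 06:00:00"), tz = "UTC")
)
detect_datecol(df)                  # the POSIXct column is preferred
detect_datecol(df[c("id", "day")])  # falls back to the Date column
```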

Examples

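# df.interp and dem.st must already exist in the workspace
fit_linear_model_groups(df = df.interp,
  frml = huglin ~ elevation,
  predictor.raster = dem.st,
  file.name = "data/huglin.tif",  # file paths must be quoted strings
  set.zero = TRUE)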

sitscholl/rebecka_package documentation built on Aug. 25, 2020, 4:20 a.m.