aggregate_statistics: Calculate a series of statistics on aggregated data
In pegoraro/qchlorophyll: qchlorophyll

View source: R/statistics-calculation-functions.R

This function groups the observations and calculates a series of statistics.

aggregate_statistics(data, variable = "CHL1_mean", stat_funs = list(avg =
  "mean(., na.rm = TRUE)", NAs_count = "sum(is.na(.))", n_count = "n()"),
  groups = list("id_pixel", "id_date"), id = "id_pixel",
  unique_id = list("lat", "lon", "id_pixel"))

`data`	a dplyr data frame
`variable`	variable to use in the calculation of the statistics
`stat_funs`	a list of functions used to calculate the statistics. Each function must be expressed either as a formula or as a character string. For instance, to calculate the mean, set stat_funs = list(avg = "mean"). To calculate several statistics, such as both the mean and the standard deviation, set stat_funs = list(avg = "mean", sd = "sd"). Note that: 1. The name of each function in the list becomes the name of the corresponding statistic column in the output data frame. 2. To pass arguments such as na.rm = TRUE to a function, write the call out in full, for example list(avg = "mean(., na.rm = TRUE)"). You can pass any arguments the functions in stat_funs need, provided you add the dot as the first argument. 3. Each function must accept a vector as input and return a single number. 4. By default, the number of missing values per group (NAs_count = "sum(is.na(.))") and the number of observations in each group (n_count = "n()") are calculated. If you need to filter out missing data with the function filter_out_na, note that it expects the data to contain a variable named NAs_count holding the information on missing data.
`groups`	variables to group by (i.e. to aggregate by). A list.
`id`	The main unique identifier (id_pixel for instance). A character.
`unique_id`	list of all the unique identifiers in the data that you would like to keep (by default: list("lat", "lon", "id_pixel")). A list.
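
To make the default statistics concrete, the following base R sketch computes avg, NAs_count and n_count for each combination of id_pixel and id_date. It is not the package's implementation and does not call aggregate_statistics. The toy data frame and the names toy, by_group and stats are assumptions made purely for illustration.

```r
# Toy data (made-up values): repeated observations for each pixel/date pair.
toy <- data.frame(
  id_pixel  = c(1, 1, 1, 2, 2, 2),
  id_date   = c(1, 1, 2, 1, 1, 2),
  CHL1_mean = c(0.2, 0.4, 0.3, NA, 0.6, 0.5)
)

# Split the variable by the two grouping columns; group names look like
# "<id_pixel>.<id_date>". drop = TRUE discards empty combinations.
by_group <- split(toy$CHL1_mean, list(toy$id_pixel, toy$id_date), drop = TRUE)

# Base R equivalents of the three default entries of stat_funs.
avg       <- vapply(by_group, function(x) mean(x, na.rm = TRUE), numeric(1))
NAs_count <- vapply(by_group, function(x) sum(is.na(x)), integer(1))
n_count   <- vapply(by_group, length, integer(1))

# One row per group, one column per statistic, named after the list entries.
stats <- data.frame(
  group     = names(by_group),
  avg       = avg,
  NAs_count = NAs_count,
  n_count   = n_count,
  row.names = NULL
)
print(stats)
```

Each statistic takes a vector and returns a single number, which is the requirement placed on the functions in stat_funs. The n_count column counts every row in the group, including those where the variable is missing, while avg ignores missing values because of na.rm = TRUE.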