summarise_by_period: Summaries by period


View source: R/metrics.R

Description

This function collapses the TIMESTAMP to the desired period (day, month...) by assigning the same value to all timestamps within that period. The modified TIMESTAMP is then used to group and summarise the data.
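
The collapsing idea can be illustrated with a minimal base R sketch. This is not the package's internal implementation (.collapse_timestamp); the toy data frame, the sapflow column and the use of trunc() and aggregate() are assumptions made only for illustration:

```r
# Toy data (hypothetical): four readings spanning two days, in UTC
toy <- data.frame(
  TIMESTAMP = as.POSIXct(
    c("2020-01-01 06:00:00", "2020-01-01 18:00:00",
      "2020-01-02 06:00:00", "2020-01-02 18:00:00"),
    tz = "UTC"
  ),
  sapflow = c(1, 3, 5, 7)
)

# Collapse: every timestamp within a day is set to that day's midnight
toy$TIMESTAMP <- as.POSIXct(trunc(toy$TIMESTAMP, units = "days"))

# Group by the collapsed timestamp and summarise
daily <- aggregate(sapflow ~ TIMESTAMP, data = toy, FUN = mean)
```

After collapsing, the two readings of each day share one TIMESTAMP value, so grouping by it yields one summary row per day.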

Usage

summarise_by_period(data, period, .funs, ...)

Arguments

data

sapflow or environmental data as obtained by get_sapf_data and get_env_data. Must have a column named TIMESTAMP

period

period to collapse by. See sfn_metrics for details.

.funs

Functions used to summarise the data. See Details.

...

Optional arguments. See Details.

Details

Internally, this function uses .collapse_timestamp and summarise_all. Arguments controlling these functions can be passed via '...'; the arguments for each function are spliced and applied where needed. Be advised that any argument passed on to summarise_all is applied to all the summarising functions used, so the call will fail if any of those functions does not accept that argument. For complex function-argument relationships, specify each summary function call within the .funs argument, as explained in summarise_all:

# This will fail because the na.rm argument is also passed to the n function,
# which does not accept any arguments:
summarise_by_period(
  data = get_sapf_data(ARG_TRE),
  period = '7 days',
  .funs = list(mean, sd, n()),
  na.rm = TRUE
)

# To solve this, it is better to specify each call within the .funs argument:
summarise_by_period(
  data = get_sapf_data(ARG_TRE),
  period = '7 days',
  .funs = list(~ mean(., na.rm = TRUE), ~ sd(., na.rm = TRUE), ~ n())
)
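
The failure mode described above can be reproduced without the package. The following is a minimal base R sketch; apply_all is a hypothetical helper (not part of sapfluxnetr or dplyr) that, like the shared '...' mechanism, calls every function with the same extra arguments. Here length() stands in for a function that does not accept na.rm:

```r
x <- c(1, 2, NA, 4)

# Hypothetical helper: call each function with the same extra arguments
apply_all <- function(x, funs, ...) {
  lapply(funs, function(f) f(x, ...))
}

# Fails: na.rm is also passed to length(), which does not accept it
res_shared <- tryCatch(
  apply_all(x, list(mean, sd, length), na.rm = TRUE),
  error = function(e) "error"
)

# Works: each function call carries only its own arguments
res_own <- apply_all(
  x,
  list(
    function(v) mean(v, na.rm = TRUE),
    function(v) sd(v, na.rm = TRUE),
    function(v) length(v)
  )
)
```

Wrapping each call, as the formula notation in .funs does, keeps the arguments of one summary function from leaking into the others.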

Value

A 'tbl_df' object with the metrics results. The column names indicate the original variable (tree or environmental variable) and the metric calculated (e.g. 'vpd_mean'), separated by an underscore.

TIMESTAMP_coll

Prior to the collapsing step, a temporary variable called TIMESTAMP_coll is created to keep the real timestamp at which an event happens, for example for use with the min_time function. If your custom summarising function needs the time at which some event happens, use TIMESTAMP_coll instead of TIMESTAMP:

    min_time <- function(x, time) {
      time[which.min(x)]
    }

    summarise_by_period(
      data = get_sapf_data(ARG_TRE),
      period = '1 day',
      .funs = list(~ min_time(., time = TIMESTAMP_coll)) # Not TIMESTAMP
    )
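
Why the uncollapsed copy is needed can be seen in a small base R sketch. The vectors below are toy data, and collapsing with trunc() is an assumption used only for illustration, not the package's own collapsing code:

```r
min_time <- function(x, time) {
  time[which.min(x)]
}

# Toy data (hypothetical): three readings within one day, in UTC
TIMESTAMP_coll <- as.POSIXct(
  c("2020-01-01 03:00:00", "2020-01-01 09:00:00", "2020-01-01 15:00:00"),
  tz = "UTC"
)
values <- c(5, 1, 3)

# Collapsed timestamps: all readings share the day's midnight
TIMESTAMP <- as.POSIXct(trunc(TIMESTAMP_coll, units = "days"))

# Using the collapsed column only recovers the start of the period
t_collapsed <- min_time(values, time = TIMESTAMP)

# Using the uncollapsed copy recovers the real time of the minimum
t_real <- min_time(values, time = TIMESTAMP_coll)
```

In this sketch t_collapsed is midnight of the day, whereas t_real is the 09:00 reading at which the minimum actually occurs.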
  

Examples

library(dplyr)

# data
data('ARG_TRE', package = 'sapfluxnetr')

# simple summary
summarise_by_period(
  data = get_sapf_data(ARG_TRE),
  period = '7 days',
  .funs = list(~ mean(., na.rm = TRUE), ~ sd(., na.rm = TRUE), ~ n())
)

sapfluxnetr documentation built on Aug. 28, 2020, 1:13 a.m.