View source: R/statistical_simulation.r

simbase_covar    R Documentation

Calculate reference data for simulating values based on a covariance matrix approach

Description

Given the covariance matrix and the means of a set of variables, we can simulate values that reproduce not only the distributions of the individual variables but also the correlations between them. This function calculates the basic values required for such a simulation and returns them packed into an object of class simbase_covar.
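
As a rough illustration of the underlying idea (not necessarily how the package implements it), the following base-R sketch estimates means and a covariance matrix from a small made-up dataset and then draws correlated normal values using a Cholesky factor. The data and the variable names x and y are assumptions made for demonstration only.

```r
set.seed(1)

# made-up reference data with two correlated variables (illustrative only)
x <- rnorm(200, mean = 10, sd = 2)
ref <- data.frame(x = x, y = 3 + 0.5 * x + rnorm(200, sd = 0.5))

# the "basic values" of the approach: means and covariance matrix
mu <- colMeans(ref)
sigma <- cov(ref)

# chol() returns an upper triangular R with t(R) %*% R equal to sigma,
# so rows of z %*% R have covariance sigma when z is standard normal
n <- 1000
R <- chol(sigma)
z <- matrix(rnorm(n * length(mu)), nrow = n)
sim <- sweep(z %*% R, 2, mu, `+`)   # add the means column by column
colnames(sim) <- names(mu)

# the simulated correlation should be close to the reference correlation
cor(ref)["x", "y"]
cor(sim)["x", "y"]
```

With more variables the same steps apply unchanged; only the dimension of the covariance matrix grows.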

Usage

simbase_covar(
  data,
  variables = NULL,
  transforms = list(),
  label = simbase_labeler,
  ...
)

Arguments

data

The dataset for the calculation of the reference data for simulation; for grouped datasets (see group_by), the reference data is calculated for each group separately (see also simbase_list).

variables

Character vector containing the names in data which should be included in the simulation. If missing, all numeric variables in data are used.

transforms

A named list of objects of class trans (see function trans_new in package scales); the name of each list entry must correspond to a variable name in variables.

label

Either a string describing the data and the simulation approach, or a labelling function which returns a label string and takes as input the data, a string giving the class of the simbase object (here "simbase_covar") and the transforms list.

...

Arguments to be passed on to simbase_list (if it is called).
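
To make the label argument more concrete, here is a minimal sketch of a custom labelling function. The function name my_labeler and its argument names are hypothetical; only the order of the inputs (data, class string, transforms list) follows the description of label above.

```r
# hypothetical labelling function; the argument names are assumptions,
# only their order follows the description of the label argument
my_labeler <- function(data, simbase_class, transforms) {
  transformed <- if (length(transforms) > 0) {
    paste(names(transforms), collapse = ", ")
  } else {
    "none"
  }
  sprintf("%s, n = %d, transformed: %s", simbase_class, nrow(data), transformed)
}

# calling it directly shows the label string it would produce
my_labeler(data.frame(f = c(30, 40, 50)), "simbase_covar", list(f = "placeholder"))
```

Such a function could then be passed as label = my_labeler. Whether the package calls it with positional or named arguments is not stated here, so the argument names may need adjusting.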

Details

If some of the variables are not normally distributed, a transform may improve the simulation. The transforms are passed to the function as a named list, where the name of each list entry must correspond to the name of the variable in the data which is to be transformed.

Predefined transforms can be found in the package scales, where they are used for axis transformations as a preparation for plotting. The package scales also contains a function trans_new which can be used to define new transforms.
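
A user-defined transform might be set up as in the following sketch, which assumes that the package scales is installed. The name my_sqrt_trans and the choice of a square-root transform are purely illustrative.

```r
# hypothetical square-root transform built with scales::trans_new
my_sqrt_trans <- scales::trans_new(
  name = "my_sqrt",
  transform = function(x) sqrt(x),
  inverse = function(x) x^2
)

# the transform object carries both directions
x <- c(1, 4, 9)
my_sqrt_trans$transform(x)
my_sqrt_trans$inverse(my_sqrt_trans$transform(x))

# named list in the form expected by the transforms argument;
# the name "f" must match a variable name in the data
transforms <- list(f = my_sqrt_trans)
```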

In the context of destructively measured sawn timber properties, the type of destructive test applied is of interest. If the dataset data contains a variable loadtype which has the same value throughout the dataset, either "t" (i.e. all sawn timber has been tested in tension) or "be" (i.e. all sawn timber has been tested in bending, edgewise), then the returned object also has a field loadtype with that value.
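
The consistency condition on loadtype can be pictured with a small base-R helper. This is a sketch of the rule as described above, not the package's own code; the function name common_loadtype is hypothetical.

```r
# hypothetical helper: returns the common loadtype, or NULL if there is none
common_loadtype <- function(data) {
  if (!"loadtype" %in% names(data)) return(NULL)
  lt <- unique(as.character(data$loadtype))
  # exactly one distinct value, and it must be "t" or "be"
  if (length(lt) == 1 && lt %in% c("t", "be")) lt else NULL
}

# consistent load type: the value is returned
common_loadtype(data.frame(f = c(30, 40), loadtype = c("t", "t")))

# mixed load types: no common value
common_loadtype(data.frame(f = c(30, 40), loadtype = c("t", "be")))
```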

One can also calculate a simbase under the assumption that the correlations are different for different subgroups of the data. This is done by grouping the dataset data prior to passing it to the function, using group_by. In this case, several objects of class simbase_covar are created and joined together in a tibble – see also simbase_list.
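
The idea of separate reference data per subgroup can be sketched in base R as follows. The package itself returns a tibble of simbase_covar objects; the plain list below and the made-up data (country, f, E) are assumptions used only to show that each group gets its own means and covariance matrix.

```r
set.seed(2)

# made-up data with a grouping variable (names and values are illustrative)
d <- data.frame(
  country = rep(c("at", "de"), each = 50),
  f = rnorm(100, mean = 40, sd = 8),
  E = rnorm(100, mean = 11000, sd = 2000)
)

# one set of means and one covariance matrix per group
vars <- c("f", "E")
ref_by_group <- lapply(split(d[vars], d$country), function(g) {
  list(mean = colMeans(g), covar = cov(g))
})

names(ref_by_group)
ref_by_group$at$covar
```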

Value

An S3 object of class simbase_list if data is grouped, and an object of class simbase_covar otherwise.

Examples

# load the package and obtain a dataset for demonstration
library(WoodSimulatR)
dataset <- simulate_dataset()

# calculate a simbase without transforms
simbase_covar(dataset, c('f', 'E', 'rho', 'E_dyn'))

# calculate a simbase with log-transformed f
simbase_covar(dataset, c('f', 'E', 'rho', 'E_dyn'), list(f = scales::log_trans()))

# if we group the dataset, we get a simbase_list object
simbase_covar(dplyr::group_by(dataset, country), c('f', 'E', 'rho', 'E_dyn'))


WoodSimulatR documentation built on June 20, 2022, 9:05 a.m.