get_subsample_definitions: Retrieve descriptive data for samples from literature

View source: R/subsample_data.r

get_subsample_definitions    R Documentation

Retrieve descriptive data for samples from literature

Description

The WoodSimulatR package stores means and standard deviations of grade-determining properties (GDPs) for a number of Norway spruce (Picea abies) samples from the literature, for use in simulate_dataset. The samples are indexed by a two-letter country code (with a numeric suffix where disambiguation is required).

Usage

get_subsample_definitions(country = NULL, loadtype = "t", species = "PCAB")

Arguments

country

One of: the number of desired samples; a named numeric vector of relative subsample sizes, whose names can be country abbreviations; or a character vector of country abbreviations. If NULL (the default), the full dataset for the given loadtype and species is returned.

loadtype

Can be either "be" for "bending edgewise" or "t" for "tension".

species

A species code according to EN 13556:2003. Currently, only 'PCAB' (Picea abies = Norway spruce) is supported.

Details

The descriptive data can also be accessed directly (gdp_data). The present function is meant to prepare the data as input to the subsets argument of simulate_dataset. It allows picking multiple samples from the same country (disambiguating them by creating appropriately named entries in the column subsample) and creating random sample data for sample names not found in this dataset. Random values are uniformly distributed within the range of values given in the full dataset gdp_data for the respective loadtype and species.
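
The hand-off to simulate_dataset can be sketched as follows. This is a minimal illustration, not taken from the package documentation: it assumes the WoodSimulatR package is installed and that simulate_dataset accepts an argument n for the total number of simulated pieces (the value 1000 is arbitrary). The sample names de and at are used for illustration; for a name that is not stored, a warning is issued and random data is used instead.

```r
# Run only if the package is available (assumption: it is installed from CRAN)
if (requireNamespace("WoodSimulatR", quietly = TRUE)) {
  library(WoodSimulatR)

  # Two tension samples with relative sizes 1 : 2
  subsets <- get_subsample_definitions(country = c(de = 1, at = 2), loadtype = "t")

  # Pass the prepared table to the subsets argument;
  # `n` is assumed to be the total number of simulated pieces
  sim <- simulate_dataset(n = 1000, subsets = subsets)
  print(head(sim))
}
```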

The dataset gdp_data contains a column share which gives the number of pieces in the original sample. Unless relative subsample sizes are explicitly asked for by providing a named numeric vector for the argument country, the present function always resets share to 1, prompting simulate_dataset to create (approximately) equal-sized subsamples.
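
How relative sizes translate into subsample sizes can be illustrated with plain arithmetic. The sketch below is base R only and does not reproduce the package's internal rounding, which is not described here; the total n = 1200 is a made-up value.

```r
# Relative sizes as they could be passed in `country`:
# four Romanian, two Ukrainian and one Slovak sample
share <- c(ro = 1, ro = 1, ro = 1, ro = 1, ua = 2, ua = 2, sk = 4)

# Hypothetical total number of pieces to simulate
n <- 1200

# Idealised number of pieces per subsample, proportional to its share
pieces <- n * share / sum(share)

# Sum per country: each country ends up with the same total
per_country <- tapply(pieces, names(share), sum)
print(per_country)
```

With these shares every country contributes one third of the total, which is why the individual Romanian samples get a smaller share than the single Slovak sample.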

The GDPs depend on the type of destructive testing done (loadtype) – therefore, giving the proper loadtype is required for realistic values.

If country is NULL (or omitted), the full dataset gdp_data for the respective loadtype (and species) is returned.

For sample names not contained in the internal list, a warning is issued and random sample data is returned (uniformly distributed within the range of values given in the full table for the respective loadtype and species).
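
The idea of drawing random sample data within the stored range can be pictured with a small base R sketch. It is not the package's implementation: the table known, its column names and all numbers are invented for illustration.

```r
# Hypothetical stand-in for the stored descriptive data (invented values)
known <- data.frame(
  country  = c("aa", "bb", "cc"),
  f_mean   = c(28.5, 31.0, 34.2),
  rho_mean = c(430, 445, 460)
)

# Draw n values uniformly between the smallest and largest stored value
draw_within_range <- function(x, n) {
  runif(n, min = min(x), max = max(x))
}

set.seed(1)  # make the illustration reproducible
unknown <- c("xx", "yy")  # names that are not in `known`

random_rows <- data.frame(
  country  = unknown,
  f_mean   = draw_within_range(known$f_mean, length(unknown)),
  rho_mean = draw_within_range(known$rho_mean, length(unknown))
)
print(random_rows)
```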

If country is just a number (and not a named vector), random sample data is returned as well; the different "countries" are then named "C1", "C2" and so on.

Value

A data frame with country and subsample names, relative subsample sizes and some meta-information such as project and literature references, as well as means and standard deviations of strength, static modulus of elasticity and density.

Notes

The GDP values collected in gdp_data were selected from publications which aimed at representative sampling within the respective countries. Even so, these values must be used with care because of the high natural variability of timber properties.

Examples


# get all subsample data for loadtype tension (the default), or bending
get_subsample_definitions()
get_subsample_definitions(loadtype='be')

# get six random samples, explicitly state loadtype tension
get_subsample_definitions(country=6, loadtype='t')

# get subsample data for the German tension sample in different ways
get_subsample_definitions(country='de', loadtype='t')
get_subsample_definitions(country=c(de=1), loadtype='t')
get_subsample_definitions(country=c(de=6), loadtype='t')

# bending samples from Sweden (both samples), Poland, and France, equally
# weighted (loadtype must be given explicitly, as the default is tension)
get_subsample_definitions(c('se', 'se_1', 'pl', 'fr'), loadtype='be')
get_subsample_definitions(c(se=1, se_1=1, pl=1, fr=1), loadtype='be')
get_subsample_definitions(c(se=5, se_1=5, pl=5, fr=5), loadtype='be')

# four tension samples from Romania, two from Ukraine and one from Slovakia,
# weighted so that each country contributes equally
get_subsample_definitions(c(ro=1, ro=1, ro=1, ro=1, ua=2, ua=2, sk=4), loadtype='t')

# nonexistent subsample names get replaced by random values (which are based
# on the range of stored values for the respective loadtype)
get_subsample_definitions(c('xx', 'yy', 'zz'))
get_subsample_definitions(c('xx', 'yy', 'zz'), loadtype='t')

# subsample names are case-sensitive!
get_subsample_definitions(c('at', 'aT', 'At', 'AT'), loadtype='t')


WoodSimulatR documentation built on June 20, 2022, 9:05 a.m.