createData: Simulate test data

View source: R/createData.R

Description

This function creates a synthetic dataset that can include various problems, such as overdispersion, zero-inflation, or autocorrelation, for testing purposes.

Usage

createData(sampleSize = 100, intercept = 0, fixedEffects = 1,
  quadraticFixedEffects = NULL, numGroups = 10, randomEffectVariance = 1,
  overdispersion = 0, family = poisson(), scale = 1, cor = 0,
  roundPoissonVariance = NULL, pZeroInflation = 0, binomialTrials = 1,
  temporalAutocorrelation = 0, spatialAutocorrelation = 0,
  factorResponse = F, replicates = 1, hasNA = F)

Arguments

sampleSize

sample size of the dataset

intercept

intercept (linear scale)

fixedEffects

vector of fixed effects (linear scale)

quadraticFixedEffects

vector of quadratic fixed effects (linear scale)

numGroups

number of groups for the random effect

randomEffectVariance

variance of the random effect (intercept)

overdispersion

if this is a numeric value, it will be used as the sd of a random normal variate that is added to the linear predictor. Alternatively, a random function can be provided that takes as input the linear predictor.

family

distribution family of the response, given as a family object (e.g. poisson(), binomial(), gaussian())

scale

scale if the distribution has a scale (e.g. sd for the Gaussian)

cor

correlation between predictors

roundPoissonVariance

if set, this adds uniform noise to the Poisson response. The aim is to create heteroscedasticity

pZeroInflation

probability that any given data point is set to zero

binomialTrials

number of trials for the binomial distribution. Only used if family is binomial

temporalAutocorrelation

strength of temporal autocorrelation

spatialAutocorrelation

strength of spatial autocorrelation

factorResponse

should the response be transformed to a factor (intended to be used for 0/1 data)

replicates

number of datasets to create

hasNA

should an NA be added to the environmental predictor (for test purposes)
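
The overdispersion and pZeroInflation arguments are easier to picture with a small simulation. The following is a minimal base-R sketch of the two mechanisms as the argument descriptions state them (normal noise added to the linear predictor, and data points set to zero with a fixed probability). It is not DHARMa's source code; the log link, the sample size and all variable names are assumptions chosen for illustration.

```r
# Base-R sketch of the two mechanisms; this is NOT DHARMa's source code.
set.seed(123)
n <- 2000
x <- runif(n, min = -1, max = 1)   # one environmental predictor
eta <- 2 + 1 * x                   # linear predictor: intercept 2, fixed effect 1

# No extra noise: plain Poisson response with a log link (assumed here)
y_plain <- rpois(n, lambda = exp(eta))

# Overdispersion: normal noise with sd = 0.5 added to the linear predictor
y_od <- rpois(n, lambda = exp(eta + rnorm(n, sd = 0.5)))

# Zero-inflation: each observation is set to zero with probability 0.6
is_zero <- rbinom(n, size = 1, prob = 0.6) == 1
y_zi <- ifelse(is_zero, 0L, y_plain)

# Crude summaries: variance-to-mean ratio and fraction of zeros
dispersion <- function(y) var(y) / mean(y)
c(plain = dispersion(y_plain), overdispersed = dispersion(y_od))
c(plain = mean(y_plain == 0), zeroInflated = mean(y_zi == 0))
```

The noisy response should show a clearly larger variance-to-mean ratio than the plain one, and the zero-inflated response should have a fraction of zeros close to the chosen probability.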

Examples

testData = createData(sampleSize = 500, intercept = 2, fixedEffects = c(1), 
  overdispersion = 0, family = poisson(), quadraticFixedEffects = c(-3), 
  randomEffectVariance = 0)

par(mfrow = c(1,2))
plot(testData$Environment1, testData$observedResponse)
hist(testData$observedResponse)

# with zero-inflation

testData = createData(sampleSize = 500, intercept = 2, fixedEffects = c(1), 
  overdispersion = 0, family = poisson(), quadraticFixedEffects = c(-3), 
  randomEffectVariance = 0, pZeroInflation = 0.6)

par(mfrow = c(1,2))
plot(testData$Environment1, testData$observedResponse)
hist(testData$observedResponse)

# binomial with multiple trials

testData = createData(sampleSize = 40, intercept = 2, fixedEffects = c(1), 
                      overdispersion = 0, family = binomial(), quadraticFixedEffects = c(-3), 
                      randomEffectVariance = 0, binomialTrials = 20)

# proportion of successes = successes / (successes + failures)
plot(observedResponse1 / (observedResponse1 + observedResponse0) ~ Environment1, 
     data = testData, ylab = "Proportion 1")


# spatial / temporal correlation

testData = createData(sampleSize = 100, family = poisson(), spatialAutocorrelation = 3, 
                      temporalAutocorrelation = 3)

plot(log(observedResponse) ~ time, data = testData)
plot(log(observedResponse) ~ x, data = testData)
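
For the binomial example with multiple trials, it helps to distinguish a proportion from an odds ratio. The sketch below uses base R only and assumes that observedResponse1 holds the number of successes and observedResponse0 the number of failures; the names successes, failures and trials are hypothetical stand-ins, not columns returned by createData.

```r
# Base-R sketch; variable names are illustrative, not DHARMa output columns.
set.seed(42)
n <- 40
trials <- 20                                # plays the role of binomialTrials
x <- runif(n, min = -1, max = 1)
p <- plogis(2 + 1 * x - 3 * x^2)            # inverse logit of the linear predictor

successes <- rbinom(n, size = trials, prob = p)
failures <- trials - successes

proportion <- successes / (successes + failures)  # always within [0, 1]
odds <- successes / failures                      # Inf whenever failures == 0

range(proportion)
```

Dividing successes by failures gives odds rather than a proportion, and it is undefined or infinite whenever an observation has no failures, so the proportion is usually the safer quantity to plot.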

Example output

This is DHARMa 0.3.3.0. For overview type '?DHARMa'. For recent changes, type news(package = 'DHARMa')
Note: Syntax of plotResiduals has changed in 0.3.0, see ?plotResiduals for details

DHARMa documentation built on Sept. 28, 2021, 5:10 p.m.