simulatey: Simulate the Study Variable

simulatey R Documentation

Description

Simulate values for the study variable based on the auxiliary variable x and an assumed superpopulation model.

Usage

simulatey(x, f, g, dist = "normal", rho = NULL, Sigma = NULL, ...)

Arguments

x

a numeric vector giving the values of the auxiliary variable.

f

the name of the function defining the desired trend (see ‘Details’).

g

the name of the function defining the desired spread (see ‘Details’).

dist

the desired distribution of the study variable conditioned on the auxiliary variable. Either 'normal' or 'gamma' (see ‘Details’).

rho

a number giving the absolute value of the desired correlation between x and the vector to be simulated.

Sigma

a nonnegative number giving the scale of the spread term in the superpopulation model. Ignored if rho is given (see ‘Details’).

...

other arguments passed to f and g (see ‘Details’).

Details

The values of the study variable y are simulated using a superpopulation model defined as:

Y_{k}=f(x_{k})+\epsilon_{k}

with E(\epsilon_{k}) = 0, V(\epsilon_{k}) = \sigma^{2}g^{2}(x_{k}) and Cov(\epsilon_{k},\epsilon_{l}) = 0 if k \ne l, where \sigma is the scale given by Sigma. In addition, Y_{k}|f(x_{k}) is distributed according to dist.
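As a rough illustration of the model above, the following sketch draws y by hand in the normal case. The helper simulate_normal_sketch and the value sigma = 50 are hypothetical and chosen only for illustration; this is not the package's implementation, which may differ in its details.

```r
# Hypothetical helper, not part of optimStrat: draws
# Y_k = f(x_k) + eps_k with eps_k ~ N(0, sigma^2 * g(x_k)^2),
# independently across k.
simulate_normal_sketch <- function(x, f, g, sigma, ...) {
  trend  <- f(x, ...)           # E(Y_k | x_k)
  spread <- sigma * g(x, ...)   # standard deviation of eps_k
  trend + rnorm(length(x), mean = 0, sd = spread)
}

f <- function(x, b0, b1, b2, ...) b0 + b1 * x^b2
g <- function(x, b3, ...) x^b3

set.seed(1)
x <- 1 + sort(rgamma(5000, shape = 4/9, scale = 108))

# Linear trend, constant spread, illustrative sigma
y <- simulate_normal_sketch(x, f, g, sigma = 50,
                            b0 = 0, b1 = 1, b2 = 1, b3 = 0)
```

With sigma = 0 the sketch returns the trend f(x) itself, which corresponds to a correlation of one in absolute value when f is linear.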

f and g should each return a vector of the same length as x. Their first argument should be x and, apart from x, they should not share any argument names. Both f and g should have the ... argument (see ‘Examples’).
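The reason for these requirements can be sketched as follows, assuming that every extra argument is forwarded to both functions. The helper eval_trend_spread is hypothetical and only mimics that forwarding; it is not a function of the package.

```r
f <- function(x, b0, b1, ...) b0 + b1 * x
g <- function(x, b3, ...) x^b3

# Hypothetical forwarding helper: every extra argument goes to both functions.
eval_trend_spread <- function(x, f, g, ...) {
  list(trend = f(x, ...), spread = g(x, ...))
}

x <- c(1, 4, 9)
out <- eval_trend_spread(x, f, g, b0 = 2, b1 = 3, b3 = 0.5)
# Thanks to `...`, f ignores b3 and g ignores b0 and b1.

# Without `...`, the arguments meant for f are rejected by the spread function.
g_strict <- function(x, b3) x^b3
res <- try(eval_trend_spread(x, f, g_strict, b0 = 2, b1 = 3, b3 = 0.5),
           silent = TRUE)
inherits(res, "try-error")  # TRUE: b0 and b1 are unused arguments for g_strict
```

Under the same assumption, an argument name used by both f and g would receive the same value in both, which is why the names should be distinct.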

Note that Sigma determines the degree of association between x and y: the larger Sigma is, the smaller the correlation rho, and vice versa. For this reason, only one of them should be specified. If both are specified, Sigma is ignored.
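One way to see this relationship is to treat x_{k} as a draw from the population distribution of x and apply the laws of total variance and covariance to the model. This is a derivation under that assumption, not necessarily the computation the package performs.

```latex
\begin{aligned}
Var(Y)    &= Var(f(x)) + \sigma^{2} E(g^{2}(x)), \\
Cov(x, Y) &= Cov(x, f(x)), \\
\rho      &= \frac{Cov(x, f(x))}{\sqrt{Var(x)\,[Var(f(x)) + \sigma^{2} E(g^{2}(x))]}}, \\
\sigma^{2} &= \frac{Var(f(x))}{E(g^{2}(x))}
              \left( \frac{Cor(x, f(x))^{2}}{\rho^{2}} - 1 \right).
\end{aligned}
```

The third line shows that |\rho| decreases as \sigma grows. The last line is nonnegative only when |\rho| \le |Cor(x, f(x))|.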

Depending on the trend function f, some correlations cannot be reached. In those cases, Sigma will automatically be set to zero, dist will automatically be set to 'normal' and rho will be ignored (see ‘Examples’).
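Because the error term is uncorrelated with x, adding noise can only lower the absolute correlation from its noise-free value |cor(x, f(x))|. The sketch below computes that bound for two trends. The helper max_abs_rho is hypothetical and not part of the package.

```r
# Hypothetical helper, not part of optimStrat: the absolute correlation
# between x and the noise-free trend f(x).
max_abs_rho <- function(x, f, ...) abs(cor(x, f(x, ...)))

f <- function(x, b0, b1, b2, ...) b0 + b1 * x^b2

set.seed(1)
x <- 1 + sort(rgamma(5000, shape = 4/9, scale = 108))

max_abs_rho(x, f, b0 = 0, b1 = 1, b2 = 1)  # linear trend: the bound is 1
max_abs_rho(x, f, b0 = 0, b1 = 1, b2 = 3)  # cubic trend: the bound is below 1
```

A requested rho above the bound for the chosen trend falls into the case described above.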

If the trend term takes negative values, dist will be automatically set to 'normal'.
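A gamma distribution has a strictly positive mean, which is why a negative trend rules it out. The sketch below matches a gamma distribution to a given mean and variance by moments. The helper rgamma_by_moments, the moment-matching parameterization and the value sigma = 50 are assumptions made for illustration, not the package's confirmed method.

```r
# Hypothetical helper, not part of optimStrat: gamma draws with mean mu
# and variance v, using shape = mu^2 / v and scale = v / mu.
rgamma_by_moments <- function(mu, v) {
  stopifnot(all(mu > 0), all(v > 0))  # gamma needs positive mean and variance
  rgamma(length(mu), shape = mu^2 / v, scale = v / mu)
}

f <- function(x, b0, b1, b2, ...) b0 + b1 * x^b2
g <- function(x, b3, ...) x^b3

set.seed(1)
x <- 1 + sort(rgamma(5000, shape = 4/9, scale = 108))
sigma <- 50                              # illustrative value

mu <- f(x, b0 = 0, b1 = 1, b2 = 1)       # positive trend
v  <- (sigma * g(x, b3 = 0))^2           # conditional variance
y  <- rgamma_by_moments(mu, v)

# A decreasing trend is negative here, so the gamma draw is rejected.
mu_neg <- f(x, b0 = 0, b1 = -1, b2 = 1)
res <- try(rgamma_by_moments(mu_neg, v), silent = TRUE)
```

The same check rejects a conditional variance of zero, which is the other situation in which a gamma distribution cannot be used.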

Value

A numeric vector giving the simulated value of y associated with each value in x.

Examples

library(optimStrat)

# Trend f and spread g: x comes first and both accept ... (see 'Details')
f <- function(x, b0, b1, b2, ...) {b0 + b1 * x^b2}
g <- function(x, b3, ...) {x^b3}

x <- 1 + sort(rgamma(5000, shape = 4/9, scale = 108))

# Linear trend and homoscedasticity
y1 <- simulatey(x, f, g, dist = "normal", b0 = 0, b1 = 1, b2 = 1, b3 = 0, rho = 0.90)
y2 <- simulatey(x, f, g, dist = "gamma", b0 = 0, b1 = 1, b2 = 1, b3 = 0, rho = 0.90)

# Linear trend and heteroscedasticity
y3 <- simulatey(x, f, g, dist = "normal", b0 = 0, b1 = 1, b2 = 1, b3 = 1, rho = 0.90)
y4 <- simulatey(x, f, g, dist = "gamma", b0 = 0, b1 = 1, b2 = 1, b3 = 1, rho = 0.90)

# Quadratic trend and homoscedasticity
y5 <- simulatey(x, f, g, dist = "gamma", b0 = 0, b1 = 1, b2 = 2, b3 = 0, rho = 0.80)

# Correlation of minus one
y6 <- simulatey(x, f, g, dist = "normal", b0 = 0, b1 = -1, b2 = 1, b3 = 0, rho = 1)

# Desired correlation cannot be attained
y7 <- simulatey(x, f, g, dist = "normal", b0 = 0, b1 = 1, b2 = 3, b3 = 0, rho = 0.99)

# Negative expectation not possible under gamma distribution
y8 <- simulatey(x, f, g, dist = "gamma", b0 = 0, b1 = -1, b2 = 1, b3 = 0, rho = 1)

# Conditional variance of zero not possible under gamma distribution
y9 <- simulatey(x, f, g, dist = "gamma", b0 = 0, b1 = 1, b2 = 3, b3 = 0, rho = 0.99)

optimStrat documentation built on Aug. 24, 2023, 9:09 a.m.