simulation: Generate Simulated Data

Description

Simulation functions allow the generation of simulated data suitable for model fitting with mixmod or mdmixmod from either a de novo set of parameters or the parameters of a fitted model.

Usage

simulateMixdata(n, distn, params)
simulateMdmixdata(n, distn, params, topology=LC_TOPOLOGY)
## S3 method for class 'mixmod'
simulateFromFit(x, n=x$N, ...)
## S3 method for class 'mdmixmod'
simulateFromFit(x, n=x$N, ...)
## S3 method for class 'mixmod'
resampleFromFit(x, n=x$N,
    replace=TRUE, hidpar=x$params$hidden, ...)
## S3 method for class 'mdmixmod'
resampleFromFit(x, n=x$N,
    replace=TRUE, hidpar=x$params$hidden, topology=x$topology, ...)
simulateCiTypeData(n=10000, par=LC_SIMPAR, topology=LC_TOPOLOGY)

Arguments

n

the length of the data to be simulated, that is, the number of independent simulated observations.

distn

the name of the distribution for simulateMixdata, or the vector of names of distributions for simulateMdmixdata, to be used in generating the observed portion of the data. Must be one of the distribution names (not family names) in LC_FAMILY.

params

the parameters of the complete data (hidden and observed) distributions; a list formatted as the params element of an object of class mixmod or mdmixmod.

topology

one of the model topologies in LC_TOPOLOGY, either "layered" or "chained".

x

an object of class mixmod or mdmixmod.

replace

logical; if TRUE, sample with replacement.

hidpar

hidden data simulation parameters.

par

marginal simulation parameters with the same structure as LC_SIMPAR.

...

currently unused.

Details

simulateMixdata and simulateMdmixdata generate data from de novo parameters, while simulateFromFit generates data from the parameters of a fitted model. resampleFromFit generates hidden data from the parameters of a fitted model and generates the observed data from the data to which that model was fitted. The params argument must be a list formatted in the same way as the params element of a mixmod or mdmixmod object, as returned by mixmod or mdmixmod, respectively. The exception is simulating a multiple data model with a layered topology, that is, simulateMdmixdata(..., topology="layered"), in which case the probz and rprob elements of params$hidden are not required.
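
To make the two-stage idea of complete-data simulation concrete, here is a minimal base-R sketch for a single univariate normal mixture. It is not the lcmix implementation: the function name sketch_simulate_mix is hypothetical, and the code only assumes a params list with hidden$prob, observed$mean and observed$sd, as in the de novo example under Examples. Hidden component labels are drawn first, then each observed value is drawn from the distribution belonging to its label.

```r
# Illustrative sketch only; sketch_simulate_mix is a hypothetical helper,
# not a function exported by lcmix.
sketch_simulate_mix <- function(n, params) {
    K <- length(params$hidden$prob)
    # Stage 1: hidden component labels, drawn with the mixing probabilities
    Y <- sample.int(K, size = n, replace = TRUE, prob = params$hidden$prob)
    # Stage 2: observed values, drawn from the normal distribution
    # whose mean and sd belong to each observation's hidden label
    X <- rnorm(n, mean = params$observed$mean[Y], sd = params$observed$sd[Y])
    list(X = X, Y = Y)
}

set.seed(123)
params <- list(hidden = list(prob = c(0.1, 0.9)),
               observed = list(mean = c(1, -1), sd = c(1, 2)))
sim <- sketch_simulate_mix(1000, params)
table(sim$Y) / length(sim$Y)   # empirical proportions, roughly 0.1 and 0.9
```

The returned list mirrors the X (observed) and Y (hidden) elements described under Value, so the hidden labels remain available for evaluating a fitted model against the truth.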

simulateCiTypeData is a convenience function to generate simulated data with performance characteristics similar to those of CiData. The distribution is considerably simplified, but fitted models should show similar results, with either the layered or chained topology being favored depending on the topology chosen for the simulation.

Value

For simulateMixdata and simulateFromFit.mixmod, a list with elements:

X

observed data.

Y

hidden data.

For simulateMdmixdata, simulateFromFit.mdmixmod, and simulateCiTypeData, a list with elements:

X

observed data.

Y

intermediate-level hidden data, that is, observations on Y_1, ..., Y_Z.

Y0

top-level hidden data, that is, observations on Y_0.

Author(s)

Daniel Dvorkin

See Also

mixmod, mdmixmod for specifications of the parameters; constants for distribution names and topologies; mvnorm, mvweisd, weisd, mvgamma for relevant distributions; rocinfo for performance evaluations using complete data.

Examples

## Not run: 

set.seed(123)

SimMarginal <- simulateMixdata(1000, distn="norm",
    params=list(hidden=list(prob=c(0.1, 0.9)),
                observed=list(mean=c(1, -1), sd=c(1, 2))))
SimMarginalFit <- mixmod(SimMarginal$X, 2)
SimMarginalFit
# Normal mixture model ('norm')
# Data 'SimMarginal$X' of length 1000 fitted to 2 components
# Model statistics:
#      iter      llik      qval       bic    iclbic 
#   406.000 -2115.747 -2302.603 -4266.032 -4639.745 
plot(rocinfo(SimMarginalFit, SimMarginal$Y==1))
                 

SimJoint <- simulateCiTypeData(10000) # layered topology
SimJointFits <- lapply(namedList("layered", "chained"), function(top)
    mdmixmod(SimJoint$X, c(2,3,2), topology=top))
SimJointFits
# $layered
# Layered (normal, normal, normal) mixture model ('norm', 'mvnorm', 'norm')
# Data 'SimJoint$X' of size 10000-by-(1,3,1) fitted to 2 (2,3,2) components
# Model statistics:
#       iter       llik       qval        bic     iclbic 
#      55.00  -72885.59  -85368.43 -146176.44 -171142.12 
# 
# $chained
# Chained (normal, normal, normal) mixture model ('norm', 'mvnorm', 'norm')
# Data 'SimJoint$X' of size 10000-by-(1,3,1) fitted to 2 (2,3,2) components
# Model statistics:
#       iter       llik       qval        bic     iclbic 
#      35.00  -72887.69  -82999.68 -146189.84 -166413.83 
SimJointMarginalFits <- marginals(SimJointFits$layered)
SimJointAllFits <- c(SimJointFits, SimJointMarginalFits)
plot(multiroc(SimJointAllFits, SimJoint$Y0==1))

## End(Not run)

lcmix documentation built on May 2, 2019, 6:49 p.m.