Class "Rcpp_Lcm"

Description

This class implements the MCMC sampler for non-parametric imputation of discrete multivariate data described in Manrique-Vallier and Reiter (2014). It provides methods for updating and monitoring the sampler.

Details

Rcpp_Lcm objects should be created with CreateModel. Please see the examples in the demo folder for a more detailed explanation of model fitting and parameter tracing.

Extends

Class "C++Object", directly.

All reference classes extend and inherit methods from "envRefClass".

Fields

CurrentIteration:

the total number of iterations that have been run so far.

EnableTracer:

to check tracer status or to enable/disable the tracer.

MCZ:

the disjointed structural zero matrix.

snapshot:

retrieve a list with the current state of all the parameters in the sampler, including the imputed sample. Accessing "snapshot" returns a list with the following components:

alpha:

the concentration parameter of the stick breaking prior.

k_star:

the effective number of latent classes (mixture components).

Nmis:

the size of the augmented sample.

nu:

a vector with the mixture weights

z:

a matrix with the current latent class assignment of each member of the sample

ImputedX:

the current raw imputed dataset. Use GetDataFrame to convert the raw data to a data frame of factors as defined in the input data set.

psi:

The conditional multinomial probabilities. An Lmax * K * J array, where Lmax is the maximum number of levels of all discrete factors in the dataset, K is the number of latent classes, and J is the number of factors in the dataset. Since variables might have different numbers of levels, unused entries in the first dimension are filled with NAs to complete Lmax.
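
The NA padding in psi is easiest to see on a toy array. The following is a minimal base R sketch, not output from the sampler: the dimensions, the random probabilities and the helper get_probs are all hypothetical and only mimic the Lmax * K * J layout described above.

```r
# Hypothetical toy setup: J = 3 factors with 2, 4 and 3 levels, K = 2 latent classes.
levels_per_factor <- c(2, 4, 3)
J <- length(levels_per_factor)
K <- 2
Lmax <- max(levels_per_factor)

set.seed(1)
# Start from an all-NA array so unused level slots stay NA.
psi <- array(NA_real_, dim = c(Lmax, K, J))
for (j in seq_len(J)) {
  L <- levels_per_factor[j]
  for (k in seq_len(K)) {
    p <- runif(L)
    psi[seq_len(L), k, j] <- p / sum(p)  # probabilities over the L levels of factor j in class k
  }
}

# Hypothetical helper: probabilities of factor j in class k, with the NA padding dropped.
get_probs <- function(psi, j, k) {
  p <- psi[, k, j]
  p[!is.na(p)]
}

probs <- get_probs(psi, j = 1, k = 2)
length(probs)  # number of levels of factor 1
sum(probs)     # sums to 1 up to floating point error
```

With a real model, the same indexing would be applied to the psi component of the snapshot list, assuming it follows the layout described above.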

traceable:

list of model parameters that can be traced by the tracer.

traced:

list of model parameters that are traced.

Methods

SetTrace(paralist,num_of_iterations):

set parameters to be traced.

paralist:

a list of parameters to be traced.

num_of_iterations:

the maximum number of traced iterations.

Run(burnin, iter, thinning):

run MCMC iterations.

burnin:

number of burn-in iterations.

iter:

number of MCMC iterations.

thinning:

thinning parameter.
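
As a generic illustration of what burn-in and thinning mean, the following base R sketch counts which iterations of a chain would be retained. It does not call the sampler, and it rests on an assumption this page does not confirm: that iter counts post-burn-in iterations before thinning and that every thinning-th one of them is kept. The numbers are arbitrary.

```r
# Hypothetical settings, mirroring the argument names of Run(burnin, iter, thinning).
burnin <- 100
iter <- 1000
thinning <- 10

# Index every iteration of the chain, then drop the burn-in portion.
all_iters <- seq_len(burnin + iter)
post_burnin <- all_iters[all_iters > burnin]

# Keep every thinning-th post-burn-in iteration (assumed bookkeeping, see above).
kept <- post_burnin[seq(thinning, length(post_burnin), by = thinning)]

length(kept)  # number of retained draws under this assumption
head(kept)    # first few retained iteration indices
```

Under these assumptions, the second argument of SetTrace would need to be at least length(kept) for every retained draw to fit in the trace.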

Resume():

resume from an interrupted call to the Run method.

Parameters(paralist):

retrieve a selected list of model parameters from the last MCMC iteration.

paralist:

a list of parameters to be retrieved.

GetTrace():

retrieve all traced iterations. Returns a list with all the parameters set using the method SetTrace(). See the "snapshot" entry under Fields for a description of the parameters.
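
The methods above can be combined into a tracing workflow along the following lines. This is a sketch under stated assumptions rather than a reference usage: it assumes that paralist can be given as a character vector of names from "traceable", that EnableTracer can be assigned as well as read, and that the NYMockexample data set provides X and MCZ as in the Examples section. The iteration counts and the choice of traced parameters are arbitrary. The block is guarded so that it does nothing when the package is not installed.

```r
# Hypothetical choice of parameters to trace; names taken from the snapshot components.
trace_params <- c("k_star", "alpha")

if (requireNamespace("NPBayesImpute", quietly = TRUE)) {
  library(NPBayesImpute)
  data("NYMockexample")  # assumed to provide X and MCZ, as in the Examples section

  model <- CreateModel(X, MCZ, 10, 10000, 0.25, 0.25)

  print(model$traceable)            # parameters the tracer can record
  model$SetTrace(trace_params, 50)  # keep at most 50 traced iterations
  model$EnableTracer <- TRUE        # assumption: the field is assignable
  model$Run(10, 100, 2)             # burnin, iter, thinning

  trace <- model$GetTrace()         # list with one entry per traced parameter
  str(trace)

  # Parameters from the last iteration only, without going through the tracer.
  last <- model$Parameters(trace_params)
  str(last)
}
```

Tracing only scalar parameters such as k_star and alpha keeps the trace small; adding psi or ImputedX stores a full array or data matrix per traced iteration.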

References

Manrique-Vallier, D. and Reiter, J.P. (2013), "Bayesian Estimation of Discrete Multivariate Latent Structure Models with Structural Zeros", JCGS.

Si, Y. and Reiter, J.P. (2013), "Nonparametric Bayesian multiple imputation for incomplete categorical variables in large-scale assessment surveys", Journal of Educational and Behavioral Statistics, 38, 499-521.

Manrique-Vallier, D. and Reiter, J.P. (2014), "Bayesian Multiple Imputation for Large-Scale Categorical Data with Structural Zeros", Survey Methodology.

Examples

require(NPBayesImpute)
#Please use NYexample data set for a more realistic example
data('NYMockexample')

#create the model
model <- CreateModel(X,MCZ,10,10000,0.25,0.25)

#run 1 burn-in iteration and 2 MCMC iterations, thinning every 2 iterations
model$Run(1,2,2)

#retrieve parameters from the final iteration
result <- model$snapshot

#convert ImputedX matrix to dataframe, using proper factors/names etc.
ImputedX <- GetDataFrame(result$ImputedX,X)
#View(ImputedX)