This function takes a categorical dataset as input (categories can be denoted by numbers) and returns a list of objects that will be used by the 'multilevelLCMI' function to perform the imputations.
A package for the multiple imputation of single-level and nested categorical data by means of Bayesian Multilevel Latent Class models.
dat |
Raw (categorical) data frame with missing data. It can also be a data matrix. The |
GID |
Group (level-2 unit) indicator (expressed as column number corresponding to the group ID in the dataset). It can be omitted in single-level datasets. |
UID |
Lower-level unit indicator (expressed as column number corresponding to the unit ID in the dataset). Optional. |
var2 |
Higher-level (group-specific) variables (expressed as a vector of column numbers in the dataset corresponding to the variables measured at the higher levels). Optional. |
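Because GID, UID and var2 are given as column numbers rather than column names, it can be convenient to look the positions up by name first. The following is a minimal base-R sketch; the data frame `toy` and all of its column names are hypothetical and are not part of the package.

```r
# Hypothetical toy data; column names are illustrative assumptions only.
toy <- data.frame(
  school = c(1, 1, 2, 2),
  pupil  = 1:4,
  item1  = factor(c("a", NA, "b", "a")),
  item2  = factor(c("yes", "no", NA, "yes")),
  sector = factor(c("public", "public", "private", "private"))
)

# Look up column positions by name.
gid_col  <- match("school", names(toy))   # group (level-2 unit) indicator
uid_col  <- match("pupil",  names(toy))   # lower-level unit indicator
var2_col <- match("sector", names(toy))   # level-2 variable(s)

# These positions could then be supplied as GID, UID and var2, e.g.
# convData(toy, GID = gid_col, UID = uid_col, var2 = var2_col)
```

Looking positions up by name keeps the call valid if columns are later reordered.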
Convert a raw categorical dataset with missing data into one ready to be imputed with the multilevelLCMI function. In particular, the function transforms factor variables into numeric ones, where each number denotes a different category. A coding list is returned along with the converted dataset.
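To illustrate the kind of recoding described here, the sketch below turns factors into integer scores and builds a coding list in base R. It is only an approximation of the idea under stated assumptions: the data frame `toy` and the objects `converted`, `coding` and `n_cat` are hypothetical, and the code does not reproduce the package's internal implementation.

```r
# Hypothetical data with missing values; not taken from the package.
toy <- data.frame(
  colour = factor(c("red", "blue", NA, "red")),
  size   = factor(c("S", NA, "L", "M"), levels = c("S", "M", "L"))
)

# Recode each factor as integers 1..k; missing values stay NA.
converted <- as.data.frame(lapply(toy, function(x) as.integer(factor(x))))

# Coding list: one table per variable mapping new scores to original labels.
coding <- lapply(toy, function(x) {
  lev <- levels(factor(x))
  data.frame(new = seq_along(lev), original = lev)
})

# Number of observed categories per variable.
n_cat <- vapply(coding, nrow, integer(1))
```

Keeping the coding tables next to the recoded data is what allows imputed scores to be mapped back to the original labels afterwards.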
'BMLCimpute' allows researchers and users of categorical datasets with missing data to perform Multiple Imputation via Bayesian latent class models.
Data can be either single- or multi-level. Model estimation and imputations are implemented via a Gibbs sampler run with the Rcpp package interface.
The function multilevelLCMI performs the imputations. Prior to the imputation step, data should be processed with the function convData; the resulting list is then passed as input to multilevelLCMI. Complete datasets are obtained via the compData function.
A convData object, a list containing the following items:
convDat |
The converted dataset |
codLev1 |
List containing the new (and original) scores which will be used for the imputations (Level-1 variables). |
nCatLev1 |
Vector containing the number of categories observed for each variable (Level-1 variables). |
codLev2 |
List containing the new (and original) scores which will be used for the imputations (Level-2 variables). |
nCatLev2 |
Vector containing the number of categories observed for each variable (Level-2 variables). |
GroupIDs |
Matrix containing original and new Group IDs. |
GID |
The column number of the Group ID (as entered by the user). |
UID |
The column number of the Unit ID (as entered by the user). |
var2 |
The column numbers for level-2 variables (as entered in the input). |
doVar2 |
Boolean. Shall the BMLC model impute variables at level-2? (Result of |
namesLev1 |
Vector of variable names (level-1 variables). |
namesLev2 |
Vector of variable names (level-2 variables). |
GroupName |
Group ID variable name. |
CaseName |
Unit ID variable name. |
caseID |
Unit ID vector (re-permuted). |
sort_ |
Vector containing the original permutation of the dataset rows. |
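Storing the row permutation makes it possible to return a reordered dataset to its original row order. The base-R sketch below shows the general idea; the data frame `toy` and the objects `perm`, `sorted` and `restored` are hypothetical, and how the package itself builds and uses sort_ is not shown here.

```r
# Hypothetical rows arriving in arbitrary group order.
toy <- data.frame(group = c(2, 1, 2, 1), y = c("a", "b", "c", "d"))

# Sort rows by group and remember the permutation that was applied.
perm   <- order(toy$group)   # original row positions, in sorted order
sorted <- toy[perm, ]

# Undo the sort: order(perm) is the inverse permutation.
restored <- sorted[order(perm), ]
```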
multilevelLCMI for the imputations and model estimation (internally calls Rcpp code);
convData for data preparation (preprocessing);
compData for dataset completion.
D. Vidotto <d.vidotto@uvt.nl>
BMLCimpute : Bayesian Multilevel Latent Class Models for the Multiple Imputation of Nested Categorical Data
## Not run:
library(BMLCimpute)
# Load data
data(simul_incomplete)
# Preprocess the Data
cd <- convData(simul_incomplete, GID = 1, UID = 2, var2 = 8:12)
# Model Selection
set.seed(1)
mmLC <- multilevelLCMI( convData = cd, L = 10, K = 10, it1 = 1000, it2 = 3000, it3 = 100,
it.print = 250, v = 10, I = 0, pri2 = 1 / 10, pri1 = 1 / 15, priresp = 0.01,
priresp2 = 0.01, random = TRUE, estimates = FALSE, count = TRUE, plot.loglik = FALSE,
prec = 3, scale = 1.0)
# Select posterior maxima of the number of classes for the imputations
# (Other alternatives are possible, such as posterior modes or posterior quantiles)
L <- max(which(mmLC[[12]] != 0))
K <- max(apply(mmLC[[13]], 1, function(x) max(which(x != 0))), na.rm = TRUE)
# Perform 5 imputations on the dataset
mmLC <- multilevelLCMI( convData = cd, L = L, K = K, it1 = 2000, it2 = 4000, it3 = 100,
it.print = 250, v = 10, I = 5, pri2 = 500, pri1 = 50, priresp = 0.01, priresp2 = 0.01,
random = TRUE, estimates = FALSE, count = TRUE, plot.loglik = TRUE, prec = 4, scale = 1.0)
# Obtain the dataset completed with the first set of imputations (ind = 1)
complete_data <- compData(convData = cd, implev1 = mmLC[[1]], implev2 = mmLC[[2]], ind = 1)
## End(Not run)
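The example above selects the posterior maximum of the number of classes and notes that posterior modes or quantiles are possible alternatives. The sketch below shows how these three summaries could be computed from a vector of counts. It assumes that element l of the vector holds the number of iterations in which l classes were occupied; that layout, the vector `counts` and its values are hypothetical and are not confirmed package output.

```r
# Hypothetical counts: element l = number of retained iterations in which
# l classes were occupied (assumed layout, not taken from package output).
counts <- c(0, 0, 120, 640, 210, 30, 0, 0, 0, 0)

L_max  <- max(which(counts != 0))   # posterior maximum, as in the example
L_mode <- which.max(counts)         # posterior mode

# Posterior quantile: smallest l whose cumulative probability reaches 0.9.
post  <- counts / sum(counts)
L_q90 <- which(cumsum(post) >= 0.9)[1]
```

The maximum is the most generous choice of the three, while the mode and the quantile give smaller class counts when large values were visited only rarely.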