fmi | R Documentation |
This function estimates the Fraction of Missing Information (FMI) for summary statistics of each variable, using either an incomplete data set or a list of imputed data sets.
fmi(data, method = "saturated", group = NULL, ords = NULL, varnames = NULL, exclude = NULL, fewImps = FALSE)
data |
Either a single |
method |
character. If |
group |
character. The optional name of a grouping variable, to request FMI in each group. |
ords |
character. Optional vector of names of ordered-categorical
variables, which are not already stored as class |
varnames |
character. Optional vector of variable names, to calculate
FMI for a subset of variables in |
exclude |
character. Optional vector of variable names to exclude from the analysis. |
fewImps |
logical. If |
The function estimates a saturated model with lavaan for a single
incomplete data set using FIML, or with lavaan.mi for a list of imputed
data sets. If method = "saturated", FMI will be estimated for all summary
statistics, which can take a long time with large data sets. If
method = "null", FMI will only be estimated for univariate statistics
(e.g., means, variances, thresholds). The saturated model gives more
reliable estimates, so with a large data set it can help to request a
subset of variables rather than switch to the null model.
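For imputed data, the quantity being estimated can be illustrated with the textbook pooling rules from Rubin (1987). The sketch below is base R only and handles a single scalar estimate. It is not taken from the fmi source and may differ from what fmi computes internally. The names rubin_fmi, est, and se2 are hypothetical, and the input numbers are made up purely for illustration.

```r
# Textbook Rubin (1987) pooling for ONE scalar estimate; illustrative only.
# `rubin_fmi`, `est`, and `se2` are hypothetical names, not part of semTools.
rubin_fmi <- function(est, se2) {
  m <- length(est)                     # number of imputed data sets
  stopifnot(m >= 2, length(se2) == m)
  W <- mean(se2)                       # within-imputation variance
  B <- var(est)                        # between-imputation variance
  Tvar <- W + (1 + 1/m) * B            # total variance of the pooled estimate
  r <- (1 + 1/m) * B / W               # relative increase in variance
  lambda <- (1 + 1/m) * B / Tvar       # share of variance due to missingness
  df <- (m - 1) * (1 + 1/r)^2          # Rubin's degrees of freedom
  fmi <- (r + 2 / (df + 3)) / (r + 1)  # df-adjusted fraction of missing information
  list(pooled = mean(est), total.var = Tvar, lambda = lambda, fmi = fmi)
}

# Made-up point estimates and squared standard errors from m = 3 imputations
res <- rubin_fmi(est = c(1, 2, 3), se2 = c(1, 1, 1))
res$lambda
res$fmi
```

With only a few imputations the between-imputation variance B is itself noisy, so both quantities should be read as rough estimates.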
fmi returns a list with at least 2 of the following:
Covariances |
A list of symmetric matrices: (1) the estimated/pooled
covariance matrix, or a list of group-specific matrices (if applicable) and
(2) a matrix of FMI, or a list of group-specific matrices (if applicable).
Only available if |
Variances |
The
estimated/pooled variance for each numeric variable. Only available if
|
Means |
The estimated/pooled mean for each numeric variable. |
Thresholds |
The estimated/pooled threshold(s) for each ordered-categorical variable. |
message |
A message indicating caution when the null model is used. |
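As a minimal sketch of how the returned list might be inspected, the code below assumes semTools and lavaan are installed and uses only the component names listed above. The internal layout of each component is not assumed, so str() is used to show it rather than indexing into it.

```r
library(lavaan)    # provides HolzingerSwineford1939
library(semTools)  # provides fmi()

# Same incomplete data as in the examples: remove the lowest 30% of x5
HSMiss <- HolzingerSwineford1939[ , c(paste0("x", 1:9), "school")]
HSMiss$x5 <- ifelse(HSMiss$x5 <= quantile(HSMiss$x5, .3), NA, HSMiss$x5)

out <- fmi(HSMiss, exclude = "school")

names(out)                             # which components were returned
out$Means                              # estimated means for numeric variables
str(out$Covariances, max.level = 1)    # layout of the covariance component
```

Checking names(out) first is a cautious habit here, because which components appear depends on the method and on the variable types.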
Mauricio Garnier Villarreal (University of Kansas; mauricio.garniervillarreal@marquette.edu)
Terrence Jorgensen (University of Amsterdam; TJorgensen314@gmail.com)
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York, NY: Wiley.
Savalei, V. & Rhemtulla, M. (2012). On obtaining estimates of the fraction of missing information from full information maximum likelihood. Structural Equation Modeling, 19(3), 477–494. doi: 10.1080/10705511.2012.687669
Wagner, J. (2010). The fraction of missing information as a tool for monitoring the quality of survey data. Public Opinion Quarterly, 74(2), 223–243. doi: 10.1093/poq/nfq007
HSMiss <- HolzingerSwineford1939[ , c(paste("x", 1:9, sep = ""),
                                      "ageyr","agemo","school")]
set.seed(12345)
HSMiss$x5 <- ifelse(HSMiss$x5 <= quantile(HSMiss$x5, .3), NA, HSMiss$x5)
age <- HSMiss$ageyr + HSMiss$agemo/12
HSMiss$x9 <- ifelse(age <= quantile(age, .3), NA, HSMiss$x9)

## calculate FMI (using FIML, provide partially observed data set)
(out1 <- fmi(HSMiss, exclude = "school"))
(out2 <- fmi(HSMiss, exclude = "school", method = "null"))
(out3 <- fmi(HSMiss, varnames = c("x5","x6","x7","x8","x9")))
(out4 <- fmi(HSMiss, group = "school"))

## Not run:
## ordered-categorical data
data(datCat)
lapply(datCat, class)

## impose missing values
set.seed(123)
for (i in 1:8) datCat[sample(1:nrow(datCat), size = .1*nrow(datCat)), i] <- NA

## impute data m = 3 times
library(Amelia)
set.seed(456)
impout <- amelia(datCat, m = 3, noms = "g", ords = paste0("u", 1:8), p2s = FALSE)
imps <- impout$imputations

## calculate FMI, using list of imputed data sets
fmi(imps, group = "g")
## End(Not run)