View source: R/FHDI_CellProb.R
Calculate the joint cell probabilities for multivariate missing data using the expectation-maximization (EM) algorithm. This package is partially supported by the NSF grant CSSI 1931380.
FHDI_CellProb(datz, w=NULL, id=NULL)
| datz | multivariate incomplete categorical data prepared by the cell collapsing and merging algorithm. | 
| w | sampling weight: a scalar or a vector of length nrow_y (the number of rows of the data). Default = 1.0 if NULL. | 
| id | index for each unit. Default = 1:nrow_y if NULL. | 
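The defaults described above can be spelled out in a few lines of base R. This is only a sketch of what the argument table states: `expand_defaults` is a hypothetical helper, not a function of the FHDI package, and the package may resolve its defaults differently internally.

```r
# Hypothetical helper (not part of FHDI): makes the documented defaults explicit.
expand_defaults <- function(datz, w = NULL, id = NULL) {
  n <- nrow(datz)
  if (is.null(w)) w <- 1.0               # documented default weight
  if (length(w) == 1) w <- rep(w, n)     # a scalar weight applies to every unit
  if (is.null(id)) id <- seq_len(n)      # documented default index 1:nrow
  stopifnot(length(w) == n, length(id) == n)
  list(w = w, id = id)
}

# Four units, no weights or ids supplied
args <- expand_defaults(matrix(1, nrow = 4, ncol = 2))
str(args)
```

Under this reading, calling `FHDI_CellProb(datz)` would be equivalent to passing a weight of 1 for every unit and ids running from 1 to the number of rows.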
The joint cell probabilities are estimated with the EM-by-weighting method. The algorithm computes the maximum likelihood estimates of the joint cell probabilities under the missing-at-random (MAR) assumption. Note that variable reduction with the sure independence screening method (ver. >= 1.4) is not applicable to a separate CellProb task. The input incomplete categorical data should be generated by FHDI_CellMake with the cell collapsing and merging algorithm.
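To make the idea concrete, here is a minimal base-R sketch of an EM-by-weighting iteration for categorical data with missing items. It is an illustration under stated assumptions, not the FHDI source code: the function name `em_cell_prob`, its arguments `tol` and `max_iter`, and the choice to use the distinct fully observed rows as candidate cells are all assumptions made for this example.

```r
# Illustrative sketch only; em_cell_prob is a hypothetical name, not an FHDI function.
em_cell_prob <- function(z, w = NULL, tol = 1e-8, max_iter = 500) {
  z <- as.matrix(z)
  n <- nrow(z)
  if (is.null(w)) w <- 1
  w <- rep_len(w, n)

  # Assumption: candidate cells are the distinct fully observed rows.
  cells <- unique(z[stats::complete.cases(z), , drop = FALSE])
  n_cells <- nrow(cells)

  # compat[i, j] is TRUE when cell j agrees with unit i on every observed item.
  compat <- matrix(TRUE, n, n_cells)
  for (v in seq_len(ncol(z))) {
    agree <- outer(z[, v], cells[, v], "==")
    agree[is.na(agree)] <- TRUE          # a missing item matches any category
    compat <- compat & agree
  }
  if (any(rowSums(compat) == 0)) stop("a unit matches no fully observed cell")

  prob <- rep(1 / n_cells, n_cells)
  for (iter in seq_len(max_iter)) {
    # E-step: fractional weights over compatible cells, proportional to prob.
    frac <- sweep(compat, 2, prob, "*")
    frac <- frac / rowSums(frac)
    # M-step: weighted mean of the fractional weights.
    new_prob <- colSums(w * frac) / sum(w)
    converged <- max(abs(new_prob - prob)) < tol
    prob <- new_prob
    if (converged) break
  }
  # Simple naming by pasting category codes (unambiguous only for codes 1 to 9).
  names(prob) <- apply(cells, 1, paste, collapse = "")
  prob
}

# Two categorical variables; the last two units each have one missing item.
z <- rbind(c(1, 1), c(1, 2), c(2, 2), c(1, NA), c(NA, 2))
p <- em_cell_prob(z)
print(p)
```

Units with missing items spread their weight over every cell that agrees with their observed values, so they influence the estimate without being assigned to a single cell. With fully observed data the sketch reduces to weighted cell frequencies.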
| cellpr | table of the joint cell probability. The name of a cell is linked to the user-defined categories in "k": e.g., name "325" denotes 3rd, 2nd, 5th categories for three variables, respectively, whereas "a1c" denotes 10th, 1st, 12th categories. | 
| w | reprint of the sampling weights "w" initially defined by the user. | 
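The cell naming convention can be decoded with a small base-R helper. This is a sketch: `decode_cell` is a hypothetical function, not part of FHDI, and it assumes that the letters continue in alphabetical order past the documented examples ("a" = 10 and "c" = 12), so that "b" = 11, "d" = 13, and so on.

```r
# Hypothetical helper (not part of FHDI): turn a cell name into category indices.
decode_cell <- function(name) {
  # Assumed symbol table: "1".."9" map to 1..9, "a".."z" map to 10..35.
  symbols <- c(as.character(1:9), letters)
  match(strsplit(name, "")[[1]], symbols)
}

decode_cell("325")  # 3 2 5
decode_cell("a1c")  # 10 1 12
```

Applied to `names(result$cellpr)`, a helper like this would give one vector of category indices per cell, which is easier to join back to the categorized data than the compact string.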
Dr. Cho, In Ho (maintainer) icho@iastate.edu
Dr. Kim, Jae Kwang jkim@iastate.edu
Dr. Im, Jong Ho ijh38@yonsei.ac.kr
Yicheng Yang, Graduate Research Assistant
Im, J., Cho, I.H. and Kim, J.K. (2018). FHDI: An R Package for Fractional Hot-Deck Imputation. The R Journal. 10(1), pp. 140-154.
Im, J., Kim, J.K. and Fuller, W.A. (2015). Two-phase sampling approach to fractional hot deck imputation. Proceedings of the Survey Research Methods Section, American Statistical Association, Seattle, WA.
Ibrahim, J.G. (1990). Incomplete data in generalized linear models. Journal of the American Statistical Association 85, 765-769.
### Toy Example ###
# y : four continuous variables (y1 to y4)
# r : response indicators for y (0 = missing, 1 = observed)
set.seed(1345) 
n=100 
rho=0.5 
e1=rnorm(n,0,1) 
e2=rnorm(n,0,1) 
e3=rgamma(n,1,1) 
e4=rnorm(n,0,sd=sqrt(3/2))
y1=1+e1 
y2=2+rho*e1+sqrt(1-rho^2)*e2 
y3=y1+e3 
y4=-1+0.5*y3+e4
r1=rbinom(n,1,prob=0.6) 
r2=rbinom(n,1,prob=0.7) 
r3=rbinom(n,1,prob=0.8) 
r4=rbinom(n,1,prob=0.9)
y1[r1==0]=NA 
y2[r2==0]=NA 
y3[r3==0]=NA 
y4[r4==0]=NA
daty=cbind(y1,y2,y3,y4)
result_CM=FHDI_CellMake(daty, k=5, s_op_cellmake="merging", s_op_merge="fixed")
datz=result_CM$cell
result_CP=FHDI_CellProb(datz)
names(result_CP)