Impute | R Documentation |
This function imputes missing data using one of two methods.
method 'Normal' - Imputes the data assuming that the data come from a multivariate normal distribution with mean mu and covariance sig. If mu or sig is not supplied, its maximum likelihood estimate is used. The imputed values are based on the conditional distribution of the missing values given the observed values, mu, and sig; see Jamshidian and Jalal (2010) for more details.
method 'Dist.Free' - Imputes the data nonparametrically using the method of Srivastava and Dolatabadi (2009). Also see Jamshidian and Jalal (2010).
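To make the conditional distribution behind method 'Normal' concrete, the following is a minimal base R sketch for a single incomplete case. The values of mu, sig, and y are made up for illustration, and the single random draw from the conditional distribution is an assumption: this page only says the imputed values are "based on" that distribution. The sketch is not the package's own code.

```r
# Sketch only: conditional-normal imputation of one incomplete case (base R).
set.seed(1)
mu  <- c(0, 1, 2)                          # illustrative mean vector (assumed)
sig <- matrix(c(1.0, 0.5, 0.2,
                0.5, 2.0, 0.3,
                0.2, 0.3, 1.5), nrow = 3)  # illustrative covariance (assumed)
y <- c(0.4, NA, 1.7)                       # one case; 2nd variable is missing

m <- is.na(y)                              # missing positions
o <- !m                                    # observed positions

# Regression coefficients of missing on observed: sig_mo %*% solve(sig_oo)
B <- sig[m, o, drop = FALSE] %*% solve(sig[o, o, drop = FALSE])

# Mean and covariance of the missing part given the observed part
cond_mean <- mu[m] + as.vector(B %*% (y[o] - mu[o]))
cond_cov  <- sig[m, m, drop = FALSE] - B %*% sig[o, m, drop = FALSE]

# One draw from N(cond_mean, cond_cov); chol() returns R with t(R) %*% R = cond_cov
R <- chol(cond_cov)
draw <- cond_mean + as.vector(t(R) %*% rnorm(sum(m)))

y_imp <- y
y_imp[m] <- draw                           # observed entries are left unchanged
```

Replacing the draw with cond_mean would give a deterministic (conditional mean) imputation instead.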
Impute(data, mu = NA, sig = NA, imputation.method = "Normal", resid = NA)
data |
A matrix consisting of at least two columns. Values must be numerical with missing data indicated by NA. |
mu |
A vector of population means used to impute the data. By default, the maximum likelihood estimate based on the observed data is used. |
sig |
The population covariance matrix used to impute the data. By default, the maximum likelihood estimate based on the observed data is used. |
imputation.method |
'Normal' uses the normal imputation method. 'Dist.Free' uses the nonparametric (distribution-free) method. See Jamshidian and Jalal (2010) and Srivastava and Dolatabadi (2009). |
resid |
User-defined residual vector to be used in place of the residuals proposed by the Srivastava and Dolatabadi (2009) method. |
This routine uses OrderMissing to order the data according to missing data patterns. The output consists of the imputed data both in its original order and in the order produced by OrderMissing.
yimp |
The imputed data set (in the order of the original data) after rows with no data (if any) have been deleted. |
yimpOrdered |
The imputed data set, ordered by OrderMissing according to missing data pattern. |
caseorder |
A mapping of case number indices from OrderedData to the original data. More specifically, the j-th row of the OrderedData is the caseorder[j]-th (the j-th element of caseorder) row of the original data. |
patused |
A matrix indicating the missing data patterns in the data set, using 1's (for observed) and NA's (for missing). |
patcnt |
A vector consisting of the number of cases corresponding to each pattern in patused. |
In the above descriptions, "original data" refers to the input data after deletion of the rows consisting of all NA's (if any).
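The relationship between caseorder, the ordered data, and the original data can be illustrated with a small base R sketch. It groups rows by missing data pattern using a made-up matrix and a string encoding of each pattern. This mimics the idea only and is not the algorithm used by OrderMissing, which may order the patterns differently and reports patused with 1's and NA's.

```r
# Sketch only: group rows by missing-data pattern and track the row mapping.
y <- matrix(c( 1, NA,  3,
               4,  5,  6,
              NA,  8,  9,
               7, NA,  2,
               3,  1,  4), ncol = 3, byrow = TRUE)

# Encode each row's pattern as a string: 1 = observed, 0 = missing
pat <- apply(!is.na(y), 1, function(r) paste(as.integer(r), collapse = ""))

# caseorder[j] is the row of the original data placed at position j
caseorder <- order(pat)
y_ordered <- y[caseorder, , drop = FALSE]

# Number of cases in each pattern, in the same order as the grouped rows
patcnt <- as.vector(table(pat))

# The inverse permutation restores the original row order
y_back <- y_ordered[order(caseorder), , drop = FALSE]
```

Under the mapping described above, the same indexing applied to the function's output would mean that row j of yimpOrdered corresponds to row caseorder[j] of yimp.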
Mortaza Jamshidian, Siavash Jalal, and Camden Jansen
Srivastava, M. S. and Dolatabadi, M. (2009). “Multiple imputation and other resampling scheme for imputing missing observations,” Journal of Multivariate Analysis, 100, 1919-1937, doi:10.1016/j.jmva.2009.06.003.
Jamshidian, M. and Jalal, S. (2010). “Tests of homoscedasticity, normality, and missing at random for incomplete multivariate data,” Psychometrika, 75, 649-674, doi:10.1007/s11336-010-9175-3.
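set.seed(50)  # seed the random number generator for reproducibility
n <- 200      # number of cases
p <- 4        # number of variables
pctmiss <- 0.2  # expected proportion of missing entries
y <- matrix(rnorm(n * p), nrow = n)
# Mark each entry as missing independently with probability pctmiss
missing <- matrix(runif(n * p), nrow = n) < pctmiss
y[missing] <- NA
# Impute with each of the two methods
yimp1 <- Impute(data = y, mu = NA, sig = NA, imputation.method = "Normal", resid = NA)
yimp2 <- Impute(data = y, mu = NA, sig = NA, imputation.method = "Dist.Free", resid = NA)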