imputeCA: Impute contingency table
In missMDA: Handling Missing Values with Multivariate Data Analysis

Description

Imputes the missing entries of a contingency table using Correspondence Analysis (CA). The function can be used as a preliminary step before performing CA on an incomplete dataset.

Usage

imputeCA(X, ncp = 2, threshold = 1e-08, maxiter = 1000, row.sup=NULL, 
     col.sup=NULL, quanti.sup=NULL, quali.sup=NULL)

Arguments

`X`	a data.frame that is a contingency table containing missing values
`ncp`	integer corresponding to the number of dimensions used to predict the missing entries
`threshold`	the threshold for assessing convergence
`maxiter`	integer, maximum number of iterations for the regularized iterative CA algorithm
`row.sup`	a vector indicating the indexes of the supplementary rows
`col.sup`	a vector indicating the indexes of the supplementary columns
`quanti.sup`	a vector indicating the indexes of the quantitative supplementary variables
`quali.sup`	a vector indicating the indexes of the categorical supplementary variables

Details

Impute the missing entries of a contingency table using a regularized iterative CA algorithm. The algorithm first initializes the missing values with random values. It then performs CA on the completed dataset and imputes the missing values with the (regularized) reconstruction formula of order ncp, that is, the fitted matrix computed with ncp components for the (regularized) scores and loadings. These two steps, estimation of the parameters via CA and imputation of the missing values using the (regularized) fitted matrix, are iterated until convergence.
In the regularized algorithm, the singular values of the CA are shrunk.
The number of components ncp used in the algorithm should be small. Using fewer components acts as additional regularization and may therefore be advisable to obtain more stable predictions.
The output of the algorithm can be used as input to the CA function of the FactoMineR package in order to perform CA on an incomplete dataset.
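
The loop described above can be sketched in base R. This is an illustrative sketch only, not the missMDA implementation: the function name iterative_ca_sketch, the toy table, the mean-based initialization (used instead of random values so the result is reproducible), the shrinkage rule (each retained eigenvalue reduced by the mean of the discarded ones), the clipping of negative fitted counts at zero, and the absolute convergence criterion are all assumptions made for the example.

```r
# Sketch of a (regularized) iterative CA imputation; NOT the missMDA code.
iterative_ca_sketch <- function(N, ncp = 2, threshold = 1e-08, maxiter = 1000) {
  N <- as.matrix(N)
  miss <- is.na(N)
  q <- min(dim(N)) - 1                  # maximum number of non-trivial CA dimensions
  stopifnot(ncp >= 1, ncp <= q)
  k <- seq_len(ncp)

  # Assumption: initialize with the mean of the observed counts (not random values)
  N[miss] <- mean(N[!miss])

  for (iter in seq_len(maxiter)) {
    n      <- sum(N)
    P      <- N / n                     # correspondence matrix
    r_mass <- rowSums(P)                # row masses
    c_mass <- colSums(P)                # column masses
    E      <- outer(r_mass, c_mass)     # independence model
    S      <- (P - E) / sqrt(E)         # standardized residuals
    sv     <- svd(S)

    # Assumed shrinkage rule: subtract the mean discarded eigenvalue
    lambda <- sv$d^2
    sigma2 <- if (ncp < q) mean(lambda[(ncp + 1):q]) else 0
    d_shrunk <- ifelse(sv$d[k] > 0, (lambda[k] - sigma2) / sv$d[k], 0)

    # Reconstruction formula of order ncp, mapped back to the count scale
    S_hat <- sv$u[, k, drop = FALSE] %*% diag(d_shrunk, nrow = ncp) %*%
      t(sv$v[, k, drop = FALSE])
    N_hat <- n * (E + sqrt(E) * S_hat)

    # Only the missing cells are updated; observed cells are never changed
    N_new <- N
    N_new[miss] <- pmax(N_hat[miss], 0) # assumption: clip negative fitted counts
    change <- sum((N_new[miss] - N[miss])^2)
    N <- N_new
    if (change < threshold) break
  }
  N
}

# Hypothetical 5 x 4 contingency table with two entries set to missing
tab <- matrix(c(20, 10,  5,  8,
                15, 25, 10,  6,
                 5, 12, 30,  9,
                 7,  9, 11, 22,
                12,  6,  8, 14), nrow = 5, byrow = TRUE)
tab_na <- tab
tab_na[2, 3] <- NA
tab_na[4, 1] <- NA

completed <- iterative_ca_sketch(tab_na, ncp = 1)
```

In this sketch, setting sigma2 to 0 gives the non-regularized variant of the loop. The observed cells of completed are identical to those of tab_na, and only the two missing cells carry predicted values.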

Value

The imputed contingency table; the observed values are kept for the non-missing entries and the missing values are replaced by the predicted ones.

Author(s)

Francois Husson francois.husson@institut-agro.fr and Julie Josse julie.josse@polytechnique.edu

Examples

## Not run: 
library(missMDA)     # provides imputeCA()
library(FactoMineR)  # provides CA()
data(children)

## Impute the contingency table and perform a CA
res.impute <- imputeCA(children, ncp=2)
res.ca <- CA(res.impute)

## End(Not run)

missMDA documentation built on Nov. 17, 2023, 5:07 p.m.