estim_ncpMCA: Estimate the number of dimensions for the Multiple...



Description

Estimate the number of dimensions for the Multiple Correspondence Analysis by cross-validation

Usage

estim_ncpMCA(don, ncp.min=0, ncp.max=5, nbsim=100, pNA=0.05, threshold=1e-4)

Arguments

don

a data.frame of categorical variables, with or without missing entries

ncp.min

integer corresponding to the minimum number of components to test

ncp.max

integer corresponding to the maximum number of components to test

nbsim

number of simulations, i.e. the number of times the cross-validation is repeated

pNA

proportion of missing values added to the data set, expressed as a fraction (the default 0.05 means 5%)

threshold

the threshold for assessing convergence of the iterative imputation algorithm

Details

For the cross-validation, a proportion pNA of the entries is removed at random and predicted with an MCA model using ncp.min to ncp.max dimensions. This process is repeated nbsim times. The number of components that leads to the smallest MSEP (mean squared error of prediction) is retained. Each cell is predicted with the imputeMCA function, that is, with the regularized iterative MCA algorithm.
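The masking step of this procedure can be sketched in base R. This is an illustration only and not the missMDA source code: mask_cells() is a hypothetical helper, the toy data frame is invented, and rounding the number of hidden cells is an assumption. In the procedure described above, the masked table would then be imputed with imputeMCA for each candidate number of components and the predictions compared with the hidden values.

```r
# Sketch only: NOT the missMDA implementation.
# mask_cells() is a hypothetical helper introduced for illustration.
mask_cells <- function(don, pNA = 0.05) {
  observed <- which(!is.na(as.matrix(don)))      # linear indices of observed cells
  n_remove <- round(pNA * length(observed))      # assumption: round to nearest cell count
  hidden   <- observed[sample.int(length(observed), n_remove)]
  rows <- (hidden - 1) %% nrow(don) + 1          # linear index -> row
  cols <- (hidden - 1) %/% nrow(don) + 1         # linear index -> column
  masked <- don
  for (k in seq_along(hidden)) masked[rows[k], cols[k]] <- NA
  list(masked = masked, rows = rows, cols = cols)
}

# Invented toy data: 20 rows, 2 categorical variables, no missing entries
set.seed(1)
toy <- data.frame(a = factor(sample(c("x", "y"), 20, replace = TRUE)),
                  b = factor(sample(c("u", "v", "w"), 20, replace = TRUE)))
out <- mask_cells(toy, pNA = 0.25)
sum(is.na(out$masked))   # 10 cells hidden (25% of 40)
```

Repeating this masking nbsim times and averaging the prediction error for each number of components gives one criterion value per candidate.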

Value

ncp

the number of components retained for the MCA

criterion

the criterion (the MSEP) calculated for each number of components
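As a small illustration of how the two returned elements relate, the sketch below picks the number of components with the smallest criterion value. The numbers are made up and are not output from a real run; the sketch also assumes that the criterion values are ordered from ncp.min upward.

```r
# Made-up MSEP values for illustration only (not from a real run).
# Assumption: entries are ordered from ncp.min to ncp.max.
ncp.min   <- 0
criterion <- c(0.42, 0.35, 0.31, 0.33)   # hypothetical MSEP for ncp = 0, 1, 2, 3

# The retained ncp is the candidate with the smallest MSEP;
# subtract 1 because R indices start at 1 while candidates start at ncp.min.
best_ncp <- ncp.min + which.min(criterion) - 1
best_ncp   # 2
```

Note the offset: with ncp.min = 0, the first criterion value corresponds to zero components, not one.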

Author(s)

Francois Husson husson@agrocampus-ouest.fr and Julie Josse Julie.Josse@agrocampus-ouest.fr

References

Josse, J., Chavent, M., Liquet, B. and Husson, F. (2010). Handling missing values with Regularized Iterative Multiple Correspondence Analysis.

Examples

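## Not run: 
library(missMDA)
data(vnf)
result <- estim_ncpMCA(vnf, ncp.min=0, ncp.max=3, nbsim=100)
result$ncp        # number of components retained for the MCA
result$criterion  # MSEP calculated for each number of components

## End(Not run)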

missMDA documentation built on May 2, 2019, 5:46 p.m.