tdmPrePCA.train: PCA (Principal Component Analysis) for numeric columns in a...


View source: R/tdmPreprocUtils.r

Description

tdmPrePCA.train performs either linear PCA, based on prcomp (which uses SVD), or kernel PCA (either KPCA, KHA or KFA).
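
The relation between prcomp and SVD mentioned above can be illustrated with base R alone. This is a minimal sketch, not the TDMR implementation: the iris data set and the choice of centering and scaling are assumptions made for illustration only.

```r
# Minimal sketch: linear PCA on numeric columns via prcomp, compared with a
# direct SVD. The data set (iris) and center/scale settings are assumptions.
num_cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
X <- as.matrix(iris[, num_cols])

pca <- prcomp(X, center = TRUE, scale. = TRUE)

# prcomp obtains the PCs from an SVD of the centered (and scaled) data matrix
Xs <- scale(X, center = TRUE, scale = TRUE)
sv <- svd(Xs)

# Singular values divided by sqrt(n - 1) give the standard deviations of the PCs
sdev_from_svd <- sv$d / sqrt(nrow(X) - 1)
```

The vector sdev_from_svd should agree with pca$sdev up to numerical precision.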

Usage

tdmPrePCA.train(dset, opts)
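
A call might be prepared as sketched below. Only the entry names of opts are taken from the Arguments section; the data set (iris) and all values are hypothetical, and the call itself is left commented out because it requires the TDMR package.

```r
# Hypothetical example data and settings; only the entry names of opts
# come from this help page.
dset <- iris
opts <- list(
  PRE.PCA          = "linear",   # one of "linear", "kernel", "none"
  PRE.knum         = 0,          # only relevant for PRE.PCA = "kernel"
  PRE.PCA.REPLACE  = TRUE,       # replace the numeric columns by the PCs
  PRE.PCA.npc      = 0,          # no degree-2 monomials
  PRE.PCA.numericV = c("Sepal.Length", "Sepal.Width",
                       "Petal.Length", "Petal.Width")
)

# Sanity checks before the call: valid mode, columns exist and are numeric
stopifnot(opts$PRE.PCA %in% c("linear", "kernel", "none"))
stopifnot(all(opts$PRE.PCA.numericV %in% names(dset)))
stopifnot(all(sapply(dset[opts$PRE.PCA.numericV], is.numeric)))

# With TDMR installed, the call would then be:
# pca <- tdmPrePCA.train(dset, opts)
```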

Arguments

dset

the data frame with training (and test) data.

opts

a list from which the following entries are used:

  • PRE.PCA: ["linear" | "kernel" | "none" ]

  • PRE.knum: if >0 and if PRE.PCA="kernel", take only a subset of PRE.knum records from dset

  • PRE.PCA.REPLACE: [T] if TRUE, replace the original numerical columns with the PCA columns; if FALSE, add the PCA columns to them

  • PRE.PCA.npc: if >0, then add for the first PRE.PCA.npc PCs the monomials of degree 2 (see tdmPreAddMonomials)

  • PRE.PCA.numericV: vector with all column names in dset for which PCA is performed. These columns may contain *numeric* values only.
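
What PRE.PCA.npc asks for can be sketched as follows. The helper add_monomials2 and its column naming are hypothetical; the actual behaviour is defined by tdmPreAddMonomials. The sketch assumes that "monomials of degree 2" covers both the squares and the pairwise products of the first npc PCs.

```r
# Hypothetical helper: append all degree-2 monomials PCi * PCj (i <= j)
# of the first npc columns. Column names like "PC1xPC2" are an assumption.
add_monomials2 <- function(pcs, npc) {
  npc <- min(npc, ncol(pcs))
  out <- pcs
  for (i in seq_len(npc)) {
    for (j in i:npc) {
      new_name <- paste0(names(pcs)[i], "x", names(pcs)[j])
      out[[new_name]] <- pcs[[i]] * pcs[[j]]
    }
  }
  out
}

pcs <- as.data.frame(prcomp(iris[, 1:4])$x)   # columns PC1 ... PC4
ext <- add_monomials2(pcs, npc = 2)           # adds PC1xPC1, PC1xPC2, PC2xPC2
```

Under this assumption, npc PCs yield npc*(npc+1)/2 additional columns, so the number of monomials grows quadratically with PRE.PCA.npc.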

Value

pca, a list with entries:

dset

the input data frame dset with columns numeric.variables replaced or extended (depending on opts$PRE.PCA.REPLACE) by the PCs with names PC1, PC2, ... (in case PRE.PCA=="linear") or with names KP1, KP2, ... (in case PRE.PCA=="kernel") and, optionally, with monomial columns added if PRE.PCA.npc>0. The number of PCs is min(nrow(dset),length(numeric.variables)).

numeric.variables

the new numeric column names (PCs, monomials, and optionally old numericV, if opts$PRE.PCA.REPLACE==F)

pcaList

a list with the items sdev, rotation, center, scale, x as returned from prcomp plus eigval, the eigenvalues for the PCs
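
The structure of the return value for the linear case can be mimicked with base R. This is a rough sketch under stated assumptions, not the TDMR code: the helper pca_train_sketch is hypothetical, monomials are omitted, and eigval is assumed to be the squared standard deviations from prcomp (the eigenvalues of the covariance matrix).

```r
# Hypothetical helper that mimics the documented return structure
# (linear PCA only, no monomial columns).
pca_train_sketch <- function(dset, numericV, replace = TRUE) {
  pc  <- prcomp(dset[, numericV, drop = FALSE])
  pcs <- as.data.frame(pc$x)                  # columns PC1, PC2, ...
  if (replace) {
    # keep the non-PCA columns and append the PCs
    other       <- setdiff(names(dset), numericV)
    new_dset    <- cbind(dset[, other, drop = FALSE], pcs)
    new_numeric <- names(pcs)
  } else {
    # keep everything and append the PCs
    new_dset    <- cbind(dset, pcs)
    new_numeric <- c(names(pcs), numericV)
  }
  pcaList <- list(sdev = pc$sdev, rotation = pc$rotation,
                  center = pc$center, scale = pc$scale, x = pc$x,
                  eigval = pc$sdev^2)         # assumption: eigval = sdev^2
  list(dset = new_dset, numeric.variables = new_numeric, pcaList = pcaList)
}

res_replace <- pca_train_sketch(iris, names(iris)[1:4], replace = TRUE)
res_extend  <- pca_train_sketch(iris, names(iris)[1:4], replace = FALSE)
```

With replace = TRUE the four numeric iris columns are swapped for PC1 to PC4; with replace = FALSE the PCs are appended and the original columns stay in numeric.variables.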

Note

CAUTION: Kernel PCA (opts$PRE.PCA=="kernel") is currently disabled in the code; it *crashes* for a large number of records or a large number of columns.

Author(s)

Wolfgang Konen, FHK, Mar'2011 - Jan'2012

See Also

tdmPrePCA.apply


TDMR documentation built on March 3, 2020, 1:06 a.m.