tuneCluster.spca: Feature Selection Optimization for sPCA method


View source: R/tuneCluster.spca.R

Description

This function identifies the number of features to keep per component, and thus per cluster, in mixOmics::spca by optimizing the silhouette coefficient, which assesses the quality of the clustering.

Usage

tuneCluster.spca(X, ncomp = 2, test.keepX = rep(ncol(X), ncomp), ...)

Arguments

X

numeric matrix (or data.frame) with features in columns and samples in rows

ncomp

integer, number of components to include in the model

test.keepX

integer vector containing the different keepX values to test for block X

...

other parameters to be passed to the sPCA model (see mixOmics::spca)

Details

For each component and for each keepX value, an sPCA model is fitted with these parameters. The clustering is then performed and the silhouette coefficient is calculated for this clustering.
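To make the silhouette step concrete, here is a minimal base R sketch of an average silhouette coefficient for a given clustering. It is not the package's implementation: the helper name mean_silhouette, the toy points, and the use of Euclidean distance are all assumptions made for illustration, and timeOmics may use a different distance and cluster assignment.

```r
# Hypothetical helper (not part of timeOmics): average silhouette width.
# d:  a "dist" object or a square distance matrix
# cl: vector of cluster labels, one per item
mean_silhouette <- function(d, cl) {
  d <- as.matrix(d)
  cl <- as.character(cl)
  groups <- unique(cl)
  s <- vapply(seq_along(cl), function(i) {
    own <- cl == cl[i]
    if (sum(own) == 1) return(0)  # convention: singleton clusters score 0
    # a: mean distance to the other members of the same cluster
    # (d[i, i] is 0, so dividing by n - 1 excludes the item itself)
    a <- sum(d[i, own]) / (sum(own) - 1)
    # b: smallest mean distance to the members of any other cluster
    b <- min(vapply(setdiff(groups, cl[i]),
                    function(g) mean(d[i, cl == g]), numeric(1)))
    (b - a) / max(a, b)
  }, numeric(1))
  mean(s)
}

# Toy data, made up for illustration: two well separated groups of points
pts  <- rbind(c(0, 0), c(0, 1), c(1, 0), c(10, 10), c(10, 11), c(11, 10))
good <- c(1, 1, 1, 2, 2, 2)  # labels that match the two groups
bad  <- c(1, 2, 1, 2, 1, 2)  # labels that mix the two groups

sil_good <- mean_silhouette(dist(pts), good)
sil_bad  <- mean_silhouette(dist(pts), bad)
```

In this sketch the labelling that matches the two groups scores close to 1, while the mixed labelling scores much lower, which is the property the tuning procedure relies on when comparing keepX values.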

We then calculate "slopes" of the silhouette coefficient as a function of keepX, where the keepX values are the coordinates and the silhouette is the intensity. A z-score is assigned to each slope. We then identify the most significant slope, which indicates a drop in the silhouette coefficient and thus a deterioration of the clustering.
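The slope and z-score idea can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's code: the silhouette values are made up, slopes are taken between consecutive keepX values only, and the rule of keeping the last keepX before the sharpest drop is an assumed selection rule.

```r
# Made-up silhouette values for one component (not real output)
keepX <- c(2, 4, 6, 8, 10)
sil   <- c(0.80, 0.79, 0.78, 0.40, 0.38)

# Slope of the silhouette between consecutive keepX values
slopes <- diff(sil) / diff(keepX)

# Standardise the slopes into z-scores
z <- (slopes - mean(slopes)) / sd(slopes)

# The most negative z-score marks the sharpest drop in silhouette
drop_idx <- which.min(z)

# Assumed rule: keep the last keepX tested before that drop
chosen_keepX <- keepX[drop_idx]
```

Here the sharpest drop lies between keepX = 6 and keepX = 8, so the sketch retains 6 features for this component.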

Value

silhouette

silhouette coefficient computed for every tested keepX value on each component

ncomp

number of components included in the model

test.keepX

list of tested keepX values

block

names of blocks

slopes

"slopes" computed from the silhouette coefficient for each keepX, used to determine the best keepX

choice.keepX

best keepX for each component

Examples

library(timeOmics)

# demo data shipped with timeOmics
demo <- suppressWarnings(get_demo_cluster())
X <- demo$X

# tuning
tune.spca.res <- tuneCluster.spca(X = X, ncomp = 2, test.keepX = c(2:10))
keepX <- tune.spca.res$choice.keepX
plot(tune.spca.res)

# final model
spca.res <- mixOmics::spca(X=X, ncomp = 2, keepX = keepX)
plotLong(spca.res)

timeOmics documentation built on Nov. 8, 2020, 10:58 p.m.