linear_SAMMC: Semi-Supervised Adaptive Maximum Margin Criterion
In kisungyou/Rdimtools: Dimension Reduction and Estimation Methods

do.sammc

R Documentation

Semi-Supervised Adaptive Maximum Margin Criterion

Description

Semi-Supervised Adaptive Maximum Margin Criterion (SAMMC) is a semi-supervised variant of AMMC that makes use of both labeled and unlabeled data.

Usage

do.sammc(
  X,
  label,
  ndim = 2,
  type = c("proportion", 0.1),
  preprocess = c("center", "scale", "cscale", "decorrelate", "whiten"),
  a = 1,
  b = 1,
  lambda = 1,
  beta = 1
)

Arguments

`X`	an (n × p) matrix or data frame whose rows are observations and columns represent independent variables.
`label`	a length-n vector of data class labels. Unlabeled observations are marked with `NA`, as in the Examples section.
`ndim`	an integer-valued target dimension.
`type`	a vector specifying how the neighborhood graph is constructed. The following types are supported: `c("knn",k)`, `c("enn",radius)`, and `c("proportion",ratio)`. Default is `c("proportion",0.1)`, which connects each point to about 1/10 of its nearest data points among all data points. See also `aux.graphnbd` for more details.
`preprocess`	an additional option for preprocessing the data. Default is "center". See also `aux.preprocess` for more details.
`a`	tuning parameter for between-class weight in [0,∞).
`b`	tuning parameter for within-class weight in [0,∞).
`lambda`	balance parameter for between-class and within-class scatter matrices in (0,∞).
`beta`	balance parameter for within-class scatter of the labeled data and consistency of the whole data in (0,∞).
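
To make the `type = c("proportion", ratio)` option more concrete, here is a minimal base-R sketch of a proportion-based nearest-neighbor graph. The helper `knn_graph_by_proportion` is hypothetical and not part of Rdimtools, and the rule k = round(ratio * n) is an assumption made for illustration; `aux.graphnbd` may differ in its details.

```r
# Hypothetical helper, NOT part of Rdimtools: builds a symmetric
# nearest-neighbor adjacency matrix where the number of neighbors
# is a proportion of the sample size.
knn_graph_by_proportion <- function(X, ratio = 0.1) {
  X <- as.matrix(X)
  n <- nrow(X)
  # assumed rule: k is about ratio * n, at least 1 and at most n - 1
  k <- min(n - 1, max(1, round(ratio * n)))
  D <- as.matrix(dist(X))              # pairwise Euclidean distances
  A <- matrix(FALSE, n, n)
  for (i in seq_len(n)) {
    nbrs <- order(D[i, ])              # indices sorted from nearest to farthest
    nbrs <- nbrs[nbrs != i][seq_len(k)]  # drop the point itself, keep k nearest
    A[i, nbrs] <- TRUE
  }
  A | t(A)                             # symmetrize: keep an edge if either end chose it
}

set.seed(1)
X <- matrix(rnorm(50 * 3), nrow = 50)  # 50 observations in 3 dimensions
A <- knn_graph_by_proportion(X, ratio = 0.1)
```

With 50 observations and a ratio of 0.1, each point selects 5 neighbors; after symmetrization a point may end up with more than 5 edges.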

Value

a named list containing

Y: an (n × ndim) matrix whose rows are embedded observations.
trfinfo: a list containing information for out-of-sample prediction.
projection: a (p × ndim) matrix whose columns are the basis for projection.
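
As a rough illustration of how a (p × ndim) projection matrix relates to the (n × ndim) embedding, the base-R sketch below centers the data and multiplies by the projection. It assumes the default "center" preprocessing and that the embedding is the centered data times the projection; the helper `project_centered` is hypothetical and not part of Rdimtools, and the matrix `P` is a random stand-in rather than an actual SAMMC output.

```r
# Hypothetical helper, NOT part of Rdimtools: subtract column means
# and apply a linear projection.
project_centered <- function(X_new, center, projection) {
  X_new <- as.matrix(X_new)
  sweep(X_new, 2, center, "-") %*% projection
}

set.seed(1)
X  <- matrix(rnorm(20 * 4), nrow = 20)        # n = 20 observations, p = 4 variables
P  <- qr.Q(qr(matrix(rnorm(4 * 2), 4, 2)))    # stand-in (p x ndim) basis, ndim = 2
mu <- colMeans(X)                             # centering vector taken from the training data
Y  <- project_centered(X, mu, P)              # (n x ndim) embedding
```

The same centering vector would be reused when projecting new observations, so that training and new data share one coordinate system.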

Author(s)

Kisung You

References

\insertRef{lu_adaptive_2011}{Rdimtools}

Examples

## load the package
library(Rdimtools)

## generate three clearly separated groups of data
set.seed(100)
dt1  = aux.gensamples(n=33)-50
dt2  = aux.gensamples(n=33)
dt3  = aux.gensamples(n=33)+50

## merge the data and create a label correspondingly
X      = rbind(dt1,dt2,dt3)
label  = rep(1:3, each=33)

## copy the labels and set 20% of the elements to missing (NA)
nlabel = length(label)
nmissing = round(nlabel*0.20)
label_missing = label
label_missing[sample(1:nlabel, nmissing)]=NA

## try different values of the balance parameter beta
out1 = do.sammc(X, label_missing, beta=0.1)
out2 = do.sammc(X, label_missing, beta=1)
out3 = do.sammc(X, label_missing, beta=10)

## visualize
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(out1$Y, pch=19, col=label, main="SAMMC::beta=0.1")
plot(out2$Y, pch=19, col=label, main="SAMMC::beta=1")
plot(out3$Y, pch=19, col=label, main="SAMMC::beta=10")
par(opar)
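
The label-masking step in the example above can be wrapped in a small reusable function, which makes it easier to try other fractions of missing labels. This is a minimal base-R sketch; `mask_labels` is a hypothetical helper and not part of Rdimtools.

```r
# Hypothetical helper, NOT part of Rdimtools: replace a random
# fraction of the labels with NA to mimic unlabeled observations.
mask_labels <- function(label, frac = 0.2) {
  n   <- length(label)
  idx <- sample.int(n, round(n * frac))  # positions to mask, drawn without replacement
  label[idx] <- NA
  label
}

set.seed(100)
label         <- rep(1:3, each = 33)     # same labels as in the example
label_missing <- mask_labels(label, frac = 0.2)
n_masked      <- sum(is.na(label_missing))
```

Because the masking is purely random, a very large `frac` could leave a class with few or no labeled observations, so it may be worth checking `table(label_missing, useNA = "ifany")` before fitting.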

kisungyou/Rdimtools documentation built on Jan. 2, 2023, 9:55 a.m.