Relevant Dimension Estimation (RDE)

Description

The function estimates the relevant dimension in feature space. By default, this is done by fitting a two-component model (TCM); estimation by leave-one-out cross-validation (LOO-CV) is also available. The function can also calculate a denoised version of the labels and estimate the noise level in the data set.

Usage

rde(K, y,
    est_y = FALSE,
    alldim = FALSE,
    est_noise = FALSE,
    regression = FALSE,
    nmse = TRUE,
    dim_rest = 0.5,
    tcm = TRUE)

Arguments

K

kernel matrix of the inputs (e.g. rbf kernel matrix)

y

label vector which contains the label for each data point

est_y

set this to TRUE if you want a denoised version of the labels

alldim

if TRUE, denoised labels are calculated for all dimensions (instead of only for the relevant dimension)

est_noise

set this to TRUE if you want an estimated noise level

regression

only relevant if one of est_y, alldim, est_noise is TRUE. Set this to TRUE to force the function to treat the data as a regression problem. If left FALSE, the function tries to determine by itself whether this is a classification or a regression problem.

nmse

only relevant if est_noise is TRUE and the function treats the data as a regression problem. If left TRUE, the normalized mean squared error is used for estimating the noise level; otherwise the conventional mean squared error is used.

dim_rest

fraction of leading dimensions to which the search for the relevant dimension should be restricted. This is needed due to numerical instabilities. 0.5 should be a good choice in most cases (and is also the default value)

tcm

indicates whether rde should be done by the TCM algorithm (TRUE, the default) or by the LOO-CV algorithm (FALSE)
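The difference between the two error measures selected by nmse can be illustrated with a small base R sketch. The helper names mse and nmse_var below are hypothetical and not part of the package, and the normalization shown (division by the variance of the labels) is one common convention assumed here for illustration; the package may normalize differently.

```r
# Hypothetical helpers for illustration only; not taken from the package.

# Conventional mean squared error between labels y and denoised labels yh
mse <- function(y, yh) mean((y - yh)^2)

# Normalized variant, ASSUMING normalization by the (population) variance of y
nmse_var <- function(y, yh) mse(y, yh) / mean((y - mean(y))^2)

y  <- c(1, 2, 3, 4, 5)            # toy labels
yh <- c(1.1, 1.9, 3.2, 3.8, 5.0)  # toy denoised labels

mse(y, yh)       # depends on the scale of y
nmse_var(y, yh)  # scale-free: unchanged if y and yh are both multiplied by a constant
```

Under this assumed convention the normalized error makes noise levels comparable across data sets whose labels live on different scales.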

Details

If est_noise or alldim is TRUE, a denoised version of the labels for the relevant dimension is returned even if est_y is FALSE. For example, if you want both denoised labels and a noise estimate, it is enough to set est_noise to TRUE.
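A minimal sketch of this behaviour, reusing the sincdata and rbfkernel helpers from the Examples section. It assumes the package providing these functions is installed and named rde; the call leaves est_y at its default of FALSE.

```r
library(rde)  # assumed package name

d <- sincdata(100, 0.1)  # generate sinc data
K <- rbfkernel(d$X)      # calculate rbf kernel matrix

# est_y is left at FALSE; only the noise estimate is requested
r <- rde(K, d$y, est_noise = TRUE)

names(r)            # according to the text above, this should include both yh and noise
is.null(r$yh)       # expected to be FALSE if yh is returned as described
```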

Value

rd

estimated relevant dimension

err

LOO-CV error or negative log-likelihood value for each dimension, depending on the algorithm used (the position of the minimum is the relevant dimension)

yh

only returned if est_y, alldim or est_noise is TRUE, contains the denoised labels

Yh

only returned if alldim is TRUE, matrix whose columns contain the denoised labels for each dimension

noise

only returned if est_noise is TRUE, contains the estimated noise level

kpc

kernel PCA coefficients

eigvec

eigenvectors of the kernel matrix

eigval

eigenvalues of the kernel matrix

tcm

TRUE if TCM algorithm was used, otherwise (LOO-CV algorithm) FALSE
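To make the relationship between eigvec, eigval, kpc and the denoised labels more concrete, the following base R sketch computes analogous quantities by hand. It does not call the package and is not its implementation: the toy data, the kernel width, and the fixed dimension d are assumptions chosen for illustration, and the projection formulas follow the usual kernel PCA construction described in the cited reference.

```r
# Illustration in base R only; variable names mirror the returned values
# but the computation is a sketch, not the package's code.
set.seed(1)
n <- 50
x <- seq(-4, 4, length.out = n)

# Toy sinc-shaped target with Gaussian noise (hypothetical data)
f <- ifelse(x == 0, 1, sin(pi * x) / (pi * x))
y <- f + rnorm(n, sd = 0.1)

# RBF kernel matrix with an assumed kernel width of 1
K <- exp(-as.matrix(dist(x))^2 / 2)

# Eigendecomposition of the symmetric kernel matrix;
# eigen() returns eigenvalues in decreasing order
e <- eigen(K, symmetric = TRUE)
eigval <- e$values
eigvec <- e$vectors

# Kernel PCA coefficients: the labels expressed in the eigenbasis of K
kpc <- drop(crossprod(eigvec, y))

# Denoised labels for an ASSUMED relevant dimension d:
# projection of y onto the d leading eigenvectors
d <- 5
yh <- drop(eigvec[, 1:d, drop = FALSE] %*% kpc[1:d])

# Mean squared residual between original and denoised labels
mean((y - yh)^2)
```

In this sketch d is fixed by hand; choosing it from the data is the task that rde performs.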

Author(s)

Jan Saputra Mueller

References

M. L. Braun, J. M. Buhmann, K. R. Mueller (2008) On Relevant Dimensions in Kernel Feature Spaces

See Also

rde_loocv, rde_tcm, estnoise, isregression, rbfkernel, polykernel, drawkpc

Examples

## example with sinc data using tcm algorithm
d <- sincdata(100, 0.1) # generate sinc data
K <- rbfkernel(d$X) # calculate rbf kernel matrix
# run rde with the tcm algorithm (default); also return denoised labels and noise
r <- rde(K, d$y, est_y = TRUE, est_noise = TRUE)
r$rd # estimated relevant dimension
r$noise # estimated noise
drawkpc(r) # draw kernel pca coefficients

## example with sinc data using loo-cv algorithm
d <- sincdata(100, 0.1) # generate sinc data
K <- rbfkernel(d$X) # calculate rbf kernel matrix
# run rde with the loo-cv algorithm; also return denoised labels and noise
r <- rde(K, d$y, est_y = TRUE, est_noise = TRUE, tcm = FALSE)
r$rd # estimated relevant dimension
r$noise # estimated noise
drawkpc(r) # draw kernel pca coefficients