Relevant Dimension Estimation (RDE) by Fitting a Two-Component Model (TCM)

Description

The function estimates the relevant dimension in feature space by fitting a two-component model. It can also compute a denoised version of the labels and estimate the noise level in the data set.

Usage

rde_tcm(K, y,
        est_y = FALSE,
        alldim = FALSE,
        est_noise = FALSE,
        regression = FALSE,
        nmse = TRUE,
        dim_rest = 0.5)

Arguments

K

kernel matrix of the inputs (e.g. rbf kernel matrix)

y

label vector which contains the label for each data point

est_y

set this to TRUE if you want a denoised version of the labels

alldim

if this is TRUE, denoised labels are calculated for all dimensions (instead of only for the relevant dimension)

est_noise

set this to TRUE if you want an estimated noise level

regression

only relevant if one of est_y, alldim, est_noise is TRUE. Set this to TRUE to force the function to treat the data as a regression problem. If you leave this FALSE, the function tries to determine on its own whether this is a classification or a regression problem.

nmse

only relevant if est_noise is TRUE and the function treats the data as a regression problem. If you leave this TRUE, the normalized mean squared error is used for estimating the noise level; otherwise the conventional mean squared error is used.
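
To make the difference between the two error measures concrete, here is a minimal sketch in base R. The helper names mse_sketch and nmse_sketch are hypothetical and not part of the package, and normalizing by the variance of the labels is an assumption; the package may normalize differently.

```r
# Hypothetical helpers, not part of the package.
# Conventional mean squared error between labels y and denoised labels yh.
mse_sketch <- function(y, yh) mean((y - yh)^2)

# Normalized MSE; ASSUMPTION: normalization by the (population) variance of y.
nmse_sketch <- function(y, yh) mse_sketch(y, yh) / mean((y - mean(y))^2)

y  <- c(1, 2, 3, 4)
yh <- c(1.5, 1.5, 3.5, 3.5)
mse_sketch(y, yh)   # 0.25
nmse_sketch(y, yh)  # 0.2
```

A normalized error does not depend on the scale of the labels, which makes noise levels easier to compare across data sets.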

dim_rest

fraction of leading dimensions (e.g. 0.5 for 50 percent) to which the search for the relevant dimension is restricted. The restriction is needed because of numerical instabilities. 0.5, the default, should be a good choice in most cases.
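
The following base R sketch illustrates why such a restriction can help: the trailing eigenvalues of an RBF kernel matrix are numerically negligible, so the corresponding dimensions carry mostly rounding error. The kernel width of 1 and the floor() rounding rule for the number of candidate dimensions are assumptions made for illustration, not taken from the package.

```r
set.seed(1)
x <- runif(100, -4, 4)

# RBF kernel matrix with width 1 (ASSUMED parameterization)
K <- exp(-unname(as.matrix(dist(x)))^2 / 2)

# Eigenvalues in decreasing order, relative to the largest one
ev <- eigen(K, symmetric = TRUE, only.values = TRUE)$values
ratio <- abs(ev) / max(ev)
range(ratio[51:100])  # trailing half: close to machine precision

# Candidate dimensions under dim_rest = 0.5 (ASSUMED rounding rule)
dim_rest <- 0.5
candidates <- seq_len(floor(dim_rest * length(x)))
range(candidates)
```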

Details

If est_noise or alldim is TRUE, a denoised version of the labels for the relevant dimension is returned even if est_y is FALSE. For example, if you want both denoised labels and a noise estimate, it is enough to set est_noise to TRUE.

Value

rd

estimated relevant dimension

err

negative log-likelihood for each dimension (the position of the minimum is the relevant dimension)
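
As a rough illustration of how such a per-dimension score can arise, the sketch below models the first d kernel PCA coefficients and the remaining n - d coefficients as two zero-mean Gaussians with separate variances, and scores each d by the resulting negative log-likelihood (additive constants dropped). This follows my reading of the approach in the reference; the function name tcm_nll_sketch is hypothetical and the package's exact computation may differ.

```r
# Hypothetical helper, not part of the package.
# z: vector of kernel PCA coefficients; returns one score per candidate dimension.
tcm_nll_sketch <- function(z, dim_rest = 0.5) {
  n <- length(z)
  dmax <- max(1, min(n - 1, floor(dim_rest * n)))  # ASSUMED restriction rule
  cs <- cumsum(z^2)
  d <- seq_len(dmax)
  s1 <- cs[d] / d                  # variance of the leading d coefficients
  s2 <- (cs[n] - cs[d]) / (n - d)  # variance of the remaining coefficients
  d / n * log(s1) + (n - d) / n * log(s2)
}

# Toy coefficients: 5 large leading entries followed by 95 small ones
z <- c(3, -2.5, 2, -3.5, 2.8, rep(c(0.1, -0.1), length.out = 95))
err <- tcm_nll_sketch(z)
rd <- which.min(err)  # position of the minimum
rd                    # 5 for this toy vector
```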

yh

only returned if est_y, alldim or est_noise is TRUE; contains the denoised labels for the relevant dimension
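
One common way to obtain denoised labels, sketched here in base R, is to project the label vector onto the d leading eigenvectors of the kernel matrix. Whether the package does exactly this is an assumption; the function name denoise_sketch, the classification flag, and the use of sign() for class labels coded as -1 and 1 are hypothetical choices for illustration.

```r
# Hypothetical helper, not part of the package.
denoise_sketch <- function(K, y, d, classification = FALSE) {
  e <- eigen(K, symmetric = TRUE)            # eigenvalues in decreasing order
  U <- e$vectors[, seq_len(d), drop = FALSE] # d leading eigenvectors
  yh <- as.vector(U %*% crossprod(U, y))     # projection of y onto their span
  if (classification) sign(yh) else yh       # ASSUMPTION: labels coded -1 / 1
}

set.seed(1)
x <- sort(runif(50, -4, 4))
y <- sin(pi * x) / (pi * x) + rnorm(50, sd = 0.1)  # noisy sinc data
K <- exp(-unname(as.matrix(dist(x)))^2 / 2)        # RBF kernel, width 1 (assumed)
yh <- denoise_sketch(K, y, d = 5)
```

Using all n eigenvectors reproduces y unchanged, so the denoising effect comes entirely from truncating to the leading dimensions.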

Yh

only returned if alldim is TRUE; matrix of denoised labels with one column per dimension

noise

only returned if est_noise is TRUE, contains the estimated noise level

kpc

kernel pca coefficients

eigvec

eigenvectors of the kernel matrix

eigval

eigenvalues of the kernel matrix
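
The three components above are related through the eigendecomposition of K. The sketch below assumes that the kernel PCA coefficients are the projections of the label vector onto the eigenvectors of the kernel matrix, which matches my reading of the reference; the function name kpca_coef_sketch is hypothetical and the package may scale or order things differently.

```r
# Hypothetical helper, not part of the package.
kpca_coef_sketch <- function(K, y) {
  e <- eigen(K, symmetric = TRUE)  # eigenvalues in decreasing order
  list(eigval = e$values,
       eigvec = e$vectors,
       kpc    = as.vector(crossprod(e$vectors, y)))  # t(eigvec) %*% y
}

set.seed(1)
x <- runif(40, -4, 4)
y <- sin(x) + rnorm(40, sd = 0.1)
K <- exp(-unname(as.matrix(dist(x)))^2 / 2)  # RBF kernel, width 1 (assumed)
res <- kpca_coef_sketch(K, y)
head(abs(res$kpc))  # for smooth targets, leading coefficients tend to dominate
```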

tcm

always TRUE; tells other functions that the TCM method was used

Author(s)

Jan Saputra Mueller

References

M. L. Braun, J. M. Buhmann, K. R. Mueller (2008) On Relevant Dimensions in Kernel Feature Spaces

See Also

rde, rde_loocv, estnoise, isregression, rbfkernel, polykernel, drawkpc

Examples

## example with sinc data
d <- sincdata(100, 0.1) # generate sinc data
K <- rbfkernel(d$X) # calculate rbf kernel matrix
# rde, return also denoised labels and noise
r <- rde_tcm(K, d$y, est_y = TRUE, est_noise = TRUE)
r$rd # estimated relevant dimension
r$noise # estimated noise
drawkpc(r) # draw kernel pca coefficients