Estimate noise level

Description

Estimates the noise level from a label vector 'y' and a denoised version of that label vector, 'yh'. The loss function used for the estimate depends on the kind of problem (regression or classification).

Usage

estnoise(y, yh, regression = FALSE, nmse = TRUE)

Arguments

y

a label vector containing only -1 and 1 for a classification problem, or real numbers for a regression problem

yh

a denoised version of y, which can be obtained with e.g. rde

regression

FALSE in case of a classification problem, TRUE in case of a regression problem

nmse

if 'nmse' is TRUE and this is a regression problem, the mean squared error will be normalized

Details

In case of a classification problem, the 0-1-loss is used to estimate the noise level:

y = (y_1, ..., y_n)

L\_01(y, yh) = (1/n)*sum(y != yh)

In case of a regression problem, the mean squared error (mse) or the normalized mean squared error (nmse) is used, depending on whether 'nmse' is FALSE (mse) or TRUE (nmse):

L\_mse(y, yh) = (1/n)*sum( (y - yh)\^2 )

L\_nmse(y, yh) = L\_mse(y, yh) / ( (1/n)*sum( (y - (1/n)*sum(y))\^2 ) )
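
As an illustration only, the three losses described in this section can be written in a few lines of base R. The helper names below (loss_01, loss_mse, loss_nmse, estnoise_sketch) are hypothetical and are not part of the package. The sketch follows the formulas as stated here and may differ from the actual implementation of estnoise.

```r
# Hypothetical helpers that mirror the formulas in this section;
# they are NOT the package's own code.

# 0-1 loss: fraction of labels that differ between y and yh
loss_01 <- function(y, yh) mean(y != yh)

# mean squared error
loss_mse <- function(y, yh) mean((y - yh)^2)

# normalized mean squared error: mse divided by the
# (1/n-normalized) variance of y around its mean
loss_nmse <- function(y, yh) loss_mse(y, yh) / mean((y - mean(y))^2)

# Assumed dispatch on 'regression' and 'nmse', following the
# argument descriptions above
estnoise_sketch <- function(y, yh, regression = FALSE, nmse = TRUE) {
  if (!regression) return(loss_01(y, yh))
  if (nmse) loss_nmse(y, yh) else loss_mse(y, yh)
}

# Classification: one of four labels differs
y_cls  <- c(1, -1,  1, 1)
yh_cls <- c(1, -1, -1, 1)
estnoise_sketch(y_cls, yh_cls)                                   # 0.25

# Regression: a single residual of 2 on four points
y_reg  <- c(1, 2, 3, 4)
yh_reg <- c(1, 2, 3, 6)
estnoise_sketch(y_reg, yh_reg, regression = TRUE, nmse = FALSE)  # 1
estnoise_sketch(y_reg, yh_reg, regression = TRUE)                # 0.8
```

In the regression example the normalizer is (1/4)*(2.25 + 0.25 + 0.25 + 2.25) = 1.25, so the normalized error is 1/1.25 = 0.8. Because of this normalization, the nmse does not change when y and yh are rescaled by the same factor.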

Value

Estimated noise level

Author(s)

Jan Saputra Mueller

See Also

sincdata, rde_loocv, rde_tcm, rbfkernel, drawkpc

Examples

## estimate noise of sinc data explicitly
d <- sincdata(100, 0.7) # generate sinc data
K <- rbfkernel(d$X) # calculate rbf kernel matrix
r <- rde(K, d$y, est_y = TRUE) # estimate relevant dimension
noise <- estnoise(d$y, r$yh, regression = TRUE) # estimate noise level

## estimate noise of sinc data implicitly (via rde_loocv)
d <- sincdata(100, 0.7) # generate sinc data
K <- rbfkernel(d$X) # calculate rbf kernel matrix
r <- rde(K, d$y, est_noise = TRUE) # estimate relevant dimension AND estimate noise
r$noise # estimated noise level