Computes nonparametric p-values for the potential class memberships of new observations. The p-values are based on 'k nearest neighbors'.
NewX | data matrix consisting of one or several new observations (row vectors) to be classified.
X | matrix containing training observations, where each observation is a row vector.
Y | vector indicating the classes to which the training observations belong.
k | number of nearest neighbors. If k is a vector or NULL, the program searches for the best k (see section 'Details').
distance | the distance measure:
cova | estimator for the covariance matrix:
Computes nonparametric p-values for the potential class memberships of new observations. Precisely, for each new observation NewX[i,] and each class b the number PV[i,b] is a p-value for the null hypothesis that Y[i] = b.
This p-value is based on a permutation test applied to an estimated Bayesian likelihood ratio, using 'k nearest neighbors' with estimated prior probabilities N(b)/n. Here N(b) is the number of observations of class b and n is the total number of observations.
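The two ingredients named above can be sketched in base R. The first part computes the estimated prior probabilities N(b)/n exactly as defined in the text. The second part, `rank_p_value`, is a hypothetical helper showing the generic shape of a permutation-type p-value; it assumes that larger scores speak against class b and is not the package's actual test statistic.

```r
# Toy class labels (hypothetical data, for illustration only).
Y_toy <- factor(c("a", "a", "a", "b", "b", "c"))

n <- length(Y_toy)            # total number of observations
N <- table(Y_toy)             # N(b): number of observations of class b
prior <- as.numeric(N) / n    # estimated prior probabilities N(b)/n
names(prior) <- names(N)

# Hypothetical helper: generic permutation-type p-value.
# Assumption: larger scores speak against membership in class b.
rank_p_value <- function(score_new, scores_class_b) {
  # Count class-b training scores at least as large as the new score,
  # add one for the new observation itself, and divide by N(b) + 1.
  (sum(scores_class_b >= score_new) + 1) / (length(scores_class_b) + 1)
}

p_toy <- rank_p_value(0.5, c(0.1, 0.2, 0.6, 0.9))
```

With this form the smallest attainable p-value for class b is 1/(N(b) + 1), so small classes cannot produce very small p-values.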
If k is a vector, the program searches for the best k. To determine the best k for the p-value PV[i,b], the new observation NewX[i,] is added to the training data with class label b, and then for all training observations with Y[j] != b the proportion of the k nearest neighbors of X[j,] belonging to class b is computed. Then the k which minimizes the sum of these values is chosen.
If k = NULL, it is set to 2:ceiling(length(Y)/2).
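The search for the best k described above can be sketched as follows. This is a minimal base R illustration, not the package's implementation: `choose_k` is a hypothetical name, Euclidean distance is assumed, and a point is assumed not to count as its own neighbor (the text does not say either way).

```r
# Hypothetical helper: pick k for one new observation and one class b.
choose_k <- function(new_x, X, Y, b, k_grid = NULL) {
  X <- as.matrix(X)
  # Default grid taken from the text: 2:ceiling(length(Y)/2).
  if (is.null(k_grid)) k_grid <- 2:ceiling(length(Y) / 2)

  # Add the new observation to the training data with class label b.
  X_aug <- rbind(X, unlist(new_x))
  Y_aug <- c(as.character(Y), b)

  D <- as.matrix(dist(X_aug))   # pairwise Euclidean distances (assumption)
  diag(D) <- Inf                # assumption: a point is not its own neighbor
  others <- which(Y_aug != b)   # training observations with Y[j] != b

  crit <- sapply(k_grid, function(k) {
    sum(sapply(others, function(j) {
      nn <- order(D[j, ])[seq_len(k)]   # k nearest neighbors of X[j, ]
      mean(Y_aug[nn] == b)              # proportion belonging to class b
    }))
  })
  k_grid[which.min(crit)]       # k minimizing the summed proportions
}

# Toy data (hypothetical): two well-separated classes on a line.
X_toy <- matrix(c(0, 0.1, 0.2, 5, 5.1, 5.2), ncol = 1)
Y_toy <- c("a", "a", "a", "b", "b", "b")
k_best <- choose_k(new_x = 0.05, X = X_toy, Y = Y_toy, b = "a")
```

The criterion rewards a k for which observations from other classes have few neighbors labelled b, which is why small k wins when the classes are well separated.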
PV is a matrix containing the p-values. Precisely, for each new observation NewX[i,] and each class b the number PV[i,b] is a p-value for the null hypothesis that Y[i] = b.
If k is a vector or NULL, PV has an attribute "opt.k", which is a matrix, and opt.k[i,b] is the best k for observation NewX[i,] and class b (see section 'Details'). opt.k[i,b] is used to compute the p-value for observation NewX[i,] and class b.
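To illustrate how the returned object can be read, the sketch below rebuilds by hand a matrix with the same values as the output printed in the Examples section, then extracts the "opt.k" attribute and lists, for each new observation, the classes whose null hypothesis is not rejected. The level `alpha = 0.05` and the variable names are assumptions made for this illustration.

```r
# P-values copied from the printed output in the Examples section.
classes <- c("setosa", "versicolor", "virginica")
PV <- matrix(c(0.90, 0.02, 0.02,
               0.02, 0.96, 0.02,
               0.02, 0.04, 0.20),
             nrow = 3, byrow = TRUE, dimnames = list(NULL, classes))
attr(PV, "opt.k") <- matrix(15, nrow = 3, ncol = 3,
                            dimnames = list(NULL, classes))

opt_k <- attr(PV, "opt.k")   # best k for each observation and class

alpha <- 0.05                # hypothetical test level
# For each new observation, keep the classes with PV[i, b] > alpha.
regions <- lapply(seq_len(nrow(PV)), function(i) classes[PV[i, ] > alpha])
```

With these values every observation retains exactly one class at the 5% level; a lower level could retain several classes, and a higher level could retain none.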
Niki Zumbrunnen niki.zumbrunnen@gmail.com
Lutz Dümbgen lutz.duembgen@stat.unibe.ch
www.imsv.unibe.ch/duembgen/index_ger.html
Zumbrunnen N. and Dümbgen L. (2017) pvclass: An R Package for p Values for Classification. Journal of Statistical Software 78(4), 1–19. doi:10.18637/jss.v078.i04
Dümbgen L., Igl B.-W. and Munk A. (2008) P-Values for Classification. Electronic Journal of Statistics 2, 468–493, available at http://dx.doi.org/10.1214/08-EJS245.
Zumbrunnen N. (2014) P-Values for Classification – Computational Aspects and Asymptotics. Ph.D. thesis, University of Bern, available at http://boris.unibe.ch/id/eprint/53585.
pvs, pvs.gaussian, pvs.wnn, pvs.logreg
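A call of roughly the following shape produces a p-value matrix with an "opt.k" attribute like the one printed below. This is a sketch under stated assumptions: it requires the pvclass package to be installed, it uses only the arguments documented above (NewX, X, Y, k), and the choice of held-out rows and of the candidate values for k is illustrative, not necessarily the one behind the printed output.

```r
# Assumption: hold out one iris flower per species as new observations.
hold_out <- c(50, 100, 150)
X    <- iris[-hold_out, 1:4]   # training observations
Y    <- iris[-hold_out, 5]     # training class labels
NewX <- iris[hold_out, 1:4]    # new observations to classify

# Run only if pvclass is available; k is given as a vector, so the
# function searches for the best k among the candidates.
if (requireNamespace("pvclass", quietly = TRUE)) {
  PV <- pvclass::pvs.knn(NewX, X, Y, k = c(5, 10, 15))
  print(PV)
}
```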
     setosa versicolor virginica
[1,]   0.90       0.02      0.02
[2,]   0.02       0.96      0.02
[3,]   0.02       0.04      0.20
attr(,"opt.k")
     setosa versicolor virginica
[1,]     15         15        15
[2,]     15         15        15
[3,]     15         15        15