cv.nfeaturesLDA: Cross-validation to find the optimum number of features...
In animation: A Gallery of Animations in Statistics and Utilities to Create Animations


This function provides an illustration of the process of finding the optimum number of variables using k-fold cross-validation in a linear discriminant analysis (LDA).

cv.nfeaturesLDA(
  data = matrix(rnorm(600), 60),
  cl = gl(3, 20),
  k = 5,
  cex.rg = c(0.5, 3),
  col.av = c("blue", "red"),
  ...
)

`data`	a data matrix containing the predictors in columns
`cl`	a factor indicating the classification of the rows of `data`
`k`	the number of folds
`cex.rg`	the range of magnification to be applied to the points in the plot
`col.av`	the two colors used to respectively denote rates of correct predictions in the i-th fold and the average rates for all k folds
`...`	arguments passed to `points` to draw the points which denote the correct rate

For a classification problem, we usually want to use as few variables as possible, because high dimensionality brings difficulties.

The selection procedure is as follows:

Split the whole data randomly into k folds:
- For the number of features g = 1, 2, ..., gmax, choose the g features that have the largest discriminatory power (measured by the F-statistic in ANOVA):
  - For the fold i (i = 1, 2, ..., k):
    - Train an LDA model without the i-th fold data, and predict on the i-th fold to get the proportion of correct predictions p[gi];
  - Average the k proportions to get the correct rate p[g];
Determine the optimum number of features as the g with the largest p[g].

Note that gmax is set by `ani.options('nmax')` (i.e. the maximum number of features we want to choose).
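The procedure above can be sketched without any plotting as follows. This is a minimal illustration, not the package's actual implementation: the variable names (`fold`, `fstat`, `keep`, `gmax`) are made up for this example, `gmax` is hard-coded instead of being read from `ani.options('nmax')`, and it assumes that the features are ranked on the training rows of each fold. It relies on `lda()` from the MASS package.

```r
library(MASS)  # provides lda()

set.seed(1)
data <- matrix(rnorm(600), 60)  # same shape as the default `data` argument
cl   <- gl(3, 20)               # same as the default `cl` argument
k    <- 5
gmax <- 5                       # hypothetical cap on the number of features

n <- nrow(data)
# Randomly assign each row to one of k folds of (nearly) equal size
fold <- sample(rep(seq_len(k), length.out = n))

# accuracy[i, g]: proportion of correct predictions on fold i with g features
accuracy <- matrix(NA_real_, nrow = k, ncol = gmax)

for (g in seq_len(gmax)) {
  for (i in seq_len(k)) {
    test <- fold == i
    grp  <- cl[!test]
    # One-way ANOVA F-statistic of every predictor, training rows only
    # (assumption: ranking is redone within each fold)
    fstat <- apply(data[!test, , drop = FALSE], 2, function(x) {
      anova(lm(x ~ grp))[["F value"]][1]
    })
    # Keep the g predictors with the largest F-statistics
    keep <- order(fstat, decreasing = TRUE)[seq_len(g)]
    fit  <- lda(data[!test, keep, drop = FALSE], grp)
    pred <- predict(fit, data[test, keep, drop = FALSE])$class
    accuracy[i, g] <- mean(pred == cl[test])
  }
}

p       <- colMeans(accuracy)  # average correct rate p[g] over the k folds
optimum <- which.max(p)        # g with the largest average correct rate
```

Because the simulated predictors are pure noise, the correct rates here should hover around chance level; the sketch only shows the mechanics of the loop.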

A list containing

`accuracy`	a matrix in which the element in the i-th row and j-th column is the rate of correct predictions based on LDA, i.e. build an LDA model with j variables and predict with data in the i-th fold (the test set)
`optimum`	the optimum number of features based on the cross-validation
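A short sketch of how the returned list might be inspected, assuming the animation package is installed. The data are simulated noise, so the chosen number of features is not meaningful here; the values passed to `ani.options()` (`nmax = 5`, `interval = 0`) are arbitrary choices for this example.

```r
library(animation)

set.seed(42)
x  <- matrix(rnorm(600), 60)  # 60 observations, 10 candidate predictors
cl <- gl(3, 20)               # three classes of 20 rows each

# Cap the number of features at 5 and skip the pause between frames;
# ani.options() returns the previous settings so they can be restored.
oopt <- ani.options(nmax = 5, interval = 0)
res  <- cv.nfeaturesLDA(data = x, cl = cl, k = 5)
ani.options(oopt)

dim(res$accuracy)       # folds in rows, numbers of features in columns
colMeans(res$accuracy)  # average correct rate for each number of features
res$optimum             # number of features chosen by cross-validation
```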

Yihui Xie <https://yihui.org/>

Examples at https://yihui.org/animation/example/cv-nfeatureslda/

Maindonald J, Braun J (2007). Data Analysis and Graphics Using R - An Example-Based Approach. Cambridge University Press, 2nd edition. pp. 400

kfcv, cv.ani, lda

animation documentation built on Oct. 7, 2021, 9:18 a.m.
