bivar: Bias-Variance Decomposition of the Misclassification Rate



Computes the bias-variance decomposition of the misclassification rate according to the approaches of James (2003) and Domingos (2000).

bivar(y, ...)

## S3 method for class 'data.frame'
bivar(y, ...)

## Default S3 method:
bivar(y, grouping, ybayes, posterior, ybest = NULL, ...)

`y`	Predicted class labels on a test data set, based on multiple training data sets. For the default method, `y` is expected to be a `list` in which each element contains the predictions for a single test observation. The list elements are expected to be `factor`s with the same levels as `grouping`. `y` can also be a `data.frame` whose rows correspond to test observations and whose columns correspond to the predictions for these observations based on the different training sets.
`grouping`	Vector of true class labels (a `factor`).
`ybayes`	(Optional.) Bayes prediction (a `factor` with the same levels as `grouping`). Ignored if `posterior` is specified, since `ybayes` can then be calculated from the posterior probabilities.
`posterior`	(Optional.) Matrix of posterior probabilities, either known or estimated. It is assumed that the columns are ordered according to the factor levels of `grouping`.
`ybest`	(Optional.) Prediction from the best fitting model on the whole population (a `factor` with the same levels as `grouping`). Used to calculate model bias and estimation bias as well as the systematic model effect and the systematic estimation effect.
`...`	Currently unused.
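
The two accepted shapes of `y` can be illustrated with a small base-R sketch. The data below are made up, and the commented-out call assumes that the package is installed under the name `biVar` and exports `bivar`; treat it as an illustration of the argument layout rather than a tested call.

```r
# Toy setup (made-up data): 4 test observations, predictions from 3 training sets.
lev <- c("a", "b")
grouping <- factor(c("a", "a", "b", "b"), levels = lev)

# data.frame form: one row per test observation, one column per training set.
y_df <- data.frame(
  train1 = factor(c("a", "b", "b", "a"), levels = lev),
  train2 = factor(c("a", "a", "b", "a"), levels = lev),
  train3 = factor(c("b", "b", "b", "b"), levels = lev)
)

# list form: one factor per test observation, holding all predictions for it
# and sharing the levels of `grouping`.
y_list <- lapply(seq_len(nrow(y_df)), function(i) {
  row_preds <- unname(unlist(lapply(y_df[i, ], as.character)))
  factor(row_preds, levels = lev)
})

# Assumed usage, not run here:
# res <- biVar::bivar(y = y_list, grouping = grouping)
```

Both objects carry the same information; the list form additionally allows a different number of predictions per test observation, which is what the `size` column of the result reports.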

If `posterior` is specified, `ybayes` is calculated from the posterior probabilities, and the posteriors are used to calculate or estimate the noise, the misclassification rate, the systematic effect and the variance effect; any `ybayes` supplied alongside `posterior` is ignored. If only `ybayes` is specified, the empirical distribution of `ybayes` is inferred and used to calculate the quantities of interest. If neither `posterior` nor `ybayes` is specified, the noise level is assumed to be zero and the remaining quantities are calculated under this assumption.
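
As a rough illustration of the first case, the Bayes prediction and the pointwise noise under zero-one loss can be derived from a posterior matrix as sketched below. The data are made up, and the tie-breaking rule (first maximum) is an assumption; the package's internal computation may differ in such details.

```r
lev <- c("a", "b", "c")

# Made-up posterior probabilities: rows are test observations,
# columns are ordered according to the factor levels.
posterior <- matrix(c(0.7, 0.2, 0.1,
                      0.1, 0.6, 0.3,
                      0.2, 0.3, 0.5,
                      0.4, 0.5, 0.1),
                    ncol = 3, byrow = TRUE, dimnames = list(NULL, lev))

# Bayes prediction: the class with the largest posterior probability in each row
# (ties resolved in favour of the first level, an assumption of this sketch).
ybayes <- factor(lev[max.col(posterior, ties.method = "first")], levels = lev)

# Pointwise noise under zero-one loss: the probability that even the
# Bayes prediction is wrong, i.e. one minus the largest posterior.
noise <- 1 - apply(posterior, 1, max)

# Bayes error rate estimated on these test observations.
bayes_error <- mean(noise)
```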

A data.frame with the following columns:

`error`	Estimated misclassification probability.
`noise`	(Only if `ybayes` or `posterior` was specified.) Noise or Bayes error rate.
`bias`	Bias.
`model.bias`	(Only if `ybest` was specified.) Model bias.
`estimation.bias`	(Only if `ybest` was specified.) Estimation bias.
`variance`	Variance.
`unbiased.variance`	Unbiased variance.
`biased.variance`	Biased variance.
`net.variance`	Point-wise net variance.
`systematic.effect`	Systematic effect.
`systematic.model.effect`	(Only if `ybest` was specified.) Systematic model effect.
`systematic.estimation.effect`	(Only if `ybest` was specified.) Systematic estimation effect.
`variance.effect`	Variance effect.
`ymain`	Main prediction.
`ybayes`	(Only if `ybayes` or `posterior` was specified.) The optimal prediction.
`size`	Number of predictions made for each test observation (a numeric vector with one entry per test observation).
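
To make the relation between some of these columns concrete, the following base-R sketch computes them by hand for a two-class toy problem with zero noise, so that the true label plays the role of the optimal prediction. It follows the textbook definitions of Domingos (2000) and James (2003) for this special case; the variable names mirror the columns above, but the sketch is not the package's implementation, and the multi-class and noisy cases need additional terms.

```r
lev <- c("a", "b")
grouping <- factor(c("a", "a", "b", "b"), levels = lev)   # true labels (made up)
y_df <- data.frame(
  train1 = factor(c("a", "b", "b", "a"), levels = lev),
  train2 = factor(c("a", "a", "b", "a"), levels = lev),
  train3 = factor(c("b", "b", "b", "b"), levels = lev)
)
preds <- as.matrix(y_df)          # character matrix: rows = test observations
truth <- as.character(grouping)

# Main prediction: the most frequent predicted class per test observation.
ymain <- unname(apply(preds, 1, function(p) {
  names(which.max(table(factor(p, levels = lev))))
}))

res <- data.frame(
  error    = rowMeans(preds != truth),   # share of wrong predictions
  bias     = as.numeric(ymain != truth), # 1 if the main prediction is wrong
  variance = rowMeans(preds != ymain)    # share deviating from the main prediction
)
# Domingos: variance helps or hurts depending on whether the point is biased.
res$unbiased.variance <- res$variance * (1 - res$bias)
res$biased.variance   <- res$variance * res$bias
res$net.variance      <- res$unbiased.variance - res$biased.variance
# James: with zero noise the systematic effect equals the bias, and the
# variance effect is the error change caused by deviating from ymain.
res$systematic.effect <- res$bias
res$variance.effect   <- res$error - res$systematic.effect
res$ymain <- factor(ymain, levels = lev)
res$size  <- ncol(preds)

colMeans(res[c("error", "bias", "net.variance")])
```

In this setting the pointwise identity `error = bias + net.variance` holds for every row, and averaging over the test observations gives the corresponding decomposition of the overall misclassification rate.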

Domingos, P. (2000). A unified bias-variance decomposition for zero-one and squared loss. In Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, pages 564–569. AAAI Press / The MIT Press.

James, G. M. (2003). Variance and bias for general loss functions. Machine Learning, 51(2), 115–135.

schiffner/biVar documentation built on May 29, 2019, 3:39 p.m.