A reasonable question is: how accurate is the model we have created? This function provides information towards answering that question. The metrics reported depend primarily on whether the model encodes categorical or continuous data; see below for details.
pred
the predicted classes.
valid
the validation classes.
weights
(optional) sampling weights for each point; if provided, each cell of the confusion matrix is the sum of the sample weights in that bin rather than the count of points in it (the unweighted case is equivalent to all weights being one).
classNames
(optional) a character vector of class names (strings). It is necessary because grouped models store no
meaningful class names internally. classNames will be subsetted to include only the levels that are actually present
in the model. See
...
any extra parameters; not currently used.
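The effect of the weights argument on the confusion matrix can be illustrated with base R alone. This is a minimal sketch on made-up data, not the package's internal code; the toy vectors and the names countMatrix and weightMatrix are assumptions introduced for illustration.

```r
# Hypothetical toy data; not taken from the package.
pred    <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b"))
valid   <- factor(c("a", "b", "b", "b", "a"), levels = c("a", "b"))
weights <- c(2, 1, 1, 0.5, 1)

# Unweighted: each cell counts points (equivalent to all weights being one).
countMatrix <- table(pred = pred, valid = valid)

# Weighted: each cell sums the weights of the points that fall in it.
weightMatrix <- xtabs(weights ~ pred + valid)
```

Predicted classes are kept as rows and validation classes as columns, matching the layout described for confMatrix.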
The metrics returned for categorical models are:
confMatrix
a named list of confusion matrices for each model in the list. Each confusion matrix is a data.frame with
predicted values as rows and actual values as columns.
userAcc
a data frame in which each column represents the user accuracy for a given model, and the rows are the accuracies for
each input class. The final row is the overall user accuracy for that model. User accuracy, the complement of so-called commission
error, is the proportion of pixels that are what they were predicted to be; that is, the number of correctly identified sites
divided by the number of sites that the model predicted to be in that class.
prodAcc
a data frame in which each column represents the producer accuracy for a given model, and the rows are the accuracies
for each input class. The final row is the overall producer accuracy for that model. Producer accuracy, the complement of so-called
omission error, is the proportion of pixels of a class that are labelled correctly; that is, the number of correctly identified sites
divided by the number of sites that are actually of that class.
classLevelAcc
a vector of class-level accuracies, another accuracy statistic distinct from producer and
user accuracy. It is based on omission and commission errors (see producer and user accuracies), such that N_omission is the
number of incorrectly classified points in a column of the confusion matrix, and N_commission is the number of
incorrectly classified points in a row of the confusion matrix. Given that, class-level accuracy can be computed as:
Acc_class.level = N_correct / (N_correct + N_omission + N_commission)
kappa
a vector of kappa values for each model type. The Κ-statistic is a measure of how much better this model
predicts output classes than would be done by chance alone. It is computed as the observed accuracy minus the accuracy expected
by chance, divided by one minus the accuracy expected by chance:
K = (Acc_obs - Acc_chance) / (1 - Acc_chance)
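The categorical metrics above can be worked through by hand on a small confusion matrix. The following is a sketch under stated assumptions rather than the function's actual implementation: the matrix values are invented, the variable names are illustrative, and the chance accuracy is estimated from the row and column marginals (the usual Cohen's kappa convention), which this page does not spell out.

```r
# Hypothetical confusion matrix: predicted classes as rows, actual as columns.
confMatrix <- matrix(c(40,  5,  5,
                        3, 30,  7,
                        2,  5, 53),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(pred  = c("A", "B", "C"),
                                     valid = c("A", "B", "C")))

nCorrect    <- diag(confMatrix)
nCommission <- rowSums(confMatrix) - nCorrect  # predicted as the class, but wrong
nOmission   <- colSums(confMatrix) - nCorrect  # actually the class, but missed

userAcc       <- nCorrect / rowSums(confMatrix)
prodAcc       <- nCorrect / colSums(confMatrix)
classLevelAcc <- nCorrect / (nCorrect + nOmission + nCommission)

# Kappa: observed accuracy versus the accuracy expected by chance.
# Assumption: chance agreement comes from the row and column marginals.
n         <- sum(confMatrix)
accObs    <- sum(nCorrect) / n
accChance <- sum(rowSums(confMatrix) * colSums(confMatrix)) / n^2
kappaStat <- (accObs - accChance) / (1 - accChance)
```

For class A in this toy matrix, user accuracy is 40/50, producer accuracy is 40/45, and class-level accuracy is 40/55, since class-level accuracy penalises both kinds of error at once.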
The metrics returned for continuous models are:
overallAcc
is the overall r-squared, computed by 1 - SS.residual/SS.total
mse
is the raw mean squared error, computed by SS.residual/N
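The two continuous metrics can be sketched in a few lines of base R. The data below are invented, and the sketch assumes SS.residual and SS.total denote the residual and total sums of squares and N the number of points; it is an illustration of the formulas, not the package's code.

```r
# Hypothetical observed and fitted values; not taken from the package.
valid <- c(1, 2, 3, 4, 5)
pred  <- c(1.1, 1.9, 3.2, 3.8, 5.0)

ssResidual <- sum((valid - pred)^2)         # residual sum of squares
ssTotal    <- sum((valid - mean(valid))^2)  # total sum of squares

overallAcc <- 1 - ssResidual / ssTotal      # overall r-squared
mse        <- ssResidual / length(valid)    # mean squared error
```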
Note: this function returns different data depending on whether the model is categorical or continuous:
categorical
a six-element named list: confMatrix = confusion matrix, userAcc = user accuracies, prodAcc = producer accuracies, classLevelAcc = class-level accuracies, overallAcc = overall accuracy, kappa = kappa
continuous
a two-element named list: overallAcc = overall accuracy, mse = mean squared error
generateModels for creating models; isCat, isCont for how categorical/continuous type is evaluated.
# Categorical example: model ecoType, then assess the random forest fit
data('siteData')
modelRun <- generateModels(data = siteData,
                           modelTypes = suppModels,
                           x = c('brtns','grnns','wetns','dem','slp','asp','hsd'),
                           y = 'ecoType',
                           grouping = ecoGroup[['domSpecies','transform']])
model <- modelRun$randomForest
mAcc <- classAcc(getFitted(model), getData(model)[['ecoType']],
                 classNames = ecoGroup[['domSpecies','labels']])
str(mAcc, 1)

# Continuous example: model easting, then assess the random forest fit
modelRun <- generateModels(data = siteData,
                           modelTypes = contModels,
                           x = c('brtns','grnns','wetns','dem','slp','asp','hsd'),
                           y = 'easting')
model <- modelRun$randomForest
mAcc <- classAcc(getFitted(model), getData(model)[['easting']])
str(mAcc, 1)