Using predictions of a given model produced by predict.CoreModel
and the correct labels,
computes statistics evaluating the quality of the model.
model 
The model structure as returned by CoreModel.
correctClass 
A vector of correct class labels for a classification problem, or of function values for a regression problem. 
predictedClass 
A vector of predicted class labels for a classification problem, or of function values for a regression problem. 
predictedProb 
An optional matrix of predicted class probabilities for classification. 
costMatrix 
An optional cost matrix providing nonuniform costs for classification problems. 
priorClProb 
If 
avgTrainPrediction 
If 
beta 
For two class problems 
The function uses the model structure as returned by CoreModel, and predictedClass and predictedProb returned by predict.CoreModel. Predicted values are compared with true values and some statistics are computed measuring the quality of predictions.
In classification only one of predictedClass and predictedProb can be NULL (one of them is computed from the other under the assumption that the class label is assigned to the most probable class).
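To make the "most probable class" rule concrete, here is a minimal base R sketch. The probability matrix and class names are hypothetical, and the reverse step (turning labels into 0/1 probabilities) is an assumption about one plausible conversion, not a description of what modelEval does internally.

```r
# Hypothetical probability matrix: one row per instance, one column per class.
classLevels <- c("setosa", "versicolor", "virginica")
predictedProb <- matrix(c(0.7, 0.2, 0.1,
                          0.1, 0.3, 0.6,
                          0.2, 0.5, 0.3),
                        ncol = 3, byrow = TRUE,
                        dimnames = list(NULL, classLevels))

# Assign each instance to its most probable class (ties go to the first class).
predictedClass <- factor(classLevels[max.col(predictedProb, ties.method = "first")],
                         levels = classLevels)

# Assumed reverse direction: put probability 1 on the predicted class.
probFromClass <- diag(length(classLevels))[as.integer(predictedClass), , drop = FALSE]
colnames(probFromClass) <- classLevels
```

With labels and probabilities derived from each other this way, measures based on class labels stay consistent, while probability-based measures lose the information carried by the original probabilities.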
Some of the returned statistics are defined only for two class problems, for which the
confusion matrix specifying the number of instances of true/predicted class is
defined as follows,
true/predicted class   positive              negative
positive               true positive (TP)    false negative (FN)
negative               false positive (FP)   true negative (TN)
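The four cells of this table can be obtained with base R's table(). The sketch below uses made-up labels and assumes that "pos" denotes the positive class; the variable names are illustrative only.

```r
# Hypothetical two-class labels; "pos" is treated as the positive class.
lev <- c("pos", "neg")
correctClass   <- factor(c("pos", "pos", "pos", "neg", "neg", "neg", "neg", "pos"),
                         levels = lev)
predictedClass <- factor(c("pos", "neg", "pos", "neg", "pos", "neg", "neg", "pos"),
                         levels = lev)

# Rows are true classes, columns are predicted classes.
confusion <- table(true = correctClass, predicted = predictedClass)

TP <- confusion["pos", "pos"]  # positive instances predicted as positive
FN <- confusion["pos", "neg"]  # positive instances predicted as negative
FP <- confusion["neg", "pos"]  # negative instances predicted as positive
TN <- confusion["neg", "neg"]  # negative instances predicted as negative
```

Fixing the factor levels explicitly keeps the row and column order stable even when a class is missing from the predictions.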
An optional cost matrix can provide nonuniform costs for classification problems; for regression problems this parameter is ignored. The costs can differ from the ones used for building the model in CoreModel and for prediction with the model in predict.CoreModel.
The format of the matrix is costMatrix(true_class, predicted_class). If no costs are supplied, uniform costs are assumed, i.e., costMatrix(i, i) = 0 and costMatrix(i, j) = 1 for i not equal to j. See the example below.
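The following base R sketch shows how a cost matrix in the costMatrix(true_class, predicted_class) layout interacts with a confusion matrix. The counts are invented, and the definition of average cost used here (total cost divided by the number of instances) is an assumption for illustration, not taken from the CORElearn sources.

```r
# Hypothetical confusion matrix for three classes:
# rows are true classes, columns are predicted classes.
confusion <- matrix(c(50,  0,  0,
                       0, 47,  3,
                       0,  4, 46),
                    nrow = 3, byrow = TRUE)

# Uniform costs: 0 on the diagonal, 1 elsewhere.
noClasses <- nrow(confusion)
uniformCost <- 1 - diag(noClasses)

# Nonuniform costs: misclassifying a true class 3 instance costs 5.
costMatrix <- uniformCost
costMatrix[3, 1] <- costMatrix[3, 2] <- 5

# Assumed definition: total cost divided by the number of instances.
avgCostUniform    <- sum(confusion * uniformCost) / sum(confusion)
avgCostNonuniform <- sum(confusion * costMatrix) / sum(confusion)
```

Under this assumed definition, the average cost with uniform costs is simply the misclassification rate, so nonuniform costs only change the result when the expensive errors actually occur.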
If a non-CORElearn model is evaluated, one should set model=NULL. In case of classification, a vector of prior class probabilities priorClProb shall be provided, and in case of regression, avgTrainPrediction shall be the mean of prediction values (estimated on, e.g., a training set).
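As a sketch of how these two inputs might be prepared for a model built outside CORElearn, the base R code below estimates prior class probabilities as relative class frequencies and the average prediction as the mean of the training targets. The data are hypothetical, and the modelEval calls are shown only as comments with placeholder vectors (testClass, predClass, testTarget, predTarget) that are not defined here.

```r
# Hypothetical training labels and targets; the model itself is not from CORElearn.
trainClass  <- factor(c("a", "a", "b", "c", "a", "b"))
trainTarget <- c(2.0, 3.5, 1.5, 4.0)

# Prior class probabilities, estimated as relative class frequencies.
priorClProb <- as.vector(table(trainClass) / length(trainClass))
names(priorClProb) <- levels(trainClass)

# Mean target value on the training set, for regression.
avgTrainPrediction <- mean(trainTarget)

# The calls would then look roughly like this (not run here):
# modelEval(model = NULL, correctClass = testClass, predictedClass = predClass,
#           priorClProb = priorClProb)
# modelEval(model = NULL, correctClass = testTarget, predictedClass = predTarget,
#           avgTrainPrediction = avgTrainPrediction)
```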
For a classification problem, the function returns a list with the components
accuracy 
classification accuracy; for two class problems this equals accuracy = (TP+TN) / (TP+FN+FP+TN) 
averageCost 
average classification cost 
informationScore 
information score statistics measuring information contents in the predicted probabilities 
AUC 
Area under the ROC curve 
predictionMatrix 
matrix of misclassifications (also called confusion matrix) 
sensitivity 
sensitivity for two class problems (also called accuracy of the positive class, i.e., acc+, or true positive rate), sensitivity=TP/(TP+FN) 
specificity 
specificity for two class problems (also called accuracy of the negative class, i.e., acc-, or true negative rate), specificity=TN/(TN+FP) 
brierScore 
Brier score of predicted probabilities (the original Brier's definition which scores all the classes not only the correct one) 
kappa 
Cohen's kappa statistic measuring randomness of the predictions; for perfect predictions kappa=1, for completely random predictions kappa=0 
precision 
precision for two class problems precision=TP/(TP+FP) 
recall 
recall for two class problems (the same as sensitivity) 
Fmeasure 
F-measure giving a weighted score of precision and recall for two class problems, F = (1+beta^2)*recall*precision / (beta^2 * recall + precision) 
Gmean 
geometric mean of positive and negative accuracy, G=sqrt(sensitivity * specificity) 
KS 
Kolmogorov-Smirnov statistic defined for binary classification problems; it reports the distance between the probability distributions of the positive class for positive and negative instances, see (Hand, 2005). Value 0 means no separation and value 1 means perfect separation. KS = max_t |TPR(t) - FPR(t)|, see the definitions of TPR and FPR below 
TPR 
true positive rate TPR = TP / (TP+FN) at maximal value of 
FPR 
false positive rate FPR = FP / (FP+TN) at maximal value of 
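The two class formulas listed above can be checked by hand with a few lines of base R. The counts and scores below are hypothetical. The KS part is a sketch that assumes the statistic is the largest absolute difference between TPR and FPR over all thresholds on the predicted probability of the positive class; it is not the package's implementation.

```r
# Hypothetical confusion-matrix counts for a two-class problem.
TP <- 40; FN <- 10; FP <- 5; TN <- 45
beta <- 1

accuracy    <- (TP + TN) / (TP + FN + FP + TN)
sensitivity <- TP / (TP + FN)   # also recall, true positive rate
specificity <- TN / (TN + FP)   # true negative rate
precision   <- TP / (TP + FP)
recall      <- sensitivity
Fmeasure    <- (1 + beta^2) * recall * precision / (beta^2 * recall + precision)
Gmean       <- sqrt(sensitivity * specificity)

# KS sketch: scan thresholds on the predicted probability of the positive
# class and take the largest gap between TPR and FPR.
isPositive <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE)
probPos    <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1)
thresholds <- sort(unique(probPos))
tpr <- sapply(thresholds, function(thr) mean(probPos[isPositive] >= thr))
fpr <- sapply(thresholds, function(thr) mean(probPos[!isPositive] >= thr))
KS <- max(abs(tpr - fpr))
```

With beta = 1 the F-measure reduces to 2*TP / (2*TP + FP + FN), which is a convenient sanity check on the general formula.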
For a regression problem, the returned list has the components
MSE 
square root of Mean Squared Error 
RMSE 
Relative Mean Squared Error 
MAE 
Mean Absolute Error 
RMAE 
Relative Mean Absolute Error 
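For orientation, the base R sketch below computes squared and absolute errors together with relative versions of both. The data are hypothetical, and the normalization (dividing by the error of always predicting avgTrainPrediction) is an assumption about what "relative" means here. The variable names are deliberately different from the returned component names, since the exact definitions used by the package are not restated in this page.

```r
# Hypothetical true and predicted values for a regression problem.
correct   <- c(3.0, 5.0, 2.0, 8.0)
predicted <- c(2.5, 5.5, 2.0, 7.0)
avgTrainPrediction <- 4.0   # hypothetical mean target on the training set

# Plain mean squared and mean absolute errors.
meanSqErr  <- mean((correct - predicted)^2)
meanAbsErr <- mean(abs(correct - predicted))

# Assumed normalization: error of always predicting avgTrainPrediction.
relSqErr  <- meanSqErr / mean((correct - avgTrainPrediction)^2)
relAbsErr <- meanAbsErr / mean(abs(correct - avgTrainPrediction))
```

Under this assumed normalization, a relative error below 1 means the model does better than the constant predictor, and a value above 1 means it does worse.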
Marko Robnik-Sikonja
Igor Kononenko, Matjaz Kukar: Machine Learning and Data Mining: Introduction to Principles and Algorithms. Horwood, 2007.
David J. Hand: Good practice in retail credit scorecard assessment. Journal of the Operational Research Society, 56:1109-1117, 2005.
CORElearn, CoreModel, predict.CoreModel.
# use iris data
library(CORElearn)

# build random forests model with certain parameters
model <- CoreModel(Species ~ ., iris, model="rf",
                   selectionEstimator="MDL", minNodeWeightRF=5,
                   rfNoTrees=100, maxThreads=1)

# prediction with node distribution
pred <- predict(model, iris, rfPredictClass=FALSE)

# Model evaluation
mEval <- modelEval(model, iris[["Species"]], pred$class, pred$prob)
print(mEval)

# use nonuniform cost matrix
noClasses <- length(levels(iris[["Species"]]))
costMatrix <- 1 - diag(noClasses)
costMatrix[3,1] <- costMatrix[3,2] <- 5 # assume class 3 is more valuable
mEvalCost <- modelEval(model, iris[["Species"]], pred$class, pred$prob,
                       costMatrix=costMatrix)
print(mEvalCost)

destroyModels(model) # clean up
