# skill_confusionMatrix: Confusion Matrix Statistics In MazamaScience/PWFSLSmoke: Utilities for Working with Air Quality Monitoring Data

## Description

Measurements of categorical forecast accuracy have a long history in weather forecasting. The standard approach involves making binary classifications (detected/not-detected) of predicted and observed data and combining them in a binary contingency table known as a confusion matrix.
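The binary classification step can be sketched in base R. This is a minimal illustration, assuming numeric forecast and observation vectors and a detection threshold of 35; the data values, the threshold, and the variable names `predictedValues`, `observedValues` and `threshold` are all hypothetical and do not come from the package.

```r
# Hypothetical values; illustrative only, not real monitoring data
predictedValues <- c(12, 40, 55, 8, 36, 20)
observedValues  <- c(10, 38, 30, 9, 50, 41)

# Assumed detection threshold (hypothetical, chosen for illustration)
threshold <- 35

# Binary classification: TRUE = detected, FALSE = not detected
predicted <- predictedValues >= threshold
observed  <- observedValues >= threshold

# Base R cross-tabulation of the two logical vectors (2x2 contingency table)
table(predicted = predicted, observed = observed)
```

The resulting logical vectors are the kind of input the `predicted` and `observed` arguments expect.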

This function creates a `confusion matrix` from predicted and observed values and calculates a wide range of common statistics including:

• TP (true positive)

• FP (false positive) (type I error)

• FN (false negative) (type II error)

• TN (true negative)

• TPRate (true positive rate) = sensitivity = recall = TP / (TP + FN)

• FPRate (false positive rate) = FP / (FP + TN)

• FNRate (false negative rate) = FN / (TP + FN)

• TNRate (true negative rate) = specificity = TN / (FP + TN)

• accuracy = proportionCorrect = (TP + TN) / total

• errorRate = 1 - accuracy = (FP + FN) / total

• falseAlarmRatio = PPV (positive predictive value) = precision = TP / (TP + FP)

• FDR (false discovery rate) = FP / (TP + FP)

• NPV (negative predictive value) = TN / (TN + FN)

• FOR (false omission rate) = FN / (TN + FN)

• f1_score = (2 * TP) / (2 * TP + FP + FN)

• detectionRate = TP / total

• baseRate = detectionPrevalence = (TP + FN) / total

• probForecastOccurance = prevalence = (TP + FP) / total

• balancedAccuracy = (TPRate + TNRate) / 2

• expectedAccuracy = (((TP + FP) * (TP + FN) / total) + ((FP + TN) * (FN + TN) / total)) / total

• heidkeSkill = kappa = (accuracy - expectedAccuracy) / (1 - expectedAccuracy)

• bias = (TP + FP) / (TP + FN)

• hitRate = TP / (TP + FN)

• falseAlarmRate = FP / (FP + TN)

• pierceSkill = ((TP * TN) - (FP * FN)) / ((FP + TN) * (TP + FN))

• criticalSuccess = TP / (TP + FP + FN)

• oddsRatioSkill = yulesQ = ((TP * TN) - (FP * FN)) / ((TP * TN) + (FP * FN))
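To make the formulas concrete, a few of them can be evaluated by hand in base R. This is a sketch under stated assumptions, not the package's implementation: the two small logical vectors are made up for illustration, and only a subset of the listed statistics is computed.

```r
# Small hand-checkable logical vectors (illustrative only)
predicted <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE)
observed  <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE)

# The four cells of the confusion matrix
TP <- sum(predicted & observed)
FP <- sum(predicted & !observed)
FN <- sum(!predicted & observed)
TN <- sum(!predicted & !observed)
total <- TP + FP + FN + TN

# A few of the derived statistics, following the formulas in the list
TPRate <- TP / (TP + FN)
FPRate <- FP / (FP + TN)
accuracy <- (TP + TN) / total
expectedAccuracy <- (((TP + FP) * (TP + FN) / total) +
                     ((FP + TN) * (FN + TN) / total)) / total
heidkeSkill <- (accuracy - expectedAccuracy) / (1 - expectedAccuracy)
pierceSkill <- ((TP * TN) - (FP * FN)) / ((FP + TN) * (TP + FN))
criticalSuccess <- TP / (TP + FP + FN)

# Collect the results in a named vector for easy inspection
round(c(TPRate = TPRate, FPRate = FPRate, accuracy = accuracy,
        heidkeSkill = heidkeSkill, pierceSkill = pierceSkill,
        criticalSuccess = criticalSuccess), 3)
```

With these inputs the cells are TP = 3, FP = 1, FN = 1, TN = 5, giving an accuracy of 0.8 against an expectedAccuracy of 0.52. Note that pierceSkill is algebraically the same as TPRate - FPRate, which is a convenient check when computing it by hand.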

## Usage

```r
skill_confusionMatrix(predicted, observed, FPCost = 1, FNCost = 1,
  lightweight = FALSE)
```

## Arguments

| Argument | Description |
|---|---|
| `predicted` | logical vector of predicted values |
| `observed` | logical vector of observed values |
| `FPCost` | cost associated with false positives (type I error) |
| `FNCost` | cost associated with false negatives (type II error) |
| `lightweight` | flag specifying creation of a return list without derived metrics |

## Value

List containing a table of `confusion matrix` values and a suite of derived metrics.

## Examples

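```r
library(PWFSLSmoke)

# Simulate 1000 random predicted/observed classifications (30% TRUE)
predicted <- sample(c(TRUE, FALSE), 1000, replace = TRUE, prob = c(0.3, 0.7))
observed <- sample(c(TRUE, FALSE), 1000, replace = TRUE, prob = c(0.3, 0.7))

# Build the confusion matrix and derived metrics, then print them
cm <- skill_confusionMatrix(predicted, observed)
print(cm)
```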