diagnosis: Diagnostic test accuracy evaluation
In DiagnosisMed: Diagnostic test accuracy evaluation for health professionals


diagnosis estimates sensitivity, specificity, predictive values, likelihood ratios, area under the ROC curve and other validity measures for binary diagnostic test evaluation. It accepts as input either columns from a dataset or vectors, a 2 x 2 table, or numbers representing true positives, false negatives, false positives and true negatives. The plot method for diagnosis objects draws a simple nomogram or a ROC plot. The value returned by diagnosis is a list that includes a results table, which allows the output to be easily exported to a spreadsheet.

diagnosis(tab = NULL, ref = NULL, test = NULL, TN = NULL, FN = NULL,
  FP = NULL, TP = NULL, reference.name = NULL, index.name = NULL,
  CL = 0.95, CL.type = c("wilson", "exact", "approximate"))

## S3 method for class 'diagnosis'
plot(x, type = c("nomogram", "roc"), ...,
  xlab = "Pre-test probability", ylab = "Post-test probability",
  lines.arg = list(col = "red", lwd = 3), grid = FALSE, auto.shade = TRUE,
  shade.arg = list(border = par("bg"), col = gray(0.8)), auto.legend = TRUE)

## S3 method for class 'diagnosis'
print(x, digits = 3, ...)

`tab`	A 2 x 2 cross table representing the counts of agreement and disagreement of the reference standard and the index test.
`ref`	The reference standard. A column in a data frame or a vector indicating the classification by the reference test. The reference standard should be coded either as 0 (absence of the condition) or 1 (presence of the condition). If it is formatted as character, the function will convert it to a factor, then use the reference factor level as the absence of the condition.
`test`	The index test or test under evaluation. A column in a dataset or a vector indicating the test results. The index test should be coded either as 0 (absence of the condition) or 1 (presence of the condition). If it is formatted as character, the function will convert it to a factor, then use the reference factor level as the absence of the condition.
`TP, FN, FP, TN`	Numbers representing True Positives, False Negatives, False Positives, and True Negatives from a 2 x 2 table.
`reference.name, index.name`	The names of the reference and index tests. If the dataset has labels, one may pass the labels to these arguments (see example). If one defines `dimnames` for the table or matrix, the names of the dimensions will override these arguments (see example). These arguments may be important, as other functions that use `diagnosis` output require these names.
`CL`	Confidence level for the confidence intervals. Must be a numeric value between 0 and 1. Default is 0.95.
`CL.type`	Type of confidence limit. Accepted values are "wilson", "exact", "approximate". See `binom.CI`.
`x`	For `plot` and `print` functions, `x` is an object assigned with diagnosis output.
`type`	For `plot`, `type` sets which plot will be returned. "nomogram" or "roc" are possible values. If `type = "roc"`, don't forget to set the correct `xlab` and `ylab` arguments.
`...`	Other options passed to `print` or `plot.default`.
`xlab, ylab`	Character strings giving the labels of the horizontal and vertical axes. These will be passed to `plot.default`. The default values are `xlab = "Pre-test probability"` and `ylab = "Post-test probability"`, but these make sense only if `type = "nomogram"`. If `type = "roc"`, one must set by hand `xlab = "1 - Specificity"` and `ylab = "Sensitivity"`.
`lines.arg`	A `list` of arguments to be passed to `lines`.
`grid`	Logical. If `TRUE`, it calls the `grid` function with the default arguments.
`auto.shade`	Logical. If `TRUE`, it calls the `polygon` function with the arguments in the `shade.arg`. It represents the confidence band of the Positive Likelihood Ratio of the test.
`shade.arg`	A `list` of arguments to be passed to `polygon`. See `auto.shade`.
`auto.legend`	Logical. If `TRUE`, it makes a legend of the graph.
`digits`	The number of decimals that will be passed to `print`.
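
The effect of `CL` and `CL.type` can be illustrated with base R alone. The sketch below is not the package's `binom.CI` code. It assumes that "wilson" corresponds to the Wilson score interval (which `prop.test` returns when `correct = FALSE`), "exact" to the Clopper-Pearson interval from `binom.test`, and "approximate" to the usual normal (Wald) approximation. The counts are borrowed from the examples on this page.

```r
# Sensitivity = TP / (TP + FN), using TP = 72 and FN = 18
x  <- 72
n  <- 72 + 18
CL <- 0.95

# Wilson score interval (prop.test without continuity correction)
wilson <- as.numeric(prop.test(x, n, conf.level = CL, correct = FALSE)$conf.int)

# Exact (Clopper-Pearson) interval
exact <- as.numeric(binom.test(x, n, conf.level = CL)$conf.int)

# Normal (Wald) approximation; for extreme proportions it can
# produce limits outside [0, 1], unlike the two intervals above
p    <- x / n
z    <- qnorm(1 - (1 - CL) / 2)
wald <- p + c(-1, 1) * z * sqrt(p * (1 - p) / n)

# Compare the three intervals side by side
round(rbind(wilson, exact, wald), 3)
```

The three intervals usually differ most when the proportion is close to 0 or 1 or when the sample is small.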

Confidence limits for sensitivity, specificity, predictive values and accuracy rely on the binomial distribution, which, unlike the normal approximation or asymptotic theory, does not give results outside [0, 1]. Confidence limits for the DOR, likelihood ratios and Youden J index rely on the normal approximation (Wald method for likelihood ratios). The AUC (area under the ROC curve) is estimated by the trapezoidal method (see below). See the examples to check how results can be exported to a document or to a spreadsheet. If one decides to input a table, the expected format is as follows:

	TN	FN
	FP	TP
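
A table in this layout can be built with base R. The sketch below assumes, as the examples on this page suggest, that rows hold the index test and columns hold the reference standard, each ordered absent then present. The vectors `ref` and `test` are made up for illustration.

```r
# Hypothetical 0/1 vectors: 1 = condition present / test positive
ref  <- c(rep(1, 90), rep(0, 174))
test <- c(rep(1, 72), rep(0, 18), rep(1, 25), rep(0, 149))

# Rows = index test, columns = reference standard
tab <- table(Test = test, Reference = ref)
tab

# Cell positions in the expected layout
TN <- tab[1, 1]; FN <- tab[1, 2]
FP <- tab[2, 1]; TP <- tab[2, 2]
c(TN = TN, FN = FN, FP = FP, TP = TP)
```

Checking the four cells by name before passing the table on is a cheap way to catch a transposed table, which would silently swap false positives and false negatives.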

plot.diagnosis draws a very simple nomogram, like many examples from Wikipedia http://en.wikipedia.org/wiki/Nomogram. This is not a generic nomogram as shown in many evidence-based medicine texts, because it shows only pre-test and post-test variations with a fixed positive likelihood ratio estimated from the data. This likelihood ratio is a statistic from an object created by the diagnosis function. Its usage is the same as applying Bayes' theorem, where the pre-test odds times the positive likelihood ratio equals the post-test odds (with the odds then transformed back to probabilities). To use it, draw, with a ruler, a vertical line from the desired pre-test probability; to find the corresponding post-test probability, draw a horizontal line from the intersection of the curve and the vertical line toward the vertical axis.
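
The odds calculation behind the nomogram can be sketched in a few lines of base R. The helper `post_test_prob` is a hypothetical name used only for illustration and is not part of the package; the counts are taken from the examples on this page.

```r
# Convert a pre-test probability to a post-test probability
# through odds and a likelihood ratio (Bayes' theorem in odds form)
post_test_prob <- function(pre_prob, lr) {
  pre_odds  <- pre_prob / (1 - pre_prob)
  post_odds <- pre_odds * lr
  post_odds / (1 + post_odds)
}

# Positive likelihood ratio from TP = 72, FN = 18, FP = 25, TN = 149
se  <- 72 / (72 + 18)
sp  <- 149 / (149 + 25)
plr <- se / (1 - sp)

# Post-test probability for a pre-test probability of 20%
post_test_prob(0.20, plr)

# A curve like the one in the nomogram can be traced by evaluating
# the same function over a grid of pre-test probabilities
pre  <- seq(0.01, 0.99, by = 0.01)
post <- post_test_prob(pre, plr)
```

With these counts the positive likelihood ratio is about 5.57, so a pre-test probability of 0.20 maps to a post-test probability of roughly 0.58.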

A 2 x 2 table from which the validity measures are calculated.

Sample size. The number of subjects analyzed.
Prevalence. The proportion classified as with the target condition by the reference standard.
Sensitivity. The probability of the test to correctly classify subjects with the target condition (TP/(TP+FN)).
Specificity. The probability of the test to correctly classify subjects without the target condition (TN/(TN+FP)).
Predictive values. The probabilities of having (positive predictive value, TP/(TP+FP)) or not having (negative predictive value, TN/(TN+FN)) the target condition given a test result.
Likelihood ratios. The probability of a test result in people with the target condition, divided by the probability of the same test result in people without the target condition (PLR = Se/(1-Sp); NLR = (1-Se)/Sp).
Diagnostic odds ratio. Represents the overall discrimination of a dichotomous test, and is equivalent to the ratio of PLR and NLR.
Error rate. Expresses how many errors we make when we diagnose patients with an abnormal test result as diseased, and those with a normal test result as non-diseased ((FP+FN)/sample size).
Youden J index. An overall accuracy measure, calculated as Se + Sp - 1. It ranges from -1 to 1; the closer to 1, the better the test.
Accuracy. Overall measure that expresses the capacity of the test to correctly classify subjects with and without the target condition ((TP+TN)/(sample size)).
Area under ROC curve. Overall measure of accuracy; here the method is the trapezoidal rule. It gives results identical to (Se+Sp)/2.
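
The point estimates listed above can be reproduced by hand, which is a useful check on the formulas. This is a minimal sketch in base R using the counts from the examples on this page, with the negative likelihood ratio taken as (1 - Se)/Sp. It does not compute confidence intervals and is not the package's own code; the names in `measures` are chosen here for illustration.

```r
TP <- 72; FN <- 18; FP <- 25; TN <- 149
n  <- TP + FN + FP + TN

se <- TP / (TP + FN)   # sensitivity
sp <- TN / (TN + FP)   # specificity

measures <- c(
  prevalence  = (TP + FN) / n,
  sensitivity = se,
  specificity = sp,
  ppv         = TP / (TP + FP),
  npv         = TN / (TN + FN),
  plr         = se / (1 - sp),
  nlr         = (1 - se) / sp,
  dor         = (TP * TN) / (FP * FN),
  error_rate  = (FP + FN) / n,
  youden      = se + sp - 1,
  accuracy    = (TP + TN) / n,
  auc         = (se + sp) / 2
)
round(measures, 3)

# Trapezoidal AUC for the single-point ROC curve through
# (0, 0), (1 - Sp, Se) and (1, 1); it equals (Se + Sp) / 2
x <- c(0, 1 - sp, 1)
y <- c(0, se, 1)
auc_trap <- sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
```

Two identities make quick sanity checks: the diagnostic odds ratio equals PLR/NLR, and accuracy plus error rate equals 1.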

Knottnerus JA. The Evidence Base of Clinical Diagnosis; BMJ Books, 2002.

Xiao-Hua Zhou, Nancy A Obuchowski, Donna McClish. Statistical Methods in Diagnostic Medicine; Wiley, 2002.

Simel D, Samsa G, Matchar D (1991). Likelihood ratios with confidence: Sample size estimation for diagnostic test studies. Journal of Clinical Epidemiology 44: 763-770.

LRgraph, binom.CI

# Simulating a dataset
mydata <- as.data.frame(rbind(
  cbind(rep(c("positive"),18),rep(c("negative"),18)),
  cbind(rep(c("positive"),72),rep(c("positive"),72)),
  cbind(rep(c("negative"),25),rep(c("positive"),25)),
  cbind(rep(c("negative"),149),rep(c("negative"),149))
))
colnames(mydata) <- c('culture','serology')

# Setting labels to the dataset
attr(mydata, "var.labels") <- c("Automatic culture","ELISA test")

# A little description of the data set to check if it is ok!
str(mydata)

# Running the diagnosis analysis
diagnosis(ref = mydata$culture, test = mydata$serology)

# Same thing passing the labels
diagnosis(ref = mydata$culture, test = mydata$serology,
          reference.name = attr(mydata, "var.labels")[1],
          index.name = attr(mydata, "var.labels")[2])

# Simulating a table
mytable <- matrix(c(149,18,25,72), nrow = 2, ncol = 2, byrow = TRUE,
                  dimnames = list(Serology = c('absent','present'),
                                  Citology = c('absent','present')))

# Running analysis from a 2 x 2 table
# The names of the table dimensions override the index.name and reference.name
diagnosis(tab = mytable)

# Inserting values as isolated numbers
diagnosis(TP = 72, FN = 18, FP = 25, TN = 149,
          index.name = "Serology", reference.name = "Citology")

#---------------------------------
# Export results to a spreadsheet:
#---------------------------------

# Assigning diagnosis to an object
mytest <- diagnosis(TP = 364, FN = 22, FP = 17, TN = 211,
                    index.name = "Gram", reference.name = "Culture")

# Export to a spreadsheet using csv format
# write.csv(mytest$results, 'MytestResults.csv', quote = FALSE, na = '')
# OR to a doc document with rtf library
# library(rtf)
# rtf1 <- RTF("MytestResults.doc")
# addParagraph(rtf1, "Table 1 - Diagnostic test accuracy.")
# addNewLine(rtf1)
# addTable(rtf1, mytest$results, col.justify = c("L","C","C","C"),
#          header.col.justify = c("L","C","C","C"), row.names = TRUE)
# done(rtf1)

# Draw a ROC plot
# WARNING: the axis labels must be set by hand.
plot(mytest, type = "roc", grid = TRUE, xlab = "1 - Specificity", ylab = "Sensitivity")

# Draw a nomogram from a test
plot(mytest, type = "nomogram", grid = TRUE, auto.shade = FALSE)
plot(mytest, type = "nomogram", grid = FALSE, auto.shade = TRUE)
plot(mytest, type = "nomogram", grid = TRUE, auto.shade = TRUE)

rm(mydata, mytable, mytest)