nbc | R Documentation |
Performs supervised Naive Bayes Classification on verbal autopsy data.
nbc(train, test, known = TRUE)
train |
Dataframe of verbal autopsy train data (See Data documentation).
test |
Dataframe of verbal autopsy test data in the same format as train, except that the causes in the 2nd column may be unavailable when they are not known (see known).
known |
TRUE to indicate that the test causes are available in the 2nd column, FALSE to indicate that they are not known.
out |
The resulting nbc list object, containing:
$prob.causes (vectorof double): the probabilities for each test case prediction by case id
$pred.causes (vectorof char): the predictions for each test case by case id
Additional values:
* indicates that the value is only available if test causes are known
$train (dataframe): the input train data
$train.ids (vectorof char): the ids of the train data
$train.causes (vectorof char): the causes of the train data by case id
$train.samples (double): the number of input train samples
$test (dataframe): the input test data
$test.ids (vectorof char): the ids of the test data
$test.causes* (vectorof char): the causes of the test data by case id
$test.samples (double): the number of input test samples
$test.known (logical): whether the test causes are known
$symptoms (vectorof char): all unique symptoms in order
$causes (vectorof char): all possible unique causes of death
$causes.train (vectorof char): all unique causes of death in the train data
$causes.test* (vectorof char): all unique causes of death in the test data
$causes.pred (vectorof char): all unique causes of death in the predicted cases
$causes.obs* (vectorof char): all unique causes of death in the observed cases
$pred (dataframe): a table of predictions for each test case, sorted by probability
Columns (in order): CaseID, TrueCause, Prediction-1 to Prediction-n..
CaseID (vectorof char): case identifiers
TrueCause* (vectorof char): the observed causes of death
Prediction-n.. (vectorsof char): the predicted causes of death, where Prediction1 is the most probable cause, and Prediction-n is the least probable cause
Example:
CaseID | Prediction1 | Prediction2 |
"a1" | "HIV" | "Stroke" |
"b2" | "Stroke" | "HIV" |
"c3" | "HIV" | "Stroke" |
$obs* (dataframe): a table of observed causes matching $pred for each test case
Columns (in order): CaseID, TrueCause
CaseID (vectorof char): case identifiers
TrueCause (vectorof char): the actual cause of death if applicable
Example:
CaseID | TrueCause |
"a1" | "HIV" |
"b2" | "Stroke" |
"c3" | "HIV" |
$obs.causes* (vectorof char): all observed causes of death by case id
$prob (dataframe): a table of probabilities of each cause for each test case
Columns (in order): CaseID, Cause-1 to Cause-n..
CaseID (vectorof char): case identifiers
Cause-n.. (vectorsof double): probabilities for each cause of death
Example:
CaseID | HIV | Stroke |
"a1" | 0.5 | 0.5 |
"b2" | 0.3 | 0.7 |
"c3" | 0.9 | 0.1 |
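The relationship between the $prob, $pred and $obs tables can be sketched in base R. This is only an illustration under stated assumptions: the data frames below are hand-built stand-ins copied from the example tables above, not real nbc() output, and the tie-breaking rule (keep the original column order) is an assumption rather than the package's documented behaviour.

```r
# Hypothetical stand-ins for results$prob and results$obs, copied from the
# example tables above (not real nbc() output).
prob <- data.frame(
  CaseID = c("a1", "b2", "c3"),
  HIV = c(0.5, 0.3, 0.9),
  Stroke = c(0.5, 0.7, 0.1),
  stringsAsFactors = FALSE
)
obs <- data.frame(
  CaseID = c("a1", "b2", "c3"),
  TrueCause = c("HIV", "Stroke", "HIV"),
  stringsAsFactors = FALSE
)

# Cause columns are everything except CaseID.
cause_names <- setdiff(names(prob), "CaseID")
prob_matrix <- as.matrix(prob[, cause_names])

# Rank causes per case from most to least probable.
# order(-p) leaves ties in their original column order (assumed tie rule).
ranked <- t(apply(prob_matrix, 1, function(p) cause_names[order(-p)]))

# Assemble a table shaped like $pred: CaseID, Prediction1, Prediction2, ...
pred <- data.frame(CaseID = prob$CaseID, ranked, stringsAsFactors = FALSE)
names(pred)[-1] <- paste0("Prediction", seq_along(cause_names))

# Compare the most probable prediction with the observed cause per case.
merged <- merge(pred, obs, by = "CaseID")
top1_accuracy <- mean(merged$Prediction1 == merged$TrueCause)

print(pred)
print(top1_accuracy)
```

With a real result object, the same comparison would start from results$pred and results$obs, which are only both available when known = TRUE.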
Miasnikof P, Giannakeas V, Gomes M, Aleksandrowicz L, Shestopaloff AY, Alam D, Tollman S, Samarikhalaj, Jha P. Naive Bayes classifiers for verbal autopsies: comparison to physician-based classification for 21,000 child and adult deaths. BMC Medicine. 2015;13:286. doi:10.1186/s12916-015-0521-2.
Other main functions: plot.nbc(), print.nbc_summary(), summary.nbc()
library(nbc4va)
data(nbc4vaData)

# Run naive bayes classifier on random train and test data
# Set "known" to indicate whether or not "test" causes are known
train <- nbc4vaData[1:50, ]
test <- nbc4vaData[51:100, ]
results <- nbc(train, test, known=TRUE)

# Obtain the probabilities and predictions
prob <- results$prob.causes
pred <- results$pred.causes