codeVA: Running automated method on VA data

View source: R/VAmain.r

codeVA    R Documentation

Description

Runs an automated cause-of-death assignment method on verbal autopsy (VA) data.

Usage

codeVA(
  data,
  data.type = c("WHO2012", "WHO2016", "PHMRC", "customize")[2],
  data.train = NULL,
  causes.train = NULL,
  causes.table = NULL,
  model = c("InSilicoVA", "InterVA", "Tariff", "NBC")[1],
  Nchain = 1,
  Nsim = 10000,
  version = c("4.02", "4.03", "5")[2],
  HIV = "h",
  Malaria = "h",
  phmrc.type = c("adult", "child", "neonate")[1],
  convert.type = c("quantile", "fixed", "empirical")[1],
  ...
)
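Each multi-option default in the signature above is written as a character vector followed by an index, so the indexed element is the value used when the argument is omitted. A minimal base R check that does not call codeVA itself (the list name `defaults` exists only for this illustration):

```r
# Each default is a character vector indexed by position,
# so the bracketed index picks the value that is actually used.
defaults <- list(
  data.type    = c("WHO2012", "WHO2016", "PHMRC", "customize")[2],
  model        = c("InSilicoVA", "InterVA", "Tariff", "NBC")[1],
  version      = c("4.02", "4.03", "5")[2],
  phmrc.type   = c("adult", "child", "neonate")[1],
  convert.type = c("quantile", "fixed", "empirical")[1]
)

# Show the resolved defaults as a compact listing.
str(defaults)
```

Read this way, a call that omits these arguments uses WHO2016 input, the InSilicoVA model, and quantile conversion; version only matters when model is “InterVA”.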

Arguments

data

Input VA data, see data.type below for more information about the format.

data.type

The codeVA function currently supports four input data types:

  • WHO2012: InterVA-4 input format using WHO 2012 questionnaire. For example see data(RandomVA1). The first column should be death ID.

  • WHO2016: InterVA-5 input format using WHO 2016 questionnaire. For example see data(RandomVA5). The first column should be death ID.

  • PHMRC: PHMRC data format. The raw PHMRC long format data will be processed internally following the steps described in McCormick et al. (2016). For an example, see ConvertData.phmrc.

  • customize: Any dichotomized dataset in which “Y” denotes “presence”, “” denotes “absence”, and “.” denotes “missing”. The first column should be death ID.
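As a minimal sketch of the dichotomized layout described in the last item, the following base R code recodes logical symptom indicators into the “Y” / “” / “.” convention. The column names, the death IDs, and the helper `to_customize` are hypothetical and are not part of openVA.

```r
# Hypothetical raw indicators: TRUE = present, FALSE = absent, NA = missing.
raw <- data.frame(
  ID    = c("d1", "d2", "d3"),
  fever = c(TRUE, FALSE, NA),
  cough = c(FALSE, NA, TRUE),
  stringsAsFactors = FALSE
)

# Recode one logical vector into "Y" (presence), "" (absence), "." (missing).
to_customize <- function(x) {
  out <- ifelse(x, "Y", "")
  out[is.na(x)] <- "."
  out
}

# Keep the death ID as the first column and recode every symptom column.
custom <- raw
custom[-1] <- lapply(raw[-1], to_customize)
custom
```

A data frame shaped like `custom` is what the description above suggests passing as data together with data.type = "customize".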

data.train

Training data with the same columns as data, plus an additional column specifying the cause-of-death label. It is not used if data.type is a WHO type (“WHO2012” or “WHO2016”) and model is “InterVA” or “InSilicoVA”. For the WHO and “customize” types, the first column must also be death ID.

causes.train

The column name of the cause-of-death assignment label in the training data.

causes.table

List of causes to consider in the training data. Defaults to NULL, which uses all the causes present in the training data.
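To make the relationship between data.train, causes.train, and causes.table concrete, here is a small base R sketch. The training data, its column names, and the cause labels are invented for illustration, and the final check is only a suggested sanity test, not something codeVA is known to perform.

```r
# Hypothetical training data: death ID first, then symptoms, then the label column.
train <- data.frame(
  ID    = c("t1", "t2", "t3", "t4"),
  fever = c("Y", "", "Y", "."),
  cough = c("", "Y", ".", "Y"),
  cause = c("Malaria", "Stroke", "Malaria", "TB"),
  stringsAsFactors = FALSE
)

# Name of the label column, as it would be passed to causes.train.
causes.train <- "cause"

# All causes present in the training labels; per the description above,
# this is the set used when causes.table is left as NULL.
causes_present <- sort(unique(train[[causes.train]]))

# A restricted list, as it might be passed to causes.table.
causes.table <- c("Malaria", "Stroke")

# Suggested sanity check (an assumption, not documented package behaviour):
# every requested cause should appear in the training labels.
stopifnot(all(causes.table %in% causes_present))
```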

model

Currently supports four models: “InSilicoVA”, “InterVA”, “Tariff”, and “NBC”.

Nchain

Parameter specific to “InSilicoVA” model. Currently not used.

Nsim

Parameter specific to “InSilicoVA” model. Number of iterations to run the sampler.

version

Parameter specific to “InterVA” model. Currently supports “4.02”, “4.03”, and “5”. For InterVA-4, “4.03” is strongly recommended as it fixes several major bugs in version “4.02”, which is included only for backward compatibility. Version “5” implements the InterVA-5 model, which requires a different data input format.

HIV

Parameter specific to “InterVA” model. HIV prevalence level, can take values “h” (high), “l” (low), and “v” (very low).

Malaria

Parameter specific to “InterVA” model. Malaria prevalence level, can take values “h” (high), “l” (low), and “v” (very low).

phmrc.type

Which PHMRC data format is used. Currently supports only “adult” and “child”; “neonate” will be supported in the next release.

convert.type

Type of data conversion used when calculating the conditional probabilities (the probability of each symptom given each cause of death, P(S|C)) for the InterVA and InSilicoVA models. “quantile” and “fixed” usually give similar results in practice.

  • quantile: the rankings of P(S|C) are obtained by matching the same quantiles of the distribution in the default InterVA P(S|C) table.

  • fixed: P(S|C) are matched to the closest values in the default InterVA P(S|C) table.

  • empirical: no ranking is calculated; the empirical conditional probabilities are used directly, which forces updateCondProb to be FALSE for the InSilicoVA algorithm.
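The difference between the “fixed” and “quantile” options can be illustrated with a small base R sketch. This is one reading of the descriptions above and is not the package's implementation: the reference values in `ref` are made up and are not the default InterVA P(S|C) table, and the empirical probabilities in `emp` are invented.

```r
# Hypothetical reference P(S|C) values (NOT the real InterVA table).
ref <- c(0.001, 0.01, 0.01, 0.05, 0.1, 0.2, 0.5, 0.8)

# Hypothetical empirical P(S|C) estimated from training data.
emp <- c(0.02, 0.45, 0.12, 0.85)

# "fixed"-style matching: snap each empirical value to the closest reference level.
levels_ref <- unique(ref)
fixed <- sapply(emp, function(p) levels_ref[which.min(abs(levels_ref - p))])

# "quantile"-style matching: give each empirical value the reference value
# that sits at the same quantile of the reference distribution.
q <- rank(emp) / length(emp)
quant <- quantile(ref, probs = q, type = 1, names = FALSE)

# Compare the three versions side by side.
data.frame(empirical = emp, fixed = fixed, quantile = quant)
```

In this sketch both conversions return only values found in the reference set and preserve the ordering of the empirical probabilities; the “empirical” option would correspond to using `emp` unchanged.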

...

Other arguments passed to insilico, InterVA, interVA_train, tariff, and the nbc function in the nbc4va package. See the respective package documentation for details.

Value

a fitted object

References

Tyler H. McCormick, Zehang R. Li, Clara Calvert, Amelia C. Crampin, Kathleen Kahn and Samuel J. Clark (2016) Probabilistic cause-of-death assignment using verbal autopsies. Journal of the American Statistical Association. https://arxiv.org/abs/1411.3042

James, S. L., Flaxman, A. D., Murray, C. J., & Population Health Metrics Research Consortium. (2011). Performance of the Tariff Method: validation of a simple additive algorithm for analysis of verbal autopsies. Population Health Metrics, 9(1), 1-16.

Zehang R. Li, Tyler H. McCormick, Samuel J. Clark (2014) InterVA4: An R package to analyze verbal autopsy data. Center for Statistics and the Social Sciences Working Paper, No.146

http://www.interva.net/

Miasnikof P, Giannakeas V, Gomes M, Aleksandrowicz L, Shestopaloff AY, Alam D, Tollman S, Samarikhalaj, Jha P. Naive Bayes classifiers for verbal autopsies: comparison to physician-based classification for 21,000 child and adult deaths. BMC Medicine. 2015;13:286.

See Also

insilico in package InSilicoVA, InterVA in package InterVA4, InterVA5 in package InterVA5, interVA_train, tariff in package Tariff, and nbc function in package nbc4va.

Examples


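library(openVA)

# Split the example data into a test set and a training set
data(RandomVA3)
test <- RandomVA3[1:200, ]
train <- RandomVA3[201:400, ]

# InSilicoVA trained on the labelled data, with a shortened run of 1000 iterations
fit1 <- codeVA(data = test, data.type = "customize", model = "InSilicoVA",
               data.train = train, causes.train = "cause",
               Nsim = 1000, auto.length = FALSE)

# InterVA version 4.02 with high HIV and low malaria prevalence
fit2 <- codeVA(data = test, data.type = "customize", model = "InterVA",
               data.train = train, causes.train = "cause", write = FALSE,
               version = "4.02", HIV = "h", Malaria = "l")

# Tariff; nboot.sig is passed through to the tariff function
fit3 <- codeVA(data = test, data.type = "customize", model = "Tariff",
               data.train = train, causes.train = "cause",
               nboot.sig = 100)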




openVA documentation built on May 29, 2024, 6:04 a.m.