calibratedva: Performs Gibbs sampling for calibration

View source: R/calibration.R

calibratedva    R Documentation

Performs Gibbs sampling for calibration

Description

Takes estimated causes (or cause probabilities) for a representative set of deaths without labels and an unrepresentative set of deaths with labels, and estimates the calibrated cause-specific mortality fractions (CSMF).

Usage

calibratedva(
  va_unlabeled,
  va_labeled = NULL,
  gold_standard = NULL,
  causes,
  method = c("mshrink", "pshrink"),
  nchains = 3,
  ndraws = 10000,
  burnin = 1000,
  thin = 1,
  pseudo_samplesize = 100,
  alpha = 5,
  beta = 0.5,
  lambda = 1,
  delta = 1,
  epsilon = 0.001,
  tau = 0.5,
  which.multimodal = "all",
  which.rhat = "all",
  print.chains = FALSE,
  init.seed = 123
)

Arguments

va_unlabeled

When using cause of death predictions from a single algorithm, this is a matrix in which each row gives the predicted cause of death probabilities for an individual death without a cause of death label. Each column represents a cause. If using the top cause, one entry in each row should be 1 and the rest should be 0. When using predictions from multiple algorithms for the ensemble approach, this should be a list of matrices, one per algorithm, each holding that algorithm's predictions for the same individuals. See examples for more information.
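The three input shapes described above can be sketched in base R. The cause names, probabilities, and list element names below are made up for illustration; only the shapes (rows are deaths, columns are causes, a list for the ensemble case) come from the description.

```r
# Hypothetical cause names and a toy probability matrix (3 deaths, 3 causes).
causes <- c("cause_a", "cause_b", "cause_c")
probs <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.6, 0.3,
    0.2, 0.3, 0.5),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, causes)
)

# Top-cause format: 1 in the column of the most probable cause, 0 elsewhere.
top_cause <- matrix(0, nrow = nrow(probs), ncol = ncol(probs),
                    dimnames = dimnames(probs))
top_idx <- max.col(probs, ties.method = "first")
top_cause[cbind(seq_len(nrow(probs)), top_idx)] <- 1

# Ensemble format: one matrix per algorithm, rows referring to the same deaths.
# (The element names are illustrative, not required by the description.)
va_unlabeled_ensemble <- list(algorithm_1 = probs, algorithm_2 = top_cause)
```

Either `probs` or `top_cause` would play the role of va_unlabeled for a single algorithm, and the list would be used for the ensemble approach.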

va_labeled

A matrix or list in the same format as va_unlabeled, but for individuals with labeled causes of death. If there are no individuals with labeled causes, leave as NULL

gold_standard

A matrix where each row represents either the true cause for an individual with a labeled cause of death (i.e. if the label for individual i is cause j, then gold_standard[i,j] will be 1, and the other entries of that row will be 0), or the probabilities that the individual died of each cause. The rows of gold_standard should correspond to the rows of va_labeled (or the rows of each element of va_labeled if it is a list).
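As a small sketch of the 0/1 encoding described above, a vector of cause labels can be turned into a gold_standard matrix with base R. The labels and cause names are hypothetical.

```r
causes <- c("cause_a", "cause_b", "cause_c")
# Hypothetical labels for four deaths with known causes.
labels <- c("cause_b", "cause_a", "cause_c", "cause_b")

# Start from all zeros, then set gold_standard[i, j] to 1 where
# the label of death i is cause j.
gold_standard <- matrix(0, nrow = length(labels), ncol = length(causes),
                        dimnames = list(NULL, causes))
gold_standard[cbind(seq_along(labels), match(labels, causes))] <- 1
```

Row i of this matrix must describe the same death as row i of va_labeled, so both should be built from data in the same order.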

causes

A character vector with the names of the causes. These should correspond to the columns of va_unlabeled, va_labeled, and gold_standard.

method

Either "mshrink" (default) for M-shrinkage or "pshrink" for p-shrinkage.

nchains

The number of chains. Default is 3

ndraws

Number of draws in each chain. Default is 10,000

burnin

Number of burnin samples. Default is 1,000

thin

Thinning parameter. Default is 1 (no thinning).

pseudo_samplesize

The number of pseudo samples (T) used for the Gibbs Sampler using rounding and coarsening. Default is 100.

alpha

A numeric value for the alpha in the prior of gamma when using M-shrinkage. Higher values (relative to beta) lead to more shrinkage. Default is 5. If using the ensemble model, a vector of length K can be used (where K is the number of algorithms).

beta

A numeric value for the beta in the prior of gamma when using M-shrinkage. Default is 0.5.

lambda

A numeric value for the lambda in the prior of p for p-shrinkage. Higher values lead to more shrinkage. Default is 1.

delta

A numeric value for the delta in the prior of p. Only used for M-shrinkage sampling. Default is 1.

epsilon

A numeric value for the epsilon in the prior of M. Default is 0.001.

which.multimodal

A character string specifying whether both p and M (which.multimodal = "all", the default) should be evaluated for multimodality, or just p (which.multimodal = "p").

which.rhat

A character string specifying whether both p and M (which.rhat = "all", the default) should be evaluated for convergence, or just p (which.rhat = "p").

print.chains

A logical scalar indicating whether the progress of the sampling should be printed to the screen. Default is FALSE.

init.seed

The initial seed for sampling. Default is 123.

tau

A numeric vector for the log standard deviation of the sampling distributions of the gammas. Only used for M-shrinkage sampling. Default is 0.5.

Value

A list with the following components.

samples

An mcmc.list object containing the posterior samples for p, M, and gamma (if using M-shrinkage)

A_U

The value of va_unlabeled used for the posterior samples

A_L

The value of va_labeled used for the posterior samples

G_L

The value of gold_standard used for the posterior samples

method

The method used for shrinkage (either mshrink or pshrink)

waic

The estimated WAIC for the calibrated posterior

waic_uncalib

The estimated WAIC for the uncalibrated posterior

multimodal

Either TRUE or FALSE, indicating whether or not the posterior samples for p (and potentially M) are multimodal

rhat_max

The maximum rhat for p (and potentially M), which can be used for evaluating convergence
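To illustrate what an rhat value measures, here is a minimal base R sketch of the basic (non-split) Gelman-Rubin statistic for a single parameter. The function `basic_rhat` and the simulated chains are hypothetical, and this is not necessarily how the package computes rhat_max. It only shows why values near 1 suggest the chains agree.

```r
# Basic Gelman-Rubin potential scale reduction factor for one parameter.
# `draws` is a matrix with one column per chain and one row per retained draw.
basic_rhat <- function(draws) {
  n <- nrow(draws)
  w <- mean(apply(draws, 2, var))       # mean within-chain variance
  b <- n * var(colMeans(draws))         # between-chain variance
  var_hat <- (n - 1) / n * w + b / n    # pooled variance estimate
  sqrt(var_hat / w)
}

set.seed(1)
mixed <- matrix(rnorm(3000), ncol = 3)          # three chains, same target
stuck <- mixed + rep(c(0, 0, 5), each = 1000)   # third chain shifted away

basic_rhat(mixed)  # close to 1: chains agree
basic_rhat(stuck)  # well above 1: chains disagree
```

A commonly used rule of thumb treats values much above 1 (for example above 1.1) as a sign that more draws or a longer burn-in may be needed.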

alpha

The value(s) of alpha used (if method = "mshrink")

beta

The value of beta used (if method = "mshrink")

lambda

The value of lambda used (if method = "pshrink")
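The arguments and returned components above can be tied together in a short sketch. It assumes the package is installed under the name CalibratedVA and exports calibratedva. The data are simulated toy values, the helper `one_hot` is hypothetical, and the call is skipped if the package is not available.

```r
set.seed(123)
causes <- c("cause_a", "cause_b", "cause_c")

# Hypothetical helper: 0/1 matrix from a vector of cause names.
one_hot <- function(x, causes) {
  m <- matrix(0, nrow = length(x), ncol = length(causes),
              dimnames = list(NULL, causes))
  m[cbind(seq_along(x), match(x, causes))] <- 1
  m
}

# Simulated top-cause predictions (toy data, not real VA output).
va_unlabeled  <- one_hot(sample(causes, 200, replace = TRUE), causes)
truth_labeled <- sample(causes, 50, replace = TRUE)
gold_standard <- one_hot(truth_labeled, causes)

# Labeled predictions: copy the truth, then overwrite 10 rows at random.
pred_labeled <- truth_labeled
flip <- sample(50, 10)
pred_labeled[flip] <- sample(causes, 10, replace = TRUE)
va_labeled <- one_hot(pred_labeled, causes)

# Assumes the package namespace is "CalibratedVA"; skipped if not installed.
if (requireNamespace("CalibratedVA", quietly = TRUE)) {
  fit <- CalibratedVA::calibratedva(
    va_unlabeled  = va_unlabeled,
    va_labeled    = va_labeled,
    gold_standard = gold_standard,
    causes        = causes,
    method        = "mshrink",
    nchains       = 3,
    ndraws        = 2000,
    burnin        = 500
  )
  print(fit$rhat_max)    # convergence check
  print(fit$multimodal)  # TRUE/FALSE multimodality flag
  print(fit$waic)        # compare with fit$waic_uncalib
}
```

The shorter ndraws and burnin values are chosen only to keep the sketch quick. The defaults listed in the usage are larger.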


jfiksel/CalibratedVA documentation built on Nov. 14, 2022, 2:59 p.m.