simulate_nplcm: Simulate data from nested partially-latent class model...


View source: R/simulate-nplcm.R

Description

Simulate data from the nested partially-latent class model (npLCM) family.

Usage

simulate_nplcm(set_parameter)

Arguments

set_parameter

True model parameters in an npLCM specification. A list comprising the following elements:

  • cause_list a vector of disease class names among cases (because a cause can involve multiple pathogens, its length can exceed the total number of unique pathogens)

  • etiology a vector of proportions that sum to one

  • pathogen_BrS a vector of pathogen names measured in bronze-standard data. The current function simulates only one slice, defined by a specimen-test-pathogen combination.

  • pathogen_SS a vector of pathogen names measured in silver-standard data.

  • meas_nm a list of specimen-test names, e.g., list(MBS = c("NPPCR"), MSS = "BCX") for a nasopharyngeal (NP) specimen tested by polymerase chain reaction (PCR) and blood (B) tested by culture (Cx)

  • Lambda subclass weights ν_1, ν_2, …, ν_K among controls; a vector of K probabilities that sum to 1.

  • Eta a matrix of dimension length(cause_list) by K; each row holds the subclass weights η_1, η_2, …, η_K for one disease class and must sum to one. In Wu et al. 2016, the subclass weights are the same across disease classes (that is, across rows). When simulating data, however, one can specify rows with distinct probabilities; whether these parameters can be recovered is a separate matter (it is possible when some cases' true disease classes are randomly observed).

  • PsiBS/PsiSS false positive rates Ψ for bronze-standard data and for silver-standard data. Dimension is J by K. PsiSS is expected to be a zero vector (silver-standard measures are assumed to have perfect specificity).

  • ThetaBS/ThetaSS true positive rates Θ for bronze-standard data and for silver-standard data. Dimension is J by K (can contain NA if the total number of pathogens exceeds the number of pathogens measured in SS).

  • Nu the number of controls

  • Nd the number of cases
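The constraints listed above (proportions that sum to one, matching dimensions) can be checked before simulating. The following is a minimal sketch in base R; check_set_parameter is a hypothetical helper written for illustration and is not part of baker, and it covers only a subset of the elements.

```r
# Hypothetical helper (an assumption for illustration; not a baker function).
# Returns a character vector of problems; an empty vector means none were found.
check_set_parameter <- function(p, tol = 1e-8) {
  n_cause <- length(p$cause_list)
  K       <- length(p$Lambda)
  sums_to_one <- function(x) abs(sum(x) - 1) < tol
  problems <- character(0)

  if (length(p$etiology) != n_cause) {
    problems <- c(problems, "etiology and cause_list differ in length")
  }
  if (!sums_to_one(p$etiology)) {
    problems <- c(problems, "etiology does not sum to one")
  }
  if (!sums_to_one(p$Lambda)) {
    problems <- c(problems, "Lambda does not sum to one")
  }
  if (!is.matrix(p$Eta) || nrow(p$Eta) != n_cause || ncol(p$Eta) != K) {
    problems <- c(problems, "Eta is not length(cause_list) by K")
  } else if (any(abs(rowSums(p$Eta) - 1) >= tol)) {
    problems <- c(problems, "a row of Eta does not sum to one")
  }
  if (p$Nu <= 0 || p$Nd <= 0) {
    problems <- c(problems, "Nu and Nd must be positive")
  }
  problems
}

# Toy specification (made-up values) with two causes and K = 2 subclasses.
toy <- list(
  cause_list = c("A", "B"),
  etiology   = c(0.6, 0.4),
  Lambda     = c(0.7, 0.3),
  Eta        = rbind(c(0.7, 0.3), c(0.5, 0.5)),
  Nu = 100, Nd = 100
)
check_set_parameter(toy)   # character(0) when every check passes
```

Collecting all problems in one vector, rather than stopping at the first, makes it easier to repair a long parameter list in a single pass.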

Details

The function allows different subclass mixing weights for cases (Eta) and controls (Lambda). Eta is of dimension J times K, where J here is the number of disease classes, length(cause_list). NB: document the elements in set_parameter. The current function is also written so that more measurement components can be added easily.
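To make the role of the two sets of mixing weights concrete, here is a rough base-R sketch of how bronze-standard measurements could be generated with separate case and control subclass weights. It is not baker's implementation: simulate_brs_sketch and all parameter values are assumptions for illustration, and each cause is taken to be a single pathogen that is also measured.

```r
# Illustrative parameters (made-up values; J pathogens, K subclasses).
par_sketch <- list(
  etiology = c(0.5, 0.3, 0.2),                                 # case cause proportions
  Lambda   = c(0.7, 0.3),                                      # control subclass weights
  Eta      = matrix(c(0.2, 0.8), nrow = 3, ncol = 2, byrow = TRUE),    # case weights
  PsiBS    = matrix(c(0.10, 0.30), nrow = 3, ncol = 2, byrow = TRUE),  # FPR, J x K
  ThetaBS  = matrix(c(0.90, 0.70), nrow = 3, ncol = 2, byrow = TRUE)   # TPR, J x K
)

# Hypothetical function: simulate n subjects' binary measurements.
simulate_brs_sketch <- function(n, is_case, par = par_sketch) {
  J <- nrow(par$PsiBS)
  K <- length(par$Lambda)
  # Cases get a latent cause drawn from etiology; controls have none.
  cause <- if (is_case) {
    sample(J, n, replace = TRUE, prob = par$etiology)
  } else {
    rep(NA_integer_, n)
  }
  # Subclass indicator: row of Eta for cases, Lambda for controls.
  subclass <- vapply(seq_len(n), function(i) {
    w <- if (is_case) par$Eta[cause[i], ] else par$Lambda
    sample(K, 1, prob = w)
  }, integer(1))
  M <- matrix(0L, nrow = n, ncol = J)
  for (i in seq_len(n)) {
    p <- par$PsiBS[, subclass[i]]             # false positive rates by default
    if (is_case) {
      # The causative pathogen is detected at its true positive rate.
      p[cause[i]] <- par$ThetaBS[cause[i], subclass[i]]
    }
    M[i, ] <- rbinom(J, size = 1, prob = p)
  }
  list(cause = cause, subclass = subclass, M = M)
}

set.seed(1)
ctrl <- simulate_brs_sketch(500, is_case = FALSE)
case <- simulate_brs_sketch(500, is_case = TRUE)
colMeans(ctrl$M)   # observed positive rates among controls
colMeans(case$M)   # observed positive rates among cases
```

Setting every row of Eta equal to Lambda recovers the situation in which cases and controls share the same subclass weights.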

Value

A list of measurements and true latent statuses.

See Also

simulate_latent simulates the discrete latent status; given that status, simulate_brs simulates bronze-standard data.

Other simulation functions: simulate_brs, simulate_latent, simulate_ss

Examples

library(baker) # provides simulate_nplcm() and plot_logORmat()

K.true  <- 2   # no. of latent subclasses in actual simulation. 
               # If eta = c(1,0), effectively, it is K.true=1.
J       <- 21   # no. of pathogens.
N       <- 600 # no. of cases/controls.

eta <- c(1,0) 
# if it is c(1,0),then it is conditional independence model, and
# only the first column of parameters in PsiBS, ThetaBS matter!

seed_start <- 20150202
print(eta)

# set fixed simulation sequence:
set.seed(seed_start)

ThetaBS_withNA <- c(.75,rep(c(.75,.75,.75,NA),5))
PsiBS_withNA <- c(.15,rep(c(.05,.05,.05,NA),5))

ThetaSS_withNA <- c(NA,rep(c(0.15,NA,0.15,0.15),5))
PsiSS_withNA <- c(NA,rep(c(0,NA,0,0),5))

# the following parameter names are set using names in the 'baker' package:
set_parameter <- list(
  cause_list      = c(LETTERS[1:J]),
  etiology        = c(c(0.36,0.1,0.1,0.1,0.1,0.05,0.05,0.05,
                 0.05,0.01,0.01,0.01,0.01),rep(0.00,8)), 
                 #same length as cause_list.
  pathogen_BrS    = LETTERS[1:J][!is.na(ThetaBS_withNA)],
  pathogen_SS     = LETTERS[1:J][!is.na(ThetaSS_withNA)],
  meas_nm         = list(MBS = c("MBS1"),MSS="MSS1"),
  Lambda          = eta, #ctrl mix
  Eta             = t(replicate(J,eta)), #case mix, row number equal to Jcause.
  PsiBS           = cbind(PsiBS_withNA[!is.na(PsiBS_withNA)],
                          rep(0,sum(!is.na(PsiBS_withNA)))),
  ThetaBS         = cbind(ThetaBS_withNA[!is.na(ThetaBS_withNA)],
                          rep(0,sum(!is.na(ThetaBS_withNA)))),
  PsiSS           = PsiSS_withNA[!is.na(PsiSS_withNA)],
  ThetaSS         = ThetaSS_withNA[!is.na(ThetaSS_withNA)],
  Nu      =     N, # control size.
  Nd      =     N  # case size.
)
simu_out <- simulate_nplcm(set_parameter)
data_nplcm <- simu_out$data_nplcm

pathogen_display <- rev(set_parameter$pathogen_BrS)
plot_logORmat(data_nplcm,pathogen_display)
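As a rough check on simulated output, the marginal bronze-standard positive rates implied by the parameters in the example above can be worked out by hand. The sketch below assumes the conditional independence case (eta = c(1,0), so only the first subclass matters) and single-pathogen causes; the names theta, psi, pi_all, rate_ctrl and rate_case are introduced here for illustration. These are theoretical rates derived from the stated parameters, not output of simulate_nplcm.

```r
# Parameters copied from the example (first subclass only).
theta  <- c(.75, rep(c(.75, .75, .75, NA), 5))   # true positive rates
psi    <- c(.15, rep(c(.05, .05, .05, NA), 5))   # false positive rates
pi_all <- c(0.36, 0.1, 0.1, 0.1, 0.1, 0.05, 0.05, 0.05,
            0.05, 0.01, 0.01, 0.01, 0.01, rep(0, 8))  # etiology

measured <- !is.na(theta)                 # pathogens with bronze-standard data
nm       <- LETTERS[1:21][measured]

# Controls: a positive result is always a false positive.
rate_ctrl <- setNames(psi[measured], nm)

# Cases: pathogen j is a true positive with probability pi_j,
# and a false positive otherwise.
rate_case <- setNames(
  pi_all[measured] * theta[measured] + (1 - pi_all[measured]) * psi[measured],
  nm
)

round(rate_case[c("A", "B")], 3)   # A: 0.366, B: 0.12
round(rate_ctrl[c("A", "B")], 3)   # A: 0.15,  B: 0.05
```

With 600 cases and 600 controls, the observed column means of the simulated measurements should be close to these values, up to sampling error.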

oslerinhealth-releases/baker documentation built on Nov. 4, 2019, 11:11 p.m.