simdata: Simulate a dataset based on a HSROC model

Description

This function simulates a dataset based on the HSROC diagnostic meta-analysis model. It allows for the reference standard to be imperfect or perfect.
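As a rough illustration of what an imperfect reference standard means for a simulated 2x2 table, the sketch below computes the four cell probabilities for one study and draws one table. It assumes the index test and the reference test are conditionally independent given disease status, which is an assumption made here for illustration and not something this page states about simdata. The helper name cell_probs and all numeric values are hypothetical.

```r
# Hypothetical helper (not part of HSROC): cell probabilities of the
# index-test by reference-test 2x2 table for one study.
# ASSUMPTION: the two tests are conditionally independent given disease status.
# S2 = C2 = 1 corresponds to a perfect reference standard.
cell_probs <- function(prev, S1, C1, S2 = 1, C2 = 1) {
  c(pp = prev * S1 * S2             + (1 - prev) * (1 - C1) * (1 - C2),
    pn = prev * S1 * (1 - S2)       + (1 - prev) * (1 - C1) * C2,
    np = prev * (1 - S1) * S2       + (1 - prev) * C1 * (1 - C2),
    nn = prev * (1 - S1) * (1 - S2) + (1 - prev) * C1 * C2)
}

set.seed(7)
# Illustrative values only: prevalence 0.3, index test (0.8, 0.9),
# imperfect reference test (0.5, 0.85), 50 individuals.
p   <- cell_probs(prev = 0.3, S1 = 0.8, C1 = 0.9, S2 = 0.5, C2 = 0.85)
tab <- rmultinom(1, size = 50, prob = p)
```

Here pp counts individuals positive on both tests, pn those positive on the index test only, and so on.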

Usage

simdata(N, n, n.random = "FALSE", sub_rs = NULL, prev, se_ref = NULL,
   sp_ref = NULL, T, range.T = c(-Inf, Inf), L, range.L = c(-Inf, Inf),
   sd_t, sd_a, b, path = getwd())

Arguments

N

the number of studies to be included in the meta-analysis.

n

numerical vector, possibly a single value, specifying the number of individuals within each study. See details for further explanations.

n.random

if TRUE, the number of individuals within each study is drawn from n with replacement.

sub_rs

a list that specifies the reference standard used by each study. See details for further explanations.

prev

a vector of length N giving the prevalence in each study.

se_ref

a vector of length equal to the number of reference standards giving the sensitivity for each reference test.

sp_ref

a vector of length equal to the number of reference standards giving the specificity for each reference test.

T

single numeric value, the overall mean cut-off value to define a positive test.

range.T

a vector of length 2 specifying a range of values for the individual cut-off theta_i. See details for further explanations.

L

single numeric value, the overall difference in mean values (diagnostic accuracy) on the continuous index test result comparing the diseased group and the non-diseased group.

range.L

a vector of length 2 specifying a range of values for alpha_i, the individual difference in mean values (diagnostic accuracy) on the continuous index test result between the diseased group and the non-diseased group. See details for further explanations.

sd_t

single numeric value, the between-study standard deviation in the cut-off theta_i.

sd_a

single numeric value, the between-study standard deviation in alpha_i, the difference in mean values of the index test result between the diseased group and the non-diseased group.

b

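single numeric value, the logarithm of the ratio of the standard deviation of the continuous index test results in patients with the disease to that in patients without the disease (beta in the parametrization given in Details, so that a value of 0 corresponds to equal standard deviations).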

path

a character string pointing to the directory where the simulated data will be saved to.

Details

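The HSROC model uses the following parametrization: S_i = Phi(-(theta_i - alpha_i/2)/exp(beta/2)) and C_i = Phi((theta_i + alpha_i/2)/exp(-beta/2)), where S_i and C_i are the sensitivity and specificity of the test under evaluation in study i and Phi is the standard normal cumulative distribution function.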
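The parametrization above can be evaluated directly in R. The following is a minimal sketch, taking Phi to be the standard normal cumulative distribution function (pnorm); the helper name hsroc_se_sp is hypothetical and is not part of the HSROC package.

```r
# Hypothetical helper (not part of HSROC): study-level sensitivity (S) and
# specificity (C) of the index test under the parametrization above.
hsroc_se_sp <- function(theta, alpha, beta) {
  S <- pnorm(-(theta - alpha / 2) / exp(beta / 2))
  C <- pnorm((theta + alpha / 2) / exp(-beta / 2))
  data.frame(theta = theta, alpha = alpha, S = S, C = C)
}

# With beta = 0 both scale factors equal 1, so
# S = Phi(alpha/2 - theta) and C = Phi(alpha/2 + theta).
res <- hsroc_se_sp(theta = c(0, 0.5), alpha = c(2, 2), beta = 0)
print(res)
```

Raising theta_i (a stricter cut-off) lowers S_i and raises C_i, while raising alpha_i raises both.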

If n.random is FALSE, the number of elements in n must match the value of N, unless n is a single value. In the latter case, all studies are assumed to have the same number of individuals, namely n. If n.random is TRUE, the number of elements in n need not equal N.
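The rules for n can be sketched as a small helper. This is only an illustration of the behaviour described above, not the package's own code; the name resolve_n is hypothetical.

```r
# Hypothetical helper (not part of HSROC): resolve per-study sample sizes
# following the rules described for n and n.random.
resolve_n <- function(N, n, n.random = FALSE) {
  if (isTRUE(n.random)) {
    # Draw N sizes from n with replacement; indexing avoids the
    # sample(x) pitfall when n has length one.
    return(n[sample.int(length(n), N, replace = TRUE)])
  }
  if (length(n) == 1) return(rep(n, N))   # same size for every study
  if (length(n) != N) stop("length(n) must be 1 or N when n.random is FALSE")
  n
}

set.seed(1)
sizes_fixed  <- resolve_n(10, 50)                                # ten studies of 50
sizes_random <- resolve_n(15, seq(30, 120, 1), n.random = TRUE)  # 15 draws from 30..120
```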

The first element of the list-object sub_rs corresponds to the number of different reference standards. The default value is 1. The number of additional elements depends on the value of the first element: there must be as many additional elements in sub_rs as there are different reference standards. Assuming the studies are labelled 1, ..., N, each of these additional elements must be a vector (possibly of length one) whose values are the labels of the studies sharing the same reference standard. For example, if there are 2 reference tests, the first applied in studies 1-10 and the second in studies 11-15, then the sub_rs list-argument should be of length 3 with the following elements: 2, 1:10, 11:15, that is, list(2, 1:10, 11:15).
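To make the structure of sub_rs concrete, the sketch below expands it into one reference-standard index per study and uses that index to look up a per-study sensitivity. The helper ref_index and the sensitivity values are hypothetical and serve only as an illustration.

```r
# Hypothetical helper (not part of HSROC): map each study label 1..N to the
# index of its reference standard, given a sub_rs list as described above.
ref_index <- function(sub_rs, N) {
  n_ref  <- sub_rs[[1]]   # number of different reference standards
  groups <- sub_rs[-1]    # one vector of study labels per reference standard
  stopifnot(length(groups) == n_ref)
  idx <- integer(N)
  for (k in seq_len(n_ref)) idx[groups[[k]]] <- k
  # Every study must be assigned to exactly one reference standard.
  stopifnot(length(unlist(groups)) == N, all(idx > 0))
  idx
}

sub_rs <- list(2, 1:10, 11:15)
idx    <- ref_index(sub_rs, N = 15)

se_ref      <- c(0.40, 0.60)   # illustrative sensitivities, one per reference test
se_by_study <- se_ref[idx]     # studies 1-10 get 0.40, studies 11-15 get 0.60
```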

The range.T argument ensures that each individual theta_i is generated within the range provided. If no range is provided (the default), no restriction is placed on the possible values of theta_i. Likewise, the range.L argument ensures that each individual alpha_i is generated within the range provided; if no range is provided (the default), no restriction is placed on the possible values of alpha_i.
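One way to generate values restricted to a range is inverse-CDF sampling from a truncated normal distribution. The sketch below assumes that theta_i is normally distributed around T with standard deviation sd_t, and alpha_i around L with standard deviation sd_a, as in the hierarchical model of Rutter and Gatsonis. Whether simdata uses this exact sampling scheme is not stated here, and the helper name rtnorm_range is hypothetical.

```r
# Hypothetical helper (not part of HSROC): draw N values from a
# Normal(mean, sd^2) distribution restricted to [range[1], range[2]]
# by inverse-CDF sampling.
rtnorm_range <- function(N, mean, sd, range = c(-Inf, Inf)) {
  p_lo <- pnorm(range[1], mean, sd)   # probability mass below the lower bound
  p_hi <- pnorm(range[2], mean, sd)   # probability mass below the upper bound
  qnorm(runif(N, p_lo, p_hi), mean, sd)
}

set.seed(42)
# Cut-offs restricted to [-5, 5]
theta_i <- rtnorm_range(15, mean = 2.3, sd = 0.75, range = c(-5, 5))
# Default range: no restriction, equivalent to an ordinary normal draw
alpha_i <- rtnorm_range(15, mean = 3.6, sd = 1.15)
```

With the default range c(-Inf, Inf) the bounds map to probabilities 0 and 1, so the draw is unrestricted.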

For more help on this function, see the tutorial PDF file available at http://www.nandinidendukuri.com/filesonjoomlasite/HSROC_R_Tutorial.pdf

Value

A list of the 2x2 tables for each study, the between-study parameters, the within-study parameters and the reference standard.

Text files are created in the path directory. These files are:

“True_values.txt”, reports the within-study parameters alpha_i, theta_i, sensitivity of test under evaluation ( S1_i ), specificity of test under evaluation ( C1_i ) and prevalence (pi_i) used in the simulation.

“True_values2.txt”, reports the values of the between-study parameters LAMBDA, standard deviation of alpha_i ( sigma_alpha ), THETA, standard deviation of theta_i ( sigma_theta ) and beta used to simulate the data.

“True_REFSTD.txt”, reports the values of the reference standard used to simulate the data.

“True_values_index.txt”, reports the variable names of the 3 files described above.
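After a call to simdata, the presence of these files can be checked with base R. This sketch only tests whether the file names listed above exist in the chosen directory; it makes no assumption about their internal layout. The variable out_dir is a placeholder for whatever was passed as path.

```r
# Placeholder: the directory that was passed to simdata() as 'path'
out_dir <- getwd()

# File names as listed in the Value section
expected <- c("True_values.txt", "True_values2.txt",
              "True_REFSTD.txt", "True_values_index.txt")

# Named logical vector: TRUE for each file found in out_dir
found <- file.exists(file.path(out_dir, expected))
names(found) <- expected
print(found)
```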

References

Dendukuri, N., Schiller, I., Joseph, L., and Pai, M. (2012) Bayesian meta-analysis of the accuracy of a test for tuberculosis pleuritis in the absence of a gold-standard reference. Biometrics. doi:10.1111/j.1541-0420.2012.01773.x

Rutter, C. M., and Gatsonis, C. A. (2001) A hierarchical regression approach to meta-analysis of diagnostic accuracy evaluations. Statistics in Medicine, 20(19):2865-2884.

Examples

#EXAMPLE 1
#We want to simulate data for 10 studies based on an HSROC model assuming
#each study uses the same imperfect reference standard.

## Not run: 
N = 10
LAMBDA = 2
sd_alpha = 0.75
THETA = 1.5
sd_theta = 0.5
beta = 0
pi = runif(10,0,1)

REFSTD = list(1, 1:10)  #Only 1 reference standard ...
s2 = c(0.5)             #Sensitivity of the reference test
c2 = c(0.85)            #Specificity of the reference test


sim.data = simdata(N=N, n = c(50,50,60,60,70,70,80,80,90,90),
   sub_rs = REFSTD, prev=pi, se_ref=s2, sp_ref=c2, T=THETA,
   L=LAMBDA, sd_t=sd_theta, sd_a=sd_alpha, b=beta)

## End(Not run)

#EXAMPLE 2
#We want to simulate data for 15 studies based on an HSROC model such that
#the first 5 studies share a common reference standard and the remaining
#10 studies also share a common reference standard.

## Not run: 
N = 15
LAMBDA = 3.6
sd_alpha = 1.15
THETA = 2.3
sd_theta = 0.75
beta = 0.15
pi = runif(15,0.1,0.5)

REFSTD = list(2, 1:5, 6:15)  #Two different reference standards ...
s2 = c(0.40, 0.6)            #Sensitivity of the reference tests
c2 = c(0.75, 0.95)           #Specificity of the reference tests

#Thus, for the first 5 studies, s2 = 0.40 and c2 = 0.75 while for the last
#10 studies s2 = 0.6 and c2 = 0.95


sim.data = simdata(N=N, n=seq(30,120,1), n.random=TRUE, sub_rs = REFSTD,
   prev=pi,  se_ref=s2, sp_ref=c2, T=THETA, L=LAMBDA, sd_t=sd_theta,
   sd_a=sd_alpha, b=beta)

## End(Not run)

#EXAMPLE 3
#Assume the same context as in EXAMPLE 2 and suppose that each
#individual cut-off theta_i should lie within the interval [-5, 5]

## Not run: 
N = 15
LAMBDA = 3.6
sd_alpha = 1.15
THETA = 2.3
sd_theta = 0.75
beta = 0.15
pi = runif(15,0.1,0.5)

REFSTD = list(2, 1:5, 6:15)  #Two different reference standards ...
s2 = c(0.40, 0.6)            #Sensitivity of the reference tests
c2 = c(0.75, 0.95)           #Specificity of the reference tests

#Thus, for the first 5 studies, s2 = 0.40 and c2 = 0.75 while for the last
#10 studies s2 = 0.6 and c2 = 0.95


sim.data = simdata(N=N, n=seq(30,120,1), n.random=TRUE, sub_rs = REFSTD,
   prev=pi,  se_ref=s2, sp_ref=c2, T=THETA, range.T=c(-5,5),L=LAMBDA,
   sd_t=sd_theta,sd_a=sd_alpha, b=beta)

## End(Not run)

HSROC documentation built on Sept. 19, 2019, 9:05 a.m.
