simulate.data2: Simulate data for the cross-validated risk scores design with...

View source: R/simulateData.R

simulate.data2    R Documentation

Simulate data for the cross-validated risk scores design with 2 outcomes (CVRS2).

Description

The data are simulated assuming that there are two outcomes (response and response2). Response and response2 are influenced by subsets of K and T unknown covariates respectively (the sensitive covariates) through the following model:

logit(p.response_i) = mu + lambda*t_i + gamma_1*t_i*x_i1 + ... + gamma_K*t_i*x_iK,

logit(p.response2_i) = mu + lambda*t_i + gamma_1*t_i*x_i1 + ... + gamma_T*t_i*x_iT,

where p.response_i is the probability of response to treatment for the i-th patient (and p.response2_i is the corresponding probability of response2); mu is the intercept; lambda is the treatment main effect that all patients experience regardless of the values of the covariates; t_i is the treatment that the i-th patient receives (t_i = 0 for the control arm and t_i = 1 for the treatment arm); x_i1,...,x_iK/x_iT are the values of the K/T unknown sensitive covariates; gamma_1,...,gamma_K/gamma_T are the treatment-covariate interaction effects for the K/T covariates.

The model assumes that there is a subset of patients (the resp.sensitive group) with a higher probability of response when treated with the new treatment, compared with the control treatment, and a subset of patients (the resp2.sensitive group) with a higher probability of response2 when treated with the new treatment.

The model assumes that there are 4 clusters of patients: cluster1 (high probability of both response and response2), cluster2 (low probability of response and high probability of response2), cluster3 (high probability of response and low probability of response2) and cluster4 (low probability of both response and response2). Cluster1 is considered the sensitive group (high probability of both response and response2 when treated with the new treatment, compared with the control treatment).
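The model for a single outcome can be sketched in base R as follows. This is an illustration only, not the package's internal code: the values of mu, lambda and gamma, the helper name p_response and the covariate vectors are all hypothetical, chosen so that the control-arm probability is 0.25.

```r
# Hypothetical parameter values, for illustration only.
mu     <- qlogis(0.25)      # intercept: gives a control-arm probability of 0.25
lambda <- 0                 # treatment main effect
gamma  <- c(0.8, 0.8, 0.8)  # treatment-covariate interaction effects (K = 3)

# Response probability for one patient under the logistic model above.
# t is the treatment indicator (0 = control, 1 = treatment);
# x holds the patient's K sensitive covariate values.
p_response <- function(t, x) {
  plogis(mu + lambda * t + sum(gamma * t * x))
}

x_sensitive     <- c(1, 1, 1)  # hypothetical covariates of a sensitive patient
x_non_sensitive <- c(0, 0, 0)  # hypothetical covariates of a non-sensitive patient

p_control  <- p_response(t = 0, x = x_sensitive)      # covariates have no effect on control
p_treat_s  <- p_response(t = 1, x = x_sensitive)      # raised by the interaction terms
p_treat_ns <- p_response(t = 1, x = x_non_sensitive)  # only mu + lambda contribute
```

Because every covariate enters the model only through its product with t_i, the covariates have no effect on the control arm, and a treated patient whose sensitive covariates are all zero responds with probability plogis(mu + lambda).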

Usage

## S3 method for class 'data2'
simulate(
  N = 1000,
  L = 100,
  K1 = 10,
  K2 = 10,
  Both = 3,
  rho1 = 0,
  rho2 = 0,
  rho0 = 0,
  mu1 = 1,
  mu2 = 0,
  mu0 = 0,
  sigma1 = 0.5,
  sigma2 = 0.1,
  sigma0 = 0.5,
  rho1.resp2 = 0,
  rho2.resp2 = 0,
  mu1.resp2 = 1,
  mu2.resp2 = 0,
  sigma1.resp2 = 0.5,
  sigma2.resp2 = 0.1,
  mu1.both = 1,
  mu2.both = 0,
  sigma1.both = 0.5,
  sigma2.both = 0.1,
  rho1.both = 0,
  rho2.both = 0,
  perc.sp = 0.1,
  perc.sp2 = 0.1,
  perc.sp.both = 0.1,
  rr.nsp.treat = 0.25,
  rr.con = 0.25,
  rr.sp.treat = 0.8,
  rr2.nsp.treat = 0.25,
  rr2.con = 0.25,
  rr2.sp.treat = 0.8,
  runs = 1,
  seed = 123
)

Arguments

N

Number of patients.

L

Overall number of covariates.

K2

Number of sensitive covariates that influence the response2 only.

Both

Number of overlapping sensitive covariates (influence both the response and response2).

mu1, sigma1, rho1

Mean, sd and correlation for sensitive covariates in sensitive patients.

mu2, sigma2, rho2

Mean, sd and correlation for sensitive covariates in non-sensitive patients.

mu0, sigma0, rho0

Mean, sd and correlation for non-sensitive covariates in all patients.

mu1.resp2, sigma1.resp2, rho1.resp2

Mean, sd and correlation for covariates that influence the response2.

mu1.both, sigma1.both, rho1.both

Mean, sd and correlation for overlapping covariates in sensitive patients.

mu2.both, sigma2.both, rho2.both

Mean, sd and correlation for overlapping covariates in non-sensitive patients.

perc.sp

Percentage of patients with higher probability of response.

rr.nsp.treat

Response rate on the treatment arm in non-resp.sensitive patients.

rr.con

Response rate on the control arm.

rr.sp.treat

Response rate on the treatment arm in resp.sensitive patients.

rr2.nsp.treat

Probability of response2 for the non-resp2.sensitive patients in the treatment arm.

rr2.con

Probability of response2 for the control arm.

rr2.sp.treat

Probability of response2 for the resp2.sensitive patients in the treatment arm.

runs

Number of replicates to simulate.

seed

A seed for the random number generator.

K1

Number of sensitive covariates that influence the response only.

Value

A list of 4 data frames: patients, covar, response, response2.

patients: a data frame with one row per patient and the following columns: FID (family ID), IID (individual ID), sens.resp.true (true sensitivity to response), sens.resp2.true (true sensitivity to response2), cluster.true (1 for sens.resp.true == 1 and sens.resp2.true == 1; 2 for sens.resp.true == 0 and sens.resp2.true == 1; 3 for sens.resp.true == 1 and sens.resp2.true == 0; 4 for sens.resp.true == 0 and sens.resp2.true == 0), sens.true (1 for cluster.true == 1, 0 otherwise), treat (1 for treatment and 0 for control), rr (probability of response), rr2 (probability of response2).

covar: covariate data for L covariates

response: simulated binary variable "response", one column per simulation (number of columns = runs)

response2: simulated binary variable "response2", one column per simulation (number of columns = runs)
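The cluster coding and the one-column-per-run layout described above can be sketched in base R. This is a hedged illustration, not the package's implementation: the toy sensitivity and treatment vectors, the sample size n and the number of runs are hypothetical, and the probabilities 0.8 and 0.25 simply mirror the default values of rr.sp.treat, rr.nsp.treat and rr.con.

```r
set.seed(123)  # arbitrary seed so that the sketch is reproducible

n    <- 8  # hypothetical number of patients
runs <- 3  # hypothetical number of replicates

# Hypothetical true sensitivity indicators and treatment allocation.
sens.resp.true  <- c(1, 0, 1, 0, 1, 0, 1, 0)
sens.resp2.true <- c(1, 1, 0, 0, 1, 1, 0, 0)
treat           <- c(1, 1, 1, 1, 0, 0, 0, 0)

# Cluster coding: 1 = sensitive for both outcomes, 2 = response2 only,
# 3 = response only, 4 = neither.
cluster.true <- ifelse(sens.resp.true == 1 & sens.resp2.true == 1, 1,
                ifelse(sens.resp.true == 0 & sens.resp2.true == 1, 2,
                ifelse(sens.resp.true == 1 & sens.resp2.true == 0, 3, 4)))
sens.true <- as.integer(cluster.true == 1)

# Probability of response: 0.8 for treated resp.sensitive patients,
# 0.25 for everyone else (values mirror the documented defaults).
rr <- ifelse(treat == 1 & sens.resp.true == 1, 0.8, 0.25)

# Binary response: one row per patient, one column per run.
response <- sapply(seq_len(runs), function(r) rbinom(n, size = 1, prob = rr))
```

Here response is an n-by-runs matrix of 0/1 values, which matches the documented shape of the response and response2 components (number of columns = runs).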

Author(s)

Svetlana Cherlin, James Wason

See Also

analyse.simdata2 and cvrs2.plot functions; print and plot methods.

Examples

N = 1000
L = 100
K1 = 10
K2 = 10
Both = 3
rho1 = 0
rho2 = 0
rho0 = 0
mu1 = 1
mu2 = 0
mu0 = 0
sigma1 = 0.5
sigma2 = 0.1
sigma0 = 0.5
rho1.resp2 = 0
rho2.resp2 = 0
mu1.resp2 = 1
mu2.resp2 = 0
sigma1.resp2 = 0.5
sigma2.resp2 = 0.1
mu1.both = 1
mu2.both = 0
sigma1.both = 0.5
sigma2.both = 0.1
rho1.both = 0
rho2.both = 0
perc.sp = 0.1
perc.sp2 = 0.1
perc.sp.both = 0.1
rr.con = 0.25
rr.sp.treat = 0.8
rr2.sp.treat = 0.8
rr.nsp.treat = 0.25
rr2.nsp.treat = 0.25
rr2.con = 0.25
runs = 5
seed = 123
datalist2 = simulate.data2(
  N, L, K1, K2, Both,
  rho1, rho2, rho0, mu1, mu2, mu0, sigma1, sigma2, sigma0,
  rho1.resp2, rho2.resp2, mu1.resp2, mu2.resp2, sigma1.resp2, sigma2.resp2,
  mu1.both, mu2.both, sigma1.both, sigma2.both, rho1.both, rho2.both,
  perc.sp, perc.sp2, perc.sp.both,
  rr.nsp.treat, rr.con, rr.sp.treat, rr2.nsp.treat, rr2.con, rr2.sp.treat,
  runs, seed
)

svetlanache/rapids documentation built on Sept. 15, 2023, 7 a.m.