datasim: Simulate data including multiple outcomes from error-prone...
In icensmis: Study Design and Data Analysis in the Presence of Error-Prone Diagnostic Tests and Self-Reported Outcomes


View source: R/datasim.R

This function simulates a data set of N subjects with misclassified outcomes, assuming each subject receives a sequence of pre-scheduled tests for disease status ascertainment. Each test is subject to error, characterized by its sensitivity and specificity. The time to the event of interest is assumed to follow an exponential distribution. Three covariate settings can be generated: the one sample setting, the two group setting, and the continuous covariates setting, in which each covariate is sampled i.i.d. from N(0, 1). Two missing data mechanisms are available, MCAR and NTFP. Under MCAR, each test is missing with a constant, independent probability. The NTFP mechanism combines two types of missingness: (1) each test prior to the first positive test result is missing with a constant, independent probability; and (2) all test results after the first positive are missing. The simulated data are in longitudinal form with one row per test time.
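The two missing data mechanisms can be illustrated for a single subject in base R. This is only a sketch of one reading of the description above, not the package's internal code: all variable names are hypothetical, and it assumes the "first positive" under NTFP is the first positive among the tests that were actually observed.

```r
set.seed(1)

# Hypothetical inputs for one subject (names chosen for illustration only)
testtimes   <- 1:8
sensitivity <- 0.7
specificity <- 0.98
blambda     <- 0.05
pmiss       <- 0.3

# True event time from an exponential distribution with rate blambda
event_time <- rexp(1, rate = blambda)

# True disease status at each visit, then the error-prone test result
diseased <- testtimes >= event_time
p_pos    <- ifelse(diseased, sensitivity, 1 - specificity)
result   <- rbinom(length(testtimes), size = 1, prob = p_pos)

# MCAR: each test is dropped independently with probability pmiss
keep_mcar <- runif(length(testtimes)) >= pmiss

# NTFP (assumed reading): the same random dropping, and in addition
# every test after the first observed positive result is missing
observed  <- which(keep_mcar)
pos_obs   <- observed[result[observed] == 1]
last_kept <- if (length(pos_obs) > 0) pos_obs[1] else length(testtimes)
keep_ntfp <- keep_mcar & seq_along(testtimes) <= last_kept

# Longitudinal form: one row per observed test time
mcar <- data.frame(testtime = testtimes[keep_mcar], result = result[keep_mcar])
ntfp <- data.frame(testtime = testtimes[keep_ntfp], result = result[keep_ntfp])
```

Under this sketch the NTFP data are always a subset of the MCAR data and contain at most one positive result, which is necessarily the last row.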

By default, covariate values are assumed to be constant over time. However, this function can also simulate a special case of time-varying covariates. Under the time-varying setting, each subject has a change time point, sampled from the visit times, and two sets of covariate values. Before the change time point, the covariates take their values from the first set; after it, they take their values from the second set. Each subject's survival time therefore follows a two-piece exponential distribution with a different hazard rate on each piece.
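A two-piece exponential survival time can be drawn by inverting the cumulative hazard. The sketch below assumes a proportional hazards form, hazard = blambda * exp(sum(betas * x)), which is consistent with betas being log hazard ratios but is not taken from the package source. The helper `r_two_piece` and all numeric values are hypothetical.

```r
set.seed(2)

# Hypothetical values for one subject (illustrative, not package internals)
blambda     <- 0.05
betas       <- c(0.5, 0.8)
x_before    <- c(0.2, -1.0)   # covariate values before the change point
x_after     <- c(1.1,  0.4)   # covariate values after the change point
change_time <- 3              # in datasim this is sampled from the visit times

# Hazard rate on each piece (assumed proportional hazards form)
h1 <- blambda * exp(sum(betas * x_before))
h2 <- blambda * exp(sum(betas * x_after))

# Draw e ~ Exp(1) and solve H(t) = e, where
# H(t) = h1 * t                                   for t <= change_time
# H(t) = h1 * change_time + h2 * (t - change_time) otherwise
r_two_piece <- function(n, h1, h2, change_time) {
  e <- rexp(n)
  ifelse(e <= h1 * change_time,
         e / h1,
         change_time + (e - h1 * change_time) / h2)
}

times <- r_two_piece(1e5, h1, h2, change_time)

# Compare the empirical survival at the change point with exp(-H(t))
c(empirical   = mean(times > change_time),
  theoretical = exp(-h1 * change_time))
```

The two printed values should agree up to Monte Carlo error, which is a quick way to confirm that the inversion matches the intended piecewise hazard.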

datasim(
  N,
  blambda,
  testtimes,
  sensitivity,
  specificity,
  betas = NULL,
  twogroup = NULL,
  pmiss = 0,
  pcensor = 0,
  design = "MCAR",
  negpred = 1,
  time.varying = F
)

`N`	total number of subjects to be simulated
`blambda`	baseline hazard rate
`testtimes`	a vector of pre-scheduled test times
`sensitivity`	the sensitivity of test
`specificity`	the specificity of test
`betas`	a vector of regression coefficients of the same length as the covariate vector. If betas = NULL then the simulated dataset corresponds to the one sample setting. If betas != NULL and twogroup != NULL then the simulated dataset corresponds to the two group setting, and the first value of betas is used as the coefficient for the treatment group indicator. If betas != NULL and twogroup = NULL, then the covariates are ~ i.i.d. N(0, 1), and the number of covariates is determined by the length of betas.
`twogroup`	the proportion of subjects allocated to the baseline (reference) group in the two group setting; it should be between 0 and 1. When betas != NULL, set twogroup to this proportion to obtain a simulated dataset corresponding to the two group setting. Otherwise, set twogroup = NULL to obtain either the one sample setting (betas = NULL) or the continuous covariates setting (betas != NULL).
`pmiss`	a value or a vector (must have same length as testtimes) of the probabilities of each test being randomly missing at each test time. If pmiss is a single value, then each test is assumed to have an identical probability of missingness.
`pcensor`	a value or a vector (must have the same length as testtimes) of the probabilities that the censoring time falls in each interval, assuming the censoring process is independent of the other missing mechanisms. If it is a single value, the same probability is used for every interval. The sum of pcensor (or pcensor * length(testtimes) if it is a single value) must be <= 1. For example, pcensor = c(0.1, 0.2) means that the probabilities of being censored in the first and second intervals are 0.1 and 0.2, and the probability of not being censored is 0.7.
`design`	missing mechanism: "MCAR" or "NTFP"
`negpred`	baseline negative predictive value, i.e. the probability of being truly disease free for those who were tested (reported) as disease free at baseline. If the baseline screening test is perfect, then negpred = 1.
`time.varying`	logical indicator of whether to use the time-varying covariate setting
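The constraint on pcensor can be checked by hand before calling datasim. The following base R sketch reproduces the arithmetic of the pcensor example; the variable names and the recycling of a single value are illustrative assumptions based on the argument description, not package code.

```r
testtimes <- 1:2
pcensor   <- c(0.1, 0.2)

# A single value is treated as the same probability for every interval
if (length(pcensor) == 1) pcensor <- rep(pcensor, length(testtimes))

# The interval probabilities must match testtimes and sum to at most 1
stopifnot(length(pcensor) == length(testtimes), sum(pcensor) <= 1)

# Probability of censoring in each interval, plus never being censored
probs <- c(pcensor, 1 - sum(pcensor))
names(probs) <- c(paste0("interval_", seq_along(testtimes)), "not_censored")
probs
```

The last element of `probs` is the probability of not being censored, which is 1 minus the sum of the interval probabilities.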

To simulate data for the one sample setting, set betas to NULL. To simulate data for the two group setting, set twogroup to the proportion of subjects in the baseline group and set betas to the coefficient of the treatment group indicator (i.e. beta equals the log hazard ratio between the two groups). To simulate data with continuous i.i.d. N(0, 1) covariates, set twogroup to NULL and set betas to the vector of covariate coefficients.
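The way betas and twogroup select among the three settings can be sketched in base R. The helper `sim_covariates` below is hypothetical and only mirrors the rules stated above. It assumes group membership is assigned at random and that the hazard takes the proportional form blambda * exp(X %*% betas); the package may implement either detail differently.

```r
set.seed(3)

# Hypothetical helper mirroring the three covariate settings
sim_covariates <- function(N, betas = NULL, twogroup = NULL) {
  if (is.null(betas)) {
    return(NULL)                      # one sample setting: no covariates
  }
  if (!is.null(twogroup)) {
    # two group setting: twogroup is the proportion in the baseline (0) group
    group <- rbinom(N, size = 1, prob = 1 - twogroup)
    return(matrix(group, ncol = 1, dimnames = list(NULL, "group")))
  }
  # continuous setting: one i.i.d. N(0, 1) column per coefficient
  matrix(rnorm(N * length(betas)), nrow = N)
}

blambda <- 0.05
betas   <- 0.7
X       <- sim_covariates(1000, betas = betas, twogroup = 0.5)

# Subject-specific hazard; exp(betas) is the hazard ratio between groups
hazard <- blambda * exp(drop(X %*% betas))
unique(round(hazard / blambda, 4))
```

In the two group case the hazard takes only two values, and their ratio is exp(betas), which is why betas is interpreted as the log hazard ratio.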

A simulated data frame in longitudinal form, with one row per test time.

library(icensmis)

## One sample setting
simdata1 <- datasim(N = 1000, blambda = 0.05, testtimes = 1:8, sensitivity = 0.7,
  specificity = 0.98, betas = NULL, twogroup = NULL, pmiss = 0.3, design = "MCAR")

## Two group setting, and the two groups have same sample sizes
simdata2 <- datasim(N = 1000, blambda = 0.05, testtimes = 1:8, sensitivity = 0.7,
  specificity = 0.98, betas = 0.7, twogroup = 0.5, pmiss = 0.3, design = "MCAR")
  
## Three covariates with coefficients 0.5, 0.8, and 1.0
simdata3 <- datasim(N = 1000, blambda = 0.05, testtimes = 1:8, sensitivity = 0.7,
  specificity = 0.98, betas = c(0.5, 0.8, 1.0), twogroup = NULL, pmiss = 0.3,
  design = "MCAR", negpred = 1)

## NTFP missing mechanism
simdata4 <- datasim(N = 1000, blambda = 0.05, testtimes = 1:8, sensitivity = 0.7,
  specificity = 0.98, betas = c(0.5, 0.8, 1.0), twogroup = NULL, pmiss = 0.3,
  design = "NTFP", negpred = 1)	 

## Baseline misclassification
simdata5 <- datasim(N = 2000, blambda = 0.05, testtimes = 1:8, sensitivity = 0.7,
  specificity = 0.98, betas = c(0.5, 0.8, 1.0), twogroup = NULL, pmiss = 0.3, 
  design = "MCAR", negpred = 0.97)  
  
## Time varying covariates
simdata6 <- datasim(N = 1000, blambda = 0.05, testtimes = 1:8, sensitivity = 0.7,
  specificity = 0.98, betas = c(0.5, 0.8, 1.0), twogroup = NULL, pmiss = 0.3,
  design = "MCAR", negpred = 1, time.varying = TRUE)

icensmis documentation built on Sept. 5, 2021, 5:49 p.m.
