simtdc: Simulate survival data for time-dependent covariates

View source: R/pcoxsim.R

simtdc    R Documentation

Simulate survival data for time-dependent covariates

Description

This function uses the permutation algorithm to generate a dataset from a user-specified list of covariates, some of which can be time-dependent. The user can also specify the distributions of the event and censoring times.

Usage

simtdc(
  nSubjects = 100,
  maxTime = 365,
  pfixed = 2,
  ptdc = 2,
  pbin = NULL,
  betas = NULL,
  tdcmat = NULL,
  xmat = NULL,
  rho = 0,
  eventRandom = NULL,
  rate = 0.012,
  censorRandom = NULL,
  groupByD = FALSE,
  x = FALSE
)

genX(
  nSubjects = 100,
  maxTime = 365,
  pfixed = 2,
  ptdc = 2,
  pbin = NULL,
  tdcmat = NULL,
  rho = 1
)

Arguments

nSubjects

number of subjects to simulate, default is 100.

maxTime

a positive integer specifying the maximum length of follow-up time, default is 365.

pfixed

number of time-independent (fixed) covariates. They are randomly drawn from a normal distribution with mean = 0 and sd = 1. The values are replicated for each subject up to the respective follow-up time.

ptdc

number of time-dependent covariates. By default, these are drawn from a normal distribution with mean = 0 and sd = 1, but the user can bypass this option and specify a matrix of time-dependent covariates via tdcmat.

pbin

optional. Number of binary covariates. These are treated as fixed covariates and are drawn from a binomial distribution with p = 0.5.

betas

a vector of 'true' effect sizes (regression coefficients) representing the magnitude of the relationship between each covariate and the risk of the event. If NULL, the algorithm draws values from a uniform distribution and converts them to the log hazard scale, i.e., log(runif((pfixed+ptdc+pbin), 0, 2)). The length of betas must equal the total number of covariates to generate.

tdcmat

a user-supplied matrix of time-dependent covariates with nSubjects*maxTime rows. If specified, ptdc is ignored. This is useful when the time-dependent covariates come from a mechanistic simulation.

xmat

a user-supplied matrix of all the covariates with nSubjects*maxTime rows. If specified, all the previous specifications for the number of covariates and tdcmat are ignored. This is useful when all the covariates come from a mechanistic simulation or when specific distributional assumptions are required.

rho

specify the pairwise correlation between the time-independent covariates. The default rho = 0 means no pairwise correlation between the covariates.

eventRandom

a vector of non-negative integers of length nSubjects representing the subjects' event times, or a random generating function with an n option. If NULL, the algorithm generates nSubjects random deviates from an exponential distribution with rate = rate. See the rate option.

rate

the rate for the exponential random deviates for eventRandom.

censorRandom

a vector of non-negative integers of length nSubjects representing the subjects' censoring times, or a random generating function with an n option. If NULL, the algorithm generates nSubjects random numbers from a uniform distribution, i.e., runif(nSubjects, 1, maxTime).

groupByD

see permalgorithm.

x

logical. Whether to return the matrix of generated covariates in addition to the entire dataset.
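
The user-supplied inputs described above (betas, tdcmat, eventRandom and censorRandom) can be built in base R. The following is a minimal sketch, not the package's internal code. The names event_fun and censor_fun are hypothetical, the random-walk covariates are purely illustrative, and the row order of tdcmat (all maxTime rows of subject 1, then subject 2, and so on) is an assumption that should be checked against permalgorithm.

```r
set.seed(1)

nSubjects <- 5
maxTime   <- 10
pfixed    <- 2
ptdc      <- 2
pbin      <- 1

# betas: one coefficient per covariate, drawn the way the `betas`
# argument describes its default (uniform on (0, 2), then logged).
p_total <- pfixed + ptdc + pbin
betas   <- log(runif(p_total, 0, 2))

# tdcmat: an illustrative "mechanistic" covariate process, here a
# random walk per subject. Each subject contributes maxTime rows, so
# the result has nSubjects * maxTime rows and ptdc columns.
# ASSUMPTION: rows are grouped by subject.
tdcmat <- do.call(rbind, lapply(seq_len(nSubjects), function(i) {
  steps <- matrix(rnorm(maxTime * ptdc), nrow = maxTime, ncol = ptdc)
  apply(steps, 2, cumsum)
}))

# Event and censoring times as generating functions of n that return
# integer-valued times (hypothetical helper names).
event_fun  <- function(n) ceiling(rexp(n, rate = 0.012))
censor_fun <- function(n) round(runif(n, min = 1, max = maxTime))

event_times <- event_fun(nSubjects)

# With pcoxtime installed, these could then be passed on, e.g.:
# sim <- simtdc(nSubjects = nSubjects, maxTime = maxTime,
#               pfixed = pfixed, ptdc = ptdc, pbin = pbin,
#               betas = betas, tdcmat = tdcmat,
#               eventRandom = event_fun, censorRandom = censor_fun)
```

The call to simtdc is left commented out so that the sketch runs without the package; only the shapes of the inputs are illustrated here.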

Details

This function is a wrapper to the permutation algorithm implemented in permalgorithm. The user can fix a positive pairwise correlation between each pair of time-independent covariates by specifying 0 < rho <= 1.
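
To illustrate what a common pairwise correlation rho means, the sketch below draws standard normal covariates with an equicorrelated covariance matrix using a Cholesky factor. This is one standard construction and is only an assumption about how such covariates could be produced; it is not necessarily what genX does internally. Because chol() needs a positive definite matrix, the sketch only covers rho < 1.

```r
set.seed(42)

n   <- 2000   # number of draws (subjects)
p   <- 3      # number of time-independent covariates
rho <- 0.6    # common pairwise correlation, 0 < rho < 1 here

# Equicorrelation matrix: 1 on the diagonal, rho elsewhere.
Sigma <- matrix(rho, nrow = p, ncol = p)
diag(Sigma) <- 1

# chol() returns the upper-triangular R with t(R) %*% R == Sigma,
# so independent N(0, 1) columns Z give cov(Z %*% R) = Sigma.
Z <- matrix(rnorm(n * p), nrow = n, ncol = p)
X <- Z %*% chol(Sigma)

# Empirical correlations should be close to rho off the diagonal.
emp_cor <- cor(X)
round(emp_cor, 2)
```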

Value

a list of dataset, betas (and matrix of covariates). The covariates have a suffix depending on their type, xbin* for binary, xtf* for time-independent (fixed) and xtd* for time-dependent covariates.

  • data: simulated data.frame with the following columns

    • Id: subject id. Identifies each of the nSubjects individuals

    • Event: event indicator. Event = 1 if the event occurs, otherwise 0

    • Fup: individual maximum follow-up time

    • Start: start of each time interval

    • Stop: end of each time interval

    • x*: all generated covariates

  • betas: named vector of coefficients, as specified in the function call or, if not specified, internally generated

  • xmat: if x = TRUE, matrix of covariates

References

Sylvestre M.-P., Abrahamowicz M. (2008) Comparison of algorithms to generate event times conditional on time-dependent covariates. Statistics in Medicine 27(14):2618–34

See Also

permalgorithm.

Examples


## Not run: 
	library(pcoxtime)   # provides simtdc() and pcoxtheme()
	library(PermAlgo)
	library(survival)
	library(ggplot2)
	pcoxtheme()

	set.seed(123407)
	# Simulate with default values
	df <- simtdc()
	head(df$data)
	# Repeat the simulation to check the stability of the estimates
	nrep <- 500
	betas <- log(runif(6, 0, 2))
	beta_list <- vector("list", nrep)
	true_list <- vector("list", nrep)
	for (i in seq_len(nrep)){
		sim <- simtdc(pfixed = 3, ptdc = 2, pbin = 1, betas = betas)
		df <- sim$data
		# Drop the id and follow-up columns so that `.` expands to covariates only
		vnames <- colnames(df)[!colnames(df) %in% c("Id", "Fup")]
		df <- df[, vnames]
		# Estimate coefficients using coxph
		mod <- coxph(Surv(Start, Stop, Event) ~ ., df)
		beta_list[[i]] <- coef(mod)
		true_list[[i]] <- sim$betas
	}
	beta_df <- data.frame(do.call("rbind", beta_list))
	beta_df <- stack(beta_df)

	true_df <- data.frame(ind = names(true_list[[1]]), values = true_list[[1]])
	p1 <- (ggplot(beta_df, aes(x = values))
		+ geom_histogram(alpha = 0.3)
		+ geom_vline(data = true_df, aes(xintercept = values), col = "blue")
		+ facet_wrap(~ind, scales = "free")
		+ labs(x = "Beta estimate", y = "")
	)
	print(p1)

## End(Not run)


pcoxtime documentation built on May 13, 2022, 1:05 a.m.
