This version of the permutational algorithm generates a dataset in which event and censoring times are conditional on a user-specified list of covariates, some or all of which are time-dependent. Event times and censoring times also follow user-specified distributions.

```
permalgorithm(numSubjects, maxTime, Xmat, XmatNames = NULL,
  eventRandom = NULL, censorRandom = NULL, betas, groupByD = FALSE)
```

`numSubjects` |
is the number of subjects generated. |

`maxTime` |
is a non-zero integer representing the maximum length of follow-up. |

`Xmat` |
is the matrix of covariate values in a counting process
format where every line represents one and only one time interval,
during which all covariate values for a given subject remain
constant. Consequently, |

`XmatNames` |
is an optional vector of character strings representing
the names of each of the covariates in |

`eventRandom` |
represents individual event times. |

`censorRandom` |
represents individual censoring
times. |

`betas` |
is a vector of regression coefficients (log hazard) that represent the
magnitude of the relationship between each of the covariates and the
risk of an event. The length of |

`groupByD` |
is an option that, when enabled, increases the
computational efficiency of the algorithm by replacing the individual
assignment of event times and censoring times by grouped
assignments. The side effect of this option is that it generates
datasets that are, on average, slightly less consistent with the model
described by |

The gist of the algorithm is to perform a one-to-one matching of `n`
observed times with independently generated vectors of
covariate values. The matching is performed based on a permutation
probability law derived from the partial likelihood of Cox's
Proportional Hazards (PH) model.
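
The matching idea can be illustrated with a minimal base R sketch. This is not the package's internal code: the helper `match_times` and its arguments are hypothetical, and for simplicity it assumes time-fixed covariates held in a matrix `x` with one row per subject. Observed times are processed in increasing order. For an event, a subject is drawn from those not yet matched with probability proportional to `exp(x %*% betas)`, as in the Cox partial likelihood. For a censoring time, a subject is drawn uniformly at random. With time-dependent covariates, the relative hazard would instead be evaluated at each observed time.

```r
# Hypothetical helper: match sorted observed times to subjects one at a time.
match_times <- function(times, events, x, betas) {
  n <- length(times)
  ord <- order(times)
  risk <- exp(drop(x %*% betas))  # relative hazard of each subject
  available <- rep(TRUE, n)
  assigned <- integer(n)          # assigned[k] = subject matched to k-th ordered time
  for (k in seq_len(n)) {
    idx <- which(available)
    if (length(idx) == 1L) {
      pick <- idx                 # avoid sample()'s special case for a single number
    } else if (events[ord[k]] == 1) {
      pick <- sample(idx, 1, prob = risk[idx])  # event: weighted by relative hazard
    } else {
      pick <- sample(idx, 1)                    # censoring: uniform over the risk set
    }
    assigned[k] <- pick
    available[pick] <- FALSE
  }
  data.frame(Id = assigned, Fup = times[ord], Event = events[ord])
}

# Toy usage with one binary covariate and a hazard ratio of 2 (illustrative values).
set.seed(1)
x <- cbind(sex = rbinom(6, 1, 0.5))
res <- match_times(times = c(5, 2, 9, 4, 7, 3),
                   events = c(1, 0, 1, 1, 0, 1),
                   x = x, betas = log(2))
```

Each subject appears exactly once in `res`, so the result is a permutation of the covariate vectors across the observed times.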

The number of events obtained in the data.frame returned by the function
depends on both the distribution of event times (`eventRandom`) and the
distribution of censoring times (`censorRandom`). In the simplest case, where
the distribution of `eventRandom` is Uniform over follow-up, U[1,m], and
the censoring is random, the number of observed events in the data.frame
returned by the algorithm is determined by the upper bound of the Uniform
distribution of `censorRandom`. For example, setting the
distribution of `censorRandom` to U[1,m] leads approximately
half of the subjects to experience an event during follow-up, while
setting the distribution of `censorRandom` to U[1,3m/2] leads
approximately two thirds of the observed times to be events.
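
These proportions can be checked with a quick Monte Carlo sketch in base R. The values of `n` and `m` are illustrative assumptions, and the upper bound of the second censoring distribution is taken to be 3m/2, the value for which the expected proportion of events is about two thirds. The sketch only compares raw event and censoring times; it does not call `permalgorithm`.

```r
set.seed(42)
n <- 100000  # hypothetical number of subjects
m <- 365     # hypothetical length of follow-up

event_time <- runif(n, 1, m)           # event times ~ U[1, m]

censor_same <- runif(n, 1, m)          # censoring ~ U[1, m]
censor_long <- runif(n, 1, 3 * m / 2)  # censoring ~ U[1, 3m/2]

# Proportion of subjects whose event occurs before censoring
prop_same <- mean(event_time <= censor_same)  # close to 1/2
prop_long <- mean(event_time <= censor_long)  # close to 2/3
```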

Subjects without an event before or on `maxTime` and who are not
censored before `maxTime` are censored on `maxTime`
(administrative censoring).
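
The rule can be sketched in base R as follows. The vectors are made-up values, and counting a tie between event and censoring time as an event is an assumption of this sketch, not a documented behaviour of the package.

```r
m <- 10                          # stands in for maxTime
event_time  <- c(3, 12, 8, 15)   # illustrative event times
censor_time <- c(5, 20, 6, 11)   # illustrative censoring times

# Follow-up ends at the earliest of event, censoring, or maxTime
fup <- pmin(event_time, censor_time, m)

# An event is observed only if it precedes censoring and occurs by maxTime
event <- as.integer(event_time <= censor_time & event_time <= m)
```

Here subjects 2 and 4 have neither an event nor a censoring time within follow-up, so both are administratively censored at `m`.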

*** Warning *** Currently the algorithm only takes Xmat in matrix format. Consequently, factor variables are not allowed. Instead, users need to code them with binary indicators.
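
One possible way to recode a factor into binary indicators is `model.matrix` from base R. The sketch below uses a hypothetical three-level factor `treatment` in a long-format data frame with `m` rows per subject; the names and values are illustrative only.

```r
n <- 4  # hypothetical number of subjects
m <- 3  # hypothetical number of time intervals per subject

long <- data.frame(
  id = rep(seq_len(n), each = m),
  treatment = factor(rep(c("A", "B", "C", "A"), each = m))
)

# Build 0/1 indicator columns and drop the intercept; level "A" is the reference
Xmat <- model.matrix(~ treatment, data = long)[, -1, drop = FALSE]
```

The resulting numeric matrix has one indicator column per non-reference level (`treatmentB`, `treatmentC`), and `betas` would then need one coefficient for each of those columns.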

A data.frame object with columns corresponding to

`Id` |
Identifies the rows of the data.frame that correspond to
each of the |

`Event` |
Indicator of event. |

`Fup` |
Individual follow-up time. |

`Start` |
For counting process formulation. Represents the start of each time interval. |

`Stop` |
For counting process formulation. Represents the end of each time interval. |

`Xmat` |
The values of the covariates specified in Xmat. |

Marie-Pierre Sylvestre, Thad Evans, Todd MacKenzie, Michal Abrahamowicz

This algorithm is an extension of the permutational algorithm first introduced by Abrahamowicz, MacKenzie and Esdaile, and described in detail by MacKenzie and Abrahamowicz. The current version of the permutational algorithm is a flexible tool to generate event and censoring times that follow user-specified distributions and that are conditional on user-specified covariates. This is especially useful whenever at least one of the covariates is time-dependent, so that conventional inversion methods are difficult to implement.

The algorithm has been validated through simulations in Sylvestre and Abrahamowicz. Please reference the manuscript by Sylvestre and Abrahamowicz, cited below, if this program is used in any published material.

Sylvestre M.-P., Abrahamowicz M. (2008) Comparison of algorithms to generate
event times conditional on time-dependent covariates. *Statistics in
Medicine* **27(14)**:2618–34.

Abrahamowicz M., MacKenzie T., Esdaile J.M. (1996) Time-dependent hazard ratio:
modelling and hypothesis testing with application in lupus nephritis.
*JASA* **91**:1432–9.

MacKenzie T., Abrahamowicz M. (2002) Marginal and hazard ratio specific random
data generation: Applications to semi-parametric bootstrapping.
*Statistics and Computing* **12(3)**:245–252.

```
library(PermAlgo)

# Example 1 - Generating adverse event conditional on use
# of prescription drugs
# Prepare the matrix of covariates (Xmat)
# Here we simulate daily exposures to 2 prescription drugs over a
# year. Drug prescriptions can start any day of follow-up, and their
# duration is a multiple of 7 days. There can be multiple prescriptions
# for each individual over the year and interruptions of drug use in
# between.
# Additionally, there is a time-independent binary covariate (sex).
n=500 # subjects
m=365 # days
# Generate the matrix of three covariates, in a 'long' format.
Xmat=matrix(ncol=3, nrow=n*m)
# time-independent binary covariate
Xmat[,1] <- rep(rbinom(n, 1, 0.3), each=m)
# Function to generate an individual time-dependent exposure history
# e.g. generate prescriptions of different durations and doses.
TDhist <- function(m){
  start <- round(runif(1,1,m),0) # individual start date
  duration <- 7 + 7*rpois(1,3) # in days (whole weeks)
  dose <- round(runif(1,0,10),1)
  vec <- c(rep(0, start-1), rep(dose, duration))
  while (length(vec)<=m){
    intermission <- 21 + 7*rpois(1,3) # in days (whole weeks)
    duration <- 7 + 7*rpois(1,3) # in days (whole weeks)
    dose <- round(runif(1,0,10),1)
    vec <- append(vec, c(rep(0, intermission), rep(dose, duration)))}
  return(vec[1:m])}
# create TD var
Xmat[,2] <- do.call("c", lapply(1:n, function(i) TDhist(m)))
Xmat[,3] <- do.call("c", lapply(1:n, function(i) TDhist(m)))
# generate vectors of event and censoring times prior to calling the
# function for the algorithm
eventRandom <- round(rexp(n, 0.012)+1,0)
censorRandom <- round(runif(n, 1,870),0)
# Generate the survival data conditional on the three covariates
data <- permalgorithm(n, m, Xmat, XmatNames=c("sex", "Drug1", "Drug2"),
  eventRandom = eventRandom, censorRandom=censorRandom, betas=c(log(2),
  log(1.04), log(0.99)), groupByD=FALSE )
# could use survival library and check whether the data was generated
# properly using coxph(Surv(Start, Stop, Event) ~ sex + Drug1 + Drug2,
# data)
# Example 2 - Generating Myocardial Infarction (MI) conditional on
# biennial measures of systolic blood pressure (like in the
# Framingham data).
m = 16 # exams
n <- 10000 # individuals
# Very crude way to generate the data, meant as an example only!
sysBP <- rnorm(n*m, 120, 15)
# by not submitting event and censoring times, one lets the algorithm
# generate them from uniform distributions over the follow-up time.
data2 <- permalgorithm(n, m, sysBP, XmatNames="sysBP", betas=log(1.01),
  groupByD=FALSE )
```
