estSeqMarkovOrd: estSeqMarkovOrd
In harrelfe/Hmisc: Harrell Miscellaneous

estSeqMarkovOrd

R Documentation

estSeqMarkovOrd

Description

Simulate Comparisons For Use in Sequential Markov Longitudinal Clinical Trial Simulations

Usage

estSeqMarkovOrd(
  y,
  times,
  initial,
  absorb = NULL,
  intercepts,
  parameter,
  looks,
  g,
  formula,
  ppo = NULL,
  yprevfactor = TRUE,
  groupContrast = NULL,
  cscov = FALSE,
  timecriterion = NULL,
  coxzph = FALSE,
  sstat = NULL,
  rdsample = NULL,
  maxest = NULL,
  maxvest = NULL,
  nsim = 1,
  progress = FALSE,
  pfile = ""
)

Arguments

`y`	vector of possible y values in order (numeric, character, factor)
`times`	vector of measurement times
`initial`	a vector of probabilities summing to 1.0 that specifies the frequency distribution of initial values to be sampled from. The vector must have names that correspond to values of `y` representing non-absorbing states.
`absorb`	vector of absorbing states, a subset of `y`. The default is no absorbing states. Observations are truncated when an absorbing state is simulated. May be numeric, character, or factor.
`intercepts`	vector of intercepts in the proportional odds model. There must be one fewer of these than the length of `y`.
`parameter`	vector of true parameter (effects; group differences) values. These are group 2:1 log odds ratios in the transition model, conditioning on the previous `y`.
`looks`	integer vector of ID numbers at which maximum likelihood estimates and their estimated variances are computed. For a single look specify a scalar value for `looks` equal to the number of subjects in the sample.
`g`	a user-specified function of three or more arguments which in order are `yprev` - the value of `y` at the previous time, the current time `t`, the `gap` between the previous time and the current time, an optional (usually named) covariate vector `X`, and optional arguments such as a regression coefficient value to simulate from. The function needs to allow `yprev` to be a vector and `yprev` must not include any absorbing states. The `g` function returns the linear predictor for the proportional odds model aside from `intercepts`. The returned value must be a matrix with row names taken from `yprev`. If the model is a proportional odds model, the returned value must be one column. If it is a partial proportional odds model, the value must have one column for each distinct value of the response variable Y after the first one, with the levels of Y used as optional column names. So columns correspond to `intercepts`. The different columns are used for `y`-specific contributions to the linear predictor (aside from `intercepts`) for a partial or constrained partial proportional odds model. Parameters for partial proportional odds effects may be included in the ... arguments.
`formula`	a formula object given to the `lrm()` function using variables with these names: `y`, `time`, `yprev`, and `group` (factor variable having values '1' and '2'). The `yprev` variable is converted to a factor before fitting the model unless `yprevfactor=FALSE`.
`ppo`	a formula specifying the part of `formula` for which proportional odds is not to be assumed, i.e., that specifies a partial proportional odds model. Specifying `ppo` triggers the use of `VGAM::vglm()` instead of `rms::lrm` and will make the simulations run slower.
`yprevfactor`	see `formula`
`groupContrast`	omit this argument if `group` has only one regression coefficient in `formula`. Otherwise if `ppo` is omitted, provide `groupContrast` as a list of two lists that are passed to `rms::contrast.rms()` to compute the contrast of interest and its standard error. The first list corresponds to group 1, the second to group 2, to get a 2:1 contrast. If `ppo` is given and the group effect is not just a simple regression coefficient, specify as `groupContrast` a function of a `vglm` fit that computes the contrast of interest and its standard error and returns a list with elements named `Contrast` and `SE`. For the latter type you can optionally have formal arguments `n1`, `n2`, and `parameter` that are passed to `groupContrast` to compute the standard error of the group contrast, where `n1` and `n2` respectively are the sample sizes for the two groups and `parameter` is the true group effect parameter value.
`cscov`	applies if `ppo` is not used. Set to `TRUE` to use the cluster sandwich covariance estimator of the variance of the group comparison.
`timecriterion`	a function of a time-ordered vector of simulated ordinal responses `y` that returns a vector of `FALSE` or `TRUE` values denoting whether the current `y` level met the condition of interest. For example `estSeqMarkovOrd` will compute the first time at which `y >= 5` if you specify `timecriterion=function(y) y >= 5`. This function is only called at the last data look for each simulated study. To have more control, instead of `timecriterion` returning a logical vector have it return a numeric 2-vector containing, in order, the event/censoring time and the 1/0 event/censoring indicator.
`coxzph`	set to `TRUE` if `timecriterion` is specified and you want to compute a statistic for testing proportional hazards at the last look of each simulated study
`sstat`	set to a function of the time vector and the corresponding vector of ordinal responses for a single group if you want to compute a Wilcoxon test on a derived quantity such as the number of days in a given state.
`rdsample`	an optional function to do response-dependent sampling. It is a function of these arguments, which are vectors that stop at any absorbing state: `times` (ascending measurement times for one subject) and `y` (vector of ordinal outcomes at these times for one subject). The function returns `NULL` if no observations are to be dropped, and otherwise returns the vector of new times to sample.
`maxest`	maximum acceptable absolute value of the contrast estimate, ignored if `NULL`. Any values exceeding `maxest` will result in the estimate being set to `NA`.
`maxvest`	like `maxest` but for the estimated variance of the contrast estimate
`nsim`	number of simulations (default is 1)
`progress`	set to `TRUE` to send the current iteration number to `pfile` every 10 iterations. Each iteration involves multiple simulations if `parameter` has length greater than 1.
`pfile`	file to which to write progress information. Defaults to `''` which is the console. Ignored if `progress=FALSE`.
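
To make the `g`, `initial`, and `intercepts` requirements above concrete, here is a minimal sketch in base R. It assumes a proportional odds transition model with three ordinal states, of which state 3 is absorbing. The coefficient values, the linear time trend, and the assumption that `X` arrives as a named vector with a `group` element are all hypothetical choices made for illustration; they are not taken from the package.

```r
# Hypothetical setup: ordinal states 1 < 2 < 3, with 3 absorbing,
# so yprev can only take the values 1 or 2.
y          <- 1:3
absorb     <- 3
initial    <- c("1" = 0.7, "2" = 0.3)  # names are the non-absorbing states
intercepts <- c(1, -2)                 # one fewer than length(y); values made up

# Linear predictor of the transition model, excluding the intercepts.
# 'parameter' is the group 2:1 log odds ratio.
g <- function(yprev, t, gap, X, parameter = 0) {
  yprev <- as.character(yprev)
  prev_effect <- c("1" = 0, "2" = 1)            # assumed effect of previous state
  in_group2   <- as.numeric(X[["group"]] == 2)  # assumes X has a 'group' element
  lp <- unname(prev_effect[yprev]) - 0.1 * t + parameter * in_group2
  # One column (proportional odds), row names taken from yprev
  matrix(lp, ncol = 1, dimnames = list(yprev, NULL))
}

# Evaluate for both possible previous states in group 2
lp <- g(yprev = c(1, 2), t = 1, gap = 1, X = c(group = 2), parameter = -0.5)
```

A function of this shape could then be supplied as the `g` argument, with the true effects to simulate from given in `parameter`.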

Details

Simulates sequential clinical trials of longitudinal ordinal outcomes using a first-order Markov model. Looks are done sequentially after the subject ID numbers given in the vector `looks`, with the earliest possible look being after subject 2. At each look, a subject's repeated records are either all used or all ignored, depending on the subject's sequential ID number. For each true effect parameter value, each simulation, and each look, a function is run to compute the estimate of the parameter of interest along with its variance. For each simulation, data are first simulated for the last look, and these data are sequentially revealed for earlier looks.

The user provides a function `g` that has extra arguments specifying the true effect `parameter` for the treatment `group`, with treatments expected to be coded 1 and 2. `parameter` is usually on the scale of a regression coefficient, e.g., a log odds ratio. Fitting is done using the `rms::lrm()` function, unless non-proportional odds is allowed, in which case `VGAM::vglm()` is used.

If `timecriterion` is specified, the function also, for the last data look only, computes the first time at which the criterion is satisfied for the subject, or uses the event time and event/censoring indicator computed by `timecriterion`. The Cox/logrank chi-square statistic for comparing groups on the derived time variable is saved. If `coxzph=TRUE`, the `survival` package correlation coefficient `rho` from the scaled partial residuals is also saved so that the user can later determine to what extent the Markov model resulted in the proportional hazards assumption being violated when analyzing on the time scale.

`vglm` is accelerated by saving the first successful fit for the largest sample size and using its coefficients as starting values for further `vglm` fits for any sample size for the same setting of `parameter`.
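
The sequential revealing of data and the derived event time described above can be sketched in base R as follows. This is an illustration of the idea only, not the package's internal code: the toy data, the helper `first_time`, and the rule of censoring at the last observed time when the criterion is never met are assumptions made for this example.

```r
# Toy long-format data for the final look: one row per subject per time.
# All values are made up for illustration.
set.seed(1)
n     <- 6
times <- 1:4
d <- data.frame(id   = rep(1:n, each = length(times)),
                time = rep(times, times = n),
                y    = sample(1:6, n * length(times), replace = TRUE))

looks <- c(2, 4, 6)

# Reveal data sequentially: at each look use all records of subjects
# with id <= look and none of the records of later subjects.
revealed      <- lapply(looks, function(k) d[d$id <= k, ])
rows_per_look <- sapply(revealed, nrow)

# Last look only: first time at which the criterion is met, per subject.
timecriterion <- function(y) y >= 5
first_time <- function(time, y) {
  met <- timecriterion(y)
  if (any(met)) {
    c(etime = time[which(met)[1]], event = 1)
  } else {
    c(etime = max(time), event = 0)  # assumed censoring rule
  }
}
last   <- revealed[[length(looks)]]
etimes <- t(sapply(split(last, last$id),
                   function(s) first_time(s$time, s$y)))
```

Here `etimes` has one row per subject with the derived time and the 1/0 event indicator, which is the kind of input a Cox model comparison of groups would need.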

Value

A data frame with number of rows equal to the product of `nsim`, the length of `looks`, and the length of `parameter`, with variables `sim`, `parameter`, `look`, `est` (log odds ratio for group), and `vest` (the variance of the latter). If `timecriterion` is specified the data frame also contains `loghr` (Cox log hazard ratio for group), `lrchisq` (chi-square from Cox test for group), and if `coxzph=TRUE`, `phchisq`, the chi-square for testing proportional hazards. The attribute `etimefreq` is also present if `timecriterion` is present, and it provides the frequency distribution of derived event times by group and censoring/event indicator. If `sstat` is given, the attribute `sstat` is also present, and it contains an array with dimensions corresponding to simulations, parameter values within simulations, `id`, and a two-column subarray with columns `group` and `y`, the latter being the summary measure computed by the `sstat` function. The returned data frame also has attribute `lrmcoef`, which contains the last-look logistic regression coefficient estimates over the `nsim` simulations and the parameter settings, and an attribute `failures`, which is a data frame containing the variables `reason` and `frequency` cataloging the reasons for unsuccessful model fits.
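
One possible way to summarize a result with the columns described above is sketched below. The data frame `res` is a mock stand-in built with random numbers, not real output of `estSeqMarkovOrd`, and the fixed two-sided 0.05 threshold applied at every look is an assumption for illustration that ignores any adjustment for multiple looks.

```r
# Mock stand-in for the returned data frame: one row per
# simulation x parameter value x look. All numbers are made up.
set.seed(2)
res <- expand.grid(sim = 1:200, parameter = c(0, -0.5), look = c(50, 100))
res$vest <- 8 / res$look                              # assumed variance
res$est  <- rnorm(nrow(res), mean = res$parameter, sd = sqrt(res$vest))

# Wald z statistic for the group contrast at each look
res$z <- res$est / sqrt(res$vest)

# Proportion of simulations exceeding a fixed threshold,
# by true parameter value and look (rows with NA estimates are dropped)
res$reject <- abs(res$z) > qnorm(0.975)
power <- aggregate(reject ~ parameter + look, data = res, FUN = mean)
```

With `parameter = 0` the proportion estimates a per-look type I error rate, and with a nonzero `parameter` it estimates per-look power under the stated threshold.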

Author(s)

Frank Harrell
