# survplot: Plot and get survival data from a multi-state model

In contefranz/msmtools: Building Augmented Data to Run Multi-State Models with 'msm' Package

## Description

Plot a Kaplan-Meier curve and compare it with the fitted survival probability computed from a `msm` model. The function also quickly builds and returns the associated datasets.

## Usage

```r
survplot(x, from = 1, to = NULL, range = NULL, covariates = "mean",
  exacttimes = TRUE, times, grid = 100L, km = FALSE, return.all = FALSE,
  return.km = NULL, return.p = NULL, convert = FALSE, add = FALSE,
  ci = c("none", "normal", "bootstrap"), interp = c("start", "midpoint"),
  B = 100L, legend.pos = "topright", xlab = "Time",
  ylab = "Survival Probability", main = NULL, lty.fit = 1, lwd.fit = 1,
  col.fit = "red", lty.ci.fit = 3, lwd.ci.fit = 1, col.ci.fit = col.fit,
  mark.time = FALSE, lty.km = 5, lwd.km = 1, col.km = "darkblue",
  do.plot = TRUE, plot.width = 7, plot.height = 7, devnew = TRUE,
  verbose = TRUE)
```

## Arguments

| Argument | Description |
| --- | --- |
| `x` | A `msm` object. |
| `from` | State from which to compute the estimated survival. Defaults to state 1. |
| `to` | The absorbing state to which the estimated survival is computed. Defaults to the highest state found by `absorbing.msm`. |
| `range` | A numeric vector of two elements giving the time range of the plot. |
| `covariates` | Covariate values for which to evaluate the expected probabilities. These can be: the string `"mean"`, denoting the means of the covariates in the data (the default); the number 0, indicating that all the covariates should be set to zero; or a list of values, with optional names. For example, `list(75, 1)`, where the order of the list follows the order of the covariates originally given in the model formula, or a named list such as `list(age = 75, gender = "M")`. |
| `exacttimes` | If `TRUE` (default), then transition times are known and exact. This is inherited from `msm` and should be set the same way. |
| `times` | An optional numeric vector giving the times at which to compute the fitted survival. |
| `grid` | An integer giving the number of points at which to compute the fitted survival (see 'Details'). If `times` is passed, `grid` is ignored. Default is 100 points. |
| `km` | If `TRUE`, then the Kaplan-Meier curve is shown. Default is `FALSE`. |
| `return.all` | If `TRUE`, then all the datasets used to draw the plot are returned to the environment. This saves some typing, since you do not have to pass either `return.km` or `return.p`. Default is `FALSE` (see 'Details'). |
| `return.km` | If `TRUE`, then the dataset used for building the Kaplan-Meier is returned as an object of class `data.table`, unless `convert` is set to `TRUE` (see `convert`). Default is `FALSE`. `survplot` must be assigned to an object in order to get the data in the environment (see 'Details'). |
| `return.p` | If `TRUE`, then the dataset used for building the fitted survival curve is returned as an object of class `data.table`, unless `convert` is set to `TRUE` (see `convert`). Default is `FALSE`. `survplot` must be assigned to an object in order to get the data in the environment (see 'Details'). |
| `convert` | If `TRUE`, then any returned object is automatically converted to the class `data.frame`. This is done in place and comes at very low cost in both running time and memory consumption (see `setDF`). |
| `add` | If `TRUE`, then a new layer is added to the current plot. Default is `FALSE`. |
| `ci` | If `"none"` (the default), then no confidence intervals are plotted. If `"normal"` or `"bootstrap"`, confidence intervals are plotted based on the respective method in `pmatrix.msm`. This is very computationally intensive, since intervals must be computed at a series of times. |
| `interp` | If `"start"` (the default), then the entry time into the absorbing state is assumed to be the time it is first observed in the data. If `"midpoint"`, then the entry time into the absorbing state is assumed to be halfway between the time it is first observed and the previous observation time. This is generally more reasonable for "progressive" models with observations at arbitrary times. |
| `B` | Number of bootstrap or normal replicates for the confidence interval. The default is 100 rather than the usual 1000, since these plots are for rough diagnostic purposes. |
| `legend.pos` | Where to position the legend. Default is `"topright"`, but x and y coordinates can be passed. If `NULL`, then the legend is not shown. |
| `xlab` | x axis label. |
| `ylab` | y axis label. |
| `main` | The main title of the plot(s) as character. Default is `NULL`. |
| `lty.fit` | Line type for the fitted curve. See `par`. |
| `lwd.fit` | Line width for the fitted curve. See `par`. |
| `col.fit` | Line color for the fitted curve. See `par`. |
| `lty.ci.fit` | Line type for the fitted curve confidence limits. See `par`. |
| `lwd.ci.fit` | Line width for the fitted curve confidence limits. See `par`. |
| `col.ci.fit` | Line color for the fitted curve confidence limits. See `par`. |
| `mark.time` | Mark the empirical survival curve at each censoring point. See `lines.survfit`. |
| `lty.km` | Line type for the Kaplan-Meier passed to `lines.survfit`. See `par`. |
| `lwd.km` | Line width for the Kaplan-Meier passed to `lines.survfit`. See `par`. |
| `col.km` | Line color for the Kaplan-Meier passed to `lines.survfit`. See `par`. |
| `do.plot` | If `FALSE`, then no plot is shown at all. Default is `TRUE`. |
| `plot.width` | Width of the new graphical device. Default is 7. See `par`. |
| `plot.height` | Height of the new graphical device. Default is 7. See `par`. |
| `devnew` | Sets the graphical device on which to plot. By default, `survplot` plots on a new device by calling `dev.new`. If `FALSE`, then the plot is drawn onto the current device as specified by `dev.cur`. If `FALSE` and no external devices are open, then the plot is drawn using internal graphics. See `dev`. |
| `verbose` | If `FALSE`, all information produced by `print`, `cat` and `message` is suppressed. Default is `TRUE`. |
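
The difference between the two `interp` options can be illustrated with a small sketch. The helper `entry_time()` below is hypothetical and is not part of `msmtools` or `msm`; it only mirrors the description of `interp` for a single subject, using made-up observation times and states.

```r
# Hypothetical helper (not an msmtools/msm function): estimate the entry time
# into an absorbing state from one subject's observation times and states.
entry_time <- function(times, states, absorbing,
                       interp = c("start", "midpoint")) {
  interp <- match.arg(interp)
  # index of the first observation in the absorbing state
  first <- which(states == absorbing)[1]
  if (is.na(first)) return(NA_real_)  # absorbing state never observed
  # "start": the time the absorbing state is first observed
  if (interp == "start" || first == 1L) return(times[first])
  # "midpoint": halfway between that time and the previous observation time
  (times[first - 1L] + times[first]) / 2
}

# toy data: state 3 (absorbing) is first observed at time 9
obs_times  <- c(0, 2, 5, 9)
obs_states <- c(1, 2, 2, 3)

entry_time(obs_times, obs_states, absorbing = 3, interp = "start")     # 9
entry_time(obs_times, obs_states, absorbing = 3, interp = "midpoint")  # 7
```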

## Details

The function wraps `plot.survfit.msm` and extends it. `survplot` correctly handles the plot of a fitted survival in an exact times framework (when `exacttimes = TRUE`) by resetting the time scale and looking at the follow-up time. It can also quickly build and return the datasets used to compute the Kaplan-Meier and the fitted survival when `return.all = TRUE`. When `return.all` is `TRUE`, setting `return.km` or `return.p` to `FALSE` produces an error and `survplot` stops. Setting them to `TRUE` raises a warning, but the job runs to completion. For more details about how `survplot` returns objects, please refer to the vignette with `vignette("msmtools")`.
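
The interaction between `return.all`, `return.km` and `return.p` can be sketched as a small argument check. The function `check_return_args()` is hypothetical and written only for illustration: it follows the rules stated above, and neither its name nor its messages are taken from the `msmtools` source.

```r
# Hypothetical sketch of the documented rules; not the msmtools implementation.
check_return_args <- function(return.all = FALSE, return.km = NULL,
                              return.p = NULL) {
  # the rules only apply when return.all = TRUE
  if (!return.all) return(invisible(TRUE))
  for (arg in list(return.km, return.p)) {
    if (isFALSE(arg)) {
      # contradictory request: stop before doing any work
      stop("return.km/return.p cannot be FALSE when return.all = TRUE")
    }
    if (isTRUE(arg)) {
      # redundant request: warn, but carry on
      warning("return.km/return.p are redundant when return.all = TRUE")
    }
  }
  invisible(TRUE)
}

# passes silently: only return.all is set
check_return_args(return.all = TRUE)

# error case, caught here so the script keeps running
tryCatch(check_return_args(return.all = TRUE, return.km = FALSE),
         error = function(e) conditionMessage(e))

# warning case, caught the same way
tryCatch(check_return_args(return.all = TRUE, return.p = TRUE),
         warning = function(w) conditionMessage(w))
```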

The user can define custom times (through `times`) or let `survplot` choose them (through `grid`). In the latter case, `survplot` takes the follow-up time and divides it by `grid`. The higher `grid` is, the finer the time grid: computing the fitted survival takes longer but is more precise.
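
As a rough sketch of what dividing the follow-up time by `grid` could look like, the snippet below builds an evenly spaced vector of times. The follow-up range is made up, and the exact construction inside `survplot` is an assumption here, not something confirmed by this page.

```r
# Assumed construction of the time grid; the follow-up range is made up.
follow_up <- c(0, 10)  # start and end of follow-up
grid <- 100L           # number of intervals, as in the default

# step size: follow-up length divided by grid
step <- diff(follow_up) / grid

# evenly spaced times covering the whole follow-up, end points included
times <- seq(follow_up[1], follow_up[2], by = step)

length(times)  # grid + 1 points
head(times)
```

A vector built this way could also be passed explicitly through `times`, in which case `grid` is ignored.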

## Value

If `return.all` is set to `TRUE`, then `survplot` returns a named list with `$km` and `$fitted` as `data.table`, or as `data.frame` when `convert = TRUE`. To save them in the working environment, assign `survplot` to an object (see 'Examples').

`$km` contains up to 4 columns:

| Column | Description |
| --- | --- |
| `subject` | The ordered subject ID as passed in the `msm` function. |
| `mintime` | The times at which to compute the fitted survival. |
| `mintime_exact` | If `exacttimes` is `TRUE`, then the relative timing is reported. |
| `anystate` | State of transition to compute the Kaplan-Meier. |


`$fitted` contains 2 columns:

| Column | Description |
| --- | --- |
| `time` | Times at which to compute the fitted survival. |
| `probs` | The corresponding values of the fitted survival. |
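
To show how the returned list might be used, the sketch below builds a mock object with the documented structure (as with `convert = TRUE`) and reads the fitted survival from it by name. All values are invented for illustration and do not come from a fitted model.

```r
# Mock of the documented return value; every number here is made up.
all_data <- list(
  km = data.frame(
    subject  = c(1, 2, 3),
    mintime  = c(4.0, 2.5, 6.0),
    anystate = c(1, 1, 0)
  ),
  fitted = data.frame(
    time  = c(0, 1, 2, 3),
    probs = c(1.0, 0.8, 0.6, 0.5)
  )
)

# access the two datasets by name rather than by position
km_data     <- all_data$km
fitted_data <- all_data$fitted

# linearly interpolate the fitted survival at a time between grid points
p_at_1.5 <- approx(fitted_data$time, fitted_data$probs, xout = 1.5)$y
p_at_1.5  # halfway between 0.8 and 0.6
```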

## Author(s)

Francesco Grossetti <francesco.grossetti@unibocconi.it>.

## References

Titman, A. and Sharples, L.D. (2010). Model diagnostics for multi-state models, Statistical Methods in Medical Research, 19, 621-651.

Titman, A. and Sharples, L.D. (2008). A general goodness-of-fit test for Markov and hidden Markov models, Statistics in Medicine, 27, 2177-2195.

Jackson, C.H. (2011). Multi-State Models for Panel Data: The msm Package for R, Journal of Statistical Software, 38(8), 1-29. URL http://www.jstatsoft.org/v38/i08/.

## See Also

`plot.survfit.msm`, `msm`, `pmatrix.msm`, `setDF`

## Examples

```r
## Not run: 
# msmtools provides the hosp dataset and augment()
library( msmtools )
data( hosp )

# augmenting the data
hosp_augmented = augment( data = hosp, data_key = subj, n_events = adm_number,
                          pattern = label_3, t_start = dateIN, t_end = dateOUT,
                          t_cens = dateCENS )

# let's define the initial transition matrix for our model
Qmat = matrix( data = 0, nrow = 3, ncol = 3, byrow = TRUE )
Qmat[ 1, 1:3 ] = 1
Qmat[ 2, 1:3 ] = 1
colnames( Qmat ) = c( 'IN', 'OUT', 'DEAD' )
rownames( Qmat ) = c( 'IN', 'OUT', 'DEAD' )

# attaching the msm package and running the model using
# gender and age as covariates
library( msm )
msm_model = msm( status_num ~ augmented_int, subject = subj,
                 data = hosp_augmented, covariates = ~ gender + age,
                 exacttimes = TRUE, gen.inits = TRUE, qmatrix = Qmat,
                 method = 'BFGS',
                 control = list( fnscale = 6e+05, trace = 0,
                                 REPORT = 1, maxit = 10000 ) )

# plotting the fitted and empirical survival from state = 1
survplot( msm_model, km = TRUE, ci = 'none', verbose = FALSE )

# plotting the fitted and empirical survival from state = 2 and
# adding it to the previous plot
survplot( msm_model, from = 2, km = TRUE, ci = 'none', add = TRUE,
          verbose = FALSE )

# returning fitted and empirical data
all_data = survplot( msm_model, ci = 'none', return.all = TRUE,
                     verbose = FALSE, do.plot = FALSE )

# saving them separately
km_data = all_data[[ 1 ]]
fitted_data = all_data[[ 2 ]]

## End(Not run)
```