simdist: Function that uses models fitted to the original data to...
In bokov/powertrip: Analysis of longevity data.


This function identifies the best-fit parameters for a sample, then uses these parameters to obtain null distributions of MLE log-ratios by sampling from the population that this sample would represent if each candidate model were, in turn, correct.

simdist(x, label, cx = NULL, nil = 0, bnil = 0, wbnil = 1, pf = mean, rounds = 5000, models = c("w", "g", "gm", "l", "lm"), dropcols = c("a2", "b2", "c2", "s2"), sims = NULL, pars = NULL)

`x`	An integer vector of ages at death.
`label`	A character string representing the base name for the output files. Each output file has the same format as the output from `findpars`, stacked together by rows, representing every simulation. There is a separate such output file for each model selected (in the `models` argument) and its name is 'LABEL_M' where 'LABEL' is the value of this argument and 'M' is the abbreviation of the model name ('w','g','gm','l', or 'lm').
`cx`	A vector of ones and zeros, with ones representing a natural death event, and zeros representing a censored event. If omitted, it is assumed that all the deaths are natural in the control group.
`nil`	The numeric value to substitute for missing values of fitted parameters in order to avoid rounding errors caused by machine precision limits. That problem has since been solved and this argument may be removed in future versions.
`bnil`	The numeric value to substitute for missing values of the 'b' parameter in order to avoid rounding errors caused by machine precision limits. That problem has since been solved and this argument may be removed in future versions.
`wbnil`	The numeric value to substitute for missing values of the 'b' parameter when finding parameters for the Weibull model. Defaults to 1.
`pf`	A function to call when two different parameters are constrained and need to produce a single starting value. In addition to `mean` (default), other valid functions include `median`, `gmean`, `max`, and `min`.
`rounds`	How many samples to simulate.
`models`	A character vector of models to try which can be any combination of: 'w','g','gm','l', or 'lm'. In some cases, models that are not listed in this argument are fitted anyway because they are needed to obtain starting parameters for the models that are listed.
`dropcols`	Columns to drop from the output. By default this is `c('a2','b2','c2','s2')` because `simdist` is designed for just one sample rather than a comparison of two samples, and therefore those columns will never be used by this function. However, the names of any other unwanted output columns can also be included in this argument.
`sims`	Simulating samples is a time-consuming process. If the simulations from a previous run exist as an R object in the current environment, that object can be specified in the `sims` argument, causing the simulation step to be bypassed and all the simulated data to instead be taken from the object.
`pars`	Similarly to the `sims` argument, `pars` allows you to specify an already calculated collection of parameters. This feature is experimental.
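
A minimal call might look like the sketch below. The ages are simulated stand-ins rather than real data, the label "demo" is arbitrary, and the call is guarded so that it only runs when powertrip is installed. `rounds = 100` is used purely to keep the run short.

```r
set.seed(42)

# Hypothetical sample: 50 integer ages at death (simulated, not real data).
x <- pmax(1L, as.integer(round(rweibull(50, shape = 3, scale = 600))))

# Every death is marked as natural (1); no events are censored (0).
cx <- rep(1, length(x))

# Only attempt the call when the package is available.
if (requireNamespace("powertrip", quietly = TRUE)) {
  library(powertrip)
  # Output file names are derived from the label, e.g. demo.rdata and demo_gm.
  simdist(x, label = "demo", cx = cx, rounds = 100, models = c("g", "gm"))
}
```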

`simdist` calls `findpars` on a sample in order to obtain estimates of each parameter for each model of interest, as well as log-ratios of MLEs for the respective model comparisons. It then simulates samples of the same size as the original from the fitted parameters, and does this `rounds` times. `findpars` is then run again on each simulated sample for each model of interest, to obtain a distribution of MLE log-ratios. The original MLE log-ratio is compared against this distribution in order to obtain a more robust estimate of the significance level.
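
The general idea of simulating under a fitted null model can be sketched in base R without powertrip. This is an illustration only, not the code used by `simdist` or `findpars`: the exponential and Weibull distributions stand in for a nested pair of mortality models, the helper `log_ratio` is hypothetical, and the ages are simulated and continuous rather than integer.

```r
set.seed(1)

# Negative log-likelihood of the alternative (Weibull) model; parameters are
# on the log scale so that shape and scale stay positive during optimisation.
nll_weibull <- function(p, x) {
  -sum(dweibull(x, shape = exp(p[1]), scale = exp(p[2]), log = TRUE))
}

# Fit the null (exponential) and alternative (Weibull) models to one sample
# and return the log-likelihood ratio together with the fitted null rate.
log_ratio <- function(x) {
  rate_hat <- 1 / mean(x)                        # closed-form exponential MLE
  ll_null  <- sum(dexp(x, rate = rate_hat, log = TRUE))
  # Start at shape 1, where the Weibull coincides with the fitted exponential.
  fit_alt  <- optim(c(0, log(mean(x))), nll_weibull, x = x)
  c(lr = -fit_alt$value - ll_null, rate = rate_hat)
}

x   <- rweibull(60, shape = 2, scale = 500)      # hypothetical ages at death
obs <- log_ratio(x)

# Simulate same-sized samples from the fitted null model and refit each one.
rounds <- 200
sim_lr <- replicate(
  rounds,
  log_ratio(rexp(length(x), rate = obs[["rate"]]))[["lr"]]
)

# Share of simulated log-ratios at least as large as the observed one.
p_value <- (1 + sum(sim_lr >= obs[["lr"]])) / (rounds + 1)
```

Adding one to both the numerator and the denominator keeps the estimated significance level above zero, so its resolution is limited by the number of rounds. That is one reason a larger number of rounds gives a finer estimate.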

This function does not return anything to the console, but instead saves its output as files. The following files are generated (replacing LABEL with the value of the label argument):

`LABEL.rdata`	An R data file containing the output from the initial (non-simulated) `findpars` in a data.frame called `ihaz`.
`LABEL.sims.rdata`	An R data file with all the simulated samples in a list object called `sims`, containing one N x rounds matrix of survival times for each model used as the null hypothesis. It also contains a list object called `pars` that holds the parameter estimates from all the fitted models, labeled AB where A is the null model and B is the alternative model. This file can be opened from R using the `load()` command, but doing so is not usually necessary unless one is trying to recreate the results of a previous simulation.
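
As a sketch of how such a file could be inspected, the snippet below builds a small mock file and loads it into a separate environment so that nothing in the workspace is overwritten. The file name, the element name `w`, and the matrix contents are assumptions made for illustration. Only the object names `sims` and `pars` come from the description above.

```r
# Mock stand-in for LABEL.sims.rdata, created only so the snippet is runnable.
n <- 20
rounds <- 5
sims <- list(w = matrix(rexp(n * rounds, rate = 1 / 500), nrow = n, ncol = rounds))
pars <- list()
sims_file <- file.path(tempdir(), "demo.sims.rdata")
save(sims, pars, file = sims_file)

# Load into a fresh environment; load() returns the names of restored objects.
saved  <- new.env()
loaded <- load(sims_file, envir = saved)

# One N x rounds matrix per null model.
dim(saved$sims$w)
```

An object recovered this way is the kind of value the `sims` argument is described as accepting.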
`LABEL_M`	A collection of tab-delimited text files is generated, one for each model tested against its respective null hypothesis models. The 'M' in the label name is replaced with the abbreviation of the model name (by default, 'g', 'gm', 'l', and 'lm'). These files are like those produced by `findpars()`, except that they lack any of the columns named in the `dropcols` argument and have the following additional columns:

`null_pars.a1`	The best fit for the `a` parameter of the null model.
`null_pars.b1`	The best fit for the `b` parameter (if applicable) of the null model.
`null_pars.c1`	The best fit for the `c` parameter (if applicable) of the null model.
`null_pars.s1`	The best fit for the `s` parameter (if applicable) of the null model.
`null_model`	An abbreviation identifying the model representing the null hypothesis (i.e. a model that has one fewer parameter than the target model, which is the one being compared to it).
`target_model`	An abbreviation identifying the model representing the alternative hypothesis (i.e. a model that has one more parameter than the null model, which it is being compared to). The values in the 'id' column indicate which column (counting from the left) of the corresponding 'sims' object for that model comparison contains the data that produced a given row.
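
The output files can be read back with base R. The sketch below writes a small mock file first so that it runs on its own. The values are made up, the presence of a header row is an assumption, and real files also carry the columns produced by `findpars`.

```r
# Mock LABEL_M file holding only 'id' and some of the columns listed above.
out_file <- file.path(tempdir(), "demo_gm")
mock <- data.frame(
  id           = 1:3,
  null_pars.a1 = c(1.2e-4, 0.9e-4, 1.5e-4),
  null_pars.b1 = c(0.010, 0.012, 0.009),
  null_model   = "g",
  target_model = "gm"
)
write.table(mock, out_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the tab-delimited output back in.
out <- read.delim(out_file, stringsAsFactors = FALSE)

# Mock simulated samples for the same comparison: N = 10 ages, 3 rounds.
sims_g <- matrix(rexp(10 * 3, rate = 1 / 500), nrow = 10, ncol = 3)

# The id of a row selects the column of simulated data that produced it.
row_of_interest <- out[out$id == 2, ]
sample_for_row  <- sims_g[, row_of_interest$id]
```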

These examples may take 10 minutes or more to finish running.

Even though the examples use 'rounds=100' for demonstration purposes, for publication it is recommended to leave the 'rounds' argument at its default of 5000 or to give it a larger value. Such a run may take several hours or even days, depending on the dataset. This is why the output is saved to a text file rather than to an object within R: the user can monitor the progress of this function by accessing the data file with constrshow from another R session.

Currently this function spams the console with periods and abbreviated model names. This behavior is intentional, for troubleshooting purposes and to indicate that R has not crashed or hung.

Alex F. Bokov

Pletcher, S.D., Khazaeli, A.A., and Curtsinger, J.W. (2000). Why do life spans differ? Partitioning mean longevity differences in terms of age-specific mortality parameters. Journals of Gerontology Series A: Biological Sciences and Medical Sciences 55, B381-B389.

findpars, modelshow

bokov/powertrip documentation built on May 12, 2019, 11:33 p.m.