eesim: Simulate data, fit models, and assess models


View source: R/customizable_functions.R

Description

Generates synthetic time series datasets relevant for environmental epidemiology studies and tests performance of a model on that simulated data. Datasets can be generated with seasonal and long-term trends in either exposure or outcome. Binary or continuous outcomes can be simulated or incorporated from observed datasets. The function includes extensive options for customizing each step of the simulation process; see the eesim vignette for more details and examples.

Usage

eesim(n_reps, n, rr, exposure_type, custom_model, central = NULL, sd = NULL,
  exposure_trend = "no trend", exposure_slope = NULL, exposure_amp = NULL,
  average_outcome = NULL, outcome_trend = "no trend",
  outcome_slope = NULL, outcome_amp = NULL, start.date = "2000-01-01",
  cust_exp_func = NULL, cust_exp_args = NULL, cust_expdraw = NULL,
  cust_expdraw_args = NULL, cust_base_func = NULL,
  cust_lambda_func = NULL, cust_base_args = NULL, cust_lambda_args = NULL,
  cust_outdraw = NULL, cust_outdraw_args = NULL, custom_model_args = NULL)

Arguments

n_reps

An integer specifying the number of datasets to simulate (e.g., n_reps = 1000 would simulate one thousand time series datasets with the specified characteristics, which can be used for a power analysis or to investigate the performance of a proposed model).

n

An integer specifying the number of days to simulate (e.g., n = 365 would simulate a dataset with a year's worth of data).

rr

A non-negative numeric value specifying the relative risk (i.e., the relative risk per unit increase in the exposure).

exposure_type

A character string specifying the type of exposure. Choices are "binary" or "continuous".

custom_model

The object name of an R function that defines the code that will be used to fit the model. This object name should not be in quotations. See Details for more.
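As a rough illustration, a custom model function might look like the sketch below. It assumes (this page does not state it) that each simulated dataset is a data frame with an exposure column `x` and an outcome column `outcome`, and that a one-row data frame of estimates is an acceptable return value. The function name `my_glm_mod` and the returned column names are hypothetical; check the eesim vignette for the interface the package actually expects.

```r
# Hypothetical custom model: Poisson regression of the outcome on the exposure.
# Column names `x` and `outcome` are assumptions made for this sketch.
my_glm_mod <- function(df) {
  fit <- glm(outcome ~ x, data = df, family = poisson(link = "log"))
  coefs <- summary(fit)$coefficients["x", ]
  est <- coefs[["Estimate"]]
  se <- coefs[["Std. Error"]]
  # Return the log relative risk estimate with a Wald 95% interval
  data.frame(Estimate = est, Std.Error = se,
             lower_ci = est - 1.96 * se, upper_ci = est + 1.96 * se)
}

# Try the function on a small toy dataset
set.seed(1)
toy <- data.frame(x = rnorm(365, mean = 100, sd = 10))
toy$outcome <- rpois(365, lambda = 22 * 1.001 ^ (toy$x - 100))
res <- my_glm_mod(toy)
res
```

Under this sketch, the function would be passed as `custom_model = my_glm_mod`, without quotation marks.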

central

A numeric value specifying the mean probability of exposure (for binary data) or the mean exposure value (for continuous data).

sd

A non-negative numeric value giving the standard deviation of the exposure values from the exposure trend line (not the total standard deviation of the exposure values).
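The difference between the standard deviation around the trend line and the total standard deviation can be seen in a small base R sketch. The cosine trend used here is an illustrative assumption, not necessarily the exact form eesim uses, and the variable names are hypothetical.

```r
set.seed(42)
n <- 5 * 365
day <- seq_len(n)
central <- 100   # mean exposure
amp <- 0.6       # seasonal amplitude (illustrative)
noise_sd <- 10   # plays the role of the `sd` argument

# Assumed seasonal trend line (illustrative form only)
trend <- central * (1 + amp * cos(2 * pi * day / 365))

# Draw exposure values around the trend line
exposure <- rnorm(n, mean = trend, sd = noise_sd)

sd(exposure - trend)  # close to noise_sd
sd(exposure)          # larger: also includes the seasonal variation
```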

exposure_trend

A character string specifying a seasonal and / or long-term trend for expected mean exposure. See the vignette for eesim for examples of each option. The shapes are based on those used in Bateson and Schwartz (1999). For trends with a seasonal component, the amplitude of the seasonal trend can be customized using the exposure_amp argument. For trends with a long-term pattern, the slope of the long-term trend can be set using the exposure_slope argument. If using the "monthly" option for a binary exposure, you must input a numeric vector of length 12 for the central argument that gives the probability of exposure for each month, starting in January and ending in December. Options for continuous exposure are:

  • "no trend": No trend, either seasonal or long-term (default).

  • "cos1": A seasonal trend only.

  • "cos2": A seasonal trend with variable amplitude across years.

  • "cos3": A seasonal trend with steadily decreasing amplitude over time.

  • "linear": A linear long-term trend with no seasonal trend.

  • "curvilinear": A curved long-term trend with no seasonal trend.

  • "cos1linear": A seasonal trend plus a linear long-term trend.

Options for binary exposure are:

  • "no trend": No trend, either seasonal or long-term (default).

  • "cos1": A seasonal trend only.

  • "cos2": A seasonal trend with variable amplitude across years.

  • "cos3": A seasonal trend with steadily decreasing amplitude over time.

  • "linear": A linear long-term trend with no seasonal trend.

  • "monthly": Uses a user-specified probability of exposure for each month.
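The idea behind the "monthly" option for a binary exposure can be sketched in base R as follows. This is only an illustration of month-specific exposure probabilities, not eesim's internal code; the probabilities and variable names are made up for the example.

```r
set.seed(123)
dates <- seq(as.Date("2000-01-01"), by = "day", length.out = 2 * 365)

# One probability of exposure per month, January through December
# (illustrative values; this is the kind of vector `central` would hold)
monthly_prob <- c(0.05, 0.05, 0.10, 0.15, 0.25, 0.35,
                  0.40, 0.35, 0.25, 0.15, 0.10, 0.05)

# Look up each day's probability from its calendar month
month_index <- as.integer(format(dates, "%m"))
daily_prob <- monthly_prob[month_index]

# Draw a 0/1 exposure for each day
exposure <- rbinom(length(dates), size = 1, prob = daily_prob)

# Observed proportion of exposed days by month
tapply(exposure, month_index, mean)
```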

exposure_slope

A numeric value specifying the linear slope of the exposure, to be used with exposure_trend = "linear" or exposure_trend = "cos1linear". The default value is 1. Positive values will generate data with an increasing expected value over the years while negative values will generate data with decreasing expected value over the years.

exposure_amp

A numeric value specifying the amplitude of the exposure trend. Must be between -1 and 1 for continuous exposure or between -0.5 and 0.5 for binary exposure. Positive values will simulate a pattern with the highest values at the time of year of the start of the dataset (typically January) and the lowest values six months after that (typically July). Negative values can be used to simulate a trend with lower values at the time of year of the start of the dataset and higher values in the opposite season.
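The effect of the sign of the amplitude can be illustrated with a simple cosine curve. The trend formula and the helper `seasonal_trend` below are assumptions made for this sketch, not eesim's own implementation.

```r
day <- 0:364     # day 0 is the start of the dataset
central <- 100   # illustrative mean exposure

# Assumed trend form, for illustration only
seasonal_trend <- function(amp) {
  central * (1 + amp * cos(2 * pi * day / 365))
}

pos <- seasonal_trend(amp = 0.6)
neg <- seasonal_trend(amp = -0.6)

which.max(pos)  # index 1: peak at the start of the dataset
which.min(pos)  # about half a year later
which.min(neg)  # index 1: trough at the start of the dataset
```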

average_outcome

A non-negative numeric value specifying the average daily outcome count.

outcome_trend

A character string specifying the seasonal trend in health outcomes. Options are the same as for continuous exposure data.

outcome_slope

A numeric value specifying the linear slope of the outcome trend, to be used with outcome_trend = "linear" or outcome_trend = "cos1linear". The default value is 1. Positive values will generate data with an increasing expected value over the years while negative values will generate data with decreasing expected value over the years.

outcome_amp

A numeric value specifying the amplitude of the outcome trend. Must be between -1 and 1.

start.date

A date in the format "yyyy-mm-dd" from which to begin simulating daily exposures.

cust_exp_func

An R object name specifying a custom trend function used to generate exposure data.

cust_exp_args

A list of arguments and their values for the user-specified custom exposure function.

cust_expdraw

An R object specifying a user-created function that determines the distribution of random noise around the trend line. This function must have inputs n and prob for a binary exposure function and inputs n and mean for a continuous exposure function. The custom function must output a vector of simulated exposure values.
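A minimal sketch of two such draw functions follows, one for each exposure type. The function names `gamma_expdraw` and `bern_expdraw` and the extra `shape` argument are hypothetical; only the required inputs (n and mean, or n and prob) and the vector output come from the description above.

```r
# Hypothetical continuous draw: right-skewed, strictly positive values
# whose expected value equals the trend-line mean (shape * scale = mean).
gamma_expdraw <- function(n, mean, shape = 4) {
  rgamma(n, shape = shape, scale = mean / shape)
}

# Hypothetical binary draw: one Bernoulli trial per day
bern_expdraw <- function(n, prob) {
  rbinom(n, size = 1, prob = prob)
}

set.seed(10)
x_cont <- gamma_expdraw(n = 1000, mean = rep(100, 1000))
x_bin <- bern_expdraw(n = 1000, prob = rep(0.2, 1000))
mean(x_cont)
mean(x_bin)
```

Under this sketch, an extra argument such as `shape` would be supplied through cust_expdraw_args, for example `list(shape = 4)`.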

cust_expdraw_args

A list of arguments other than n required by the cust_expdraw function.

cust_base_func

An R object name specifying a user-made custom function for the baseline trend.

cust_lambda_func

An R object name specifying a user-made custom function for relating baseline, relative risk, and exposure.
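One common way to relate baseline, relative risk, and exposure is a log-linear model, sketched below. The function name `loglinear_lambda`, its argument names, and its numeric-vector return value are assumptions for illustration; this page does not specify the signature eesim requires, so consult the vignette before using a function like this.

```r
# Hypothetical log-linear lambda function: each unit increase in exposure
# multiplies the expected outcome count by rr.
loglinear_lambda <- function(exposure, rr, baseline) {
  exp(log(baseline) + log(rr) * exposure)
}

baseline <- rep(22, 5)              # expected daily count with no exposure
exposure <- c(0, 1, 2, 10, 100)
lambda <- loglinear_lambda(exposure, rr = 1.001, baseline = baseline)

lambda[2] / lambda[1]  # ratio for a one-unit increase equals rr
```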

cust_base_args

A list of arguments and their values used in the user-specified custom baseline function.

cust_lambda_args

A list of arguments and their values used in the user-specified custom lambda function.

cust_outdraw

An R object name specifying a user-created function that draws random outcome values around the expected outcome values. This function must take inputs n and lambda and output a vector of outcome values.
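For example, a custom outcome draw could generate overdispersed counts instead of Poisson counts. The sketch below uses a negative binomial distribution; the function name `negbin_outdraw` and the `size` argument are hypothetical, while the inputs n and lambda and the vector output follow the description above.

```r
# Hypothetical outcome draw: negative binomial counts with mean lambda.
# Smaller `size` means more overdispersion relative to a Poisson draw.
negbin_outdraw <- function(n, lambda, size = 5) {
  rnbinom(n, size = size, mu = lambda)
}

set.seed(7)
lambda <- rep(22, 2000)
outcome <- negbin_outdraw(n = length(lambda), lambda = lambda)

mean(outcome)  # close to 22
var(outcome)   # noticeably larger than the mean
```

Under this sketch, `size` would be supplied through cust_outdraw_args, for example `list(size = 5)`.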

cust_outdraw_args

A list of arguments besides n passed to the user-created custom outcome draw function.

custom_model_args

A list of arguments and their values for a custom model. These arguments are passed through to the function specified with custom_model.

Value

A list object with three elements:

References

Bateson TF, Schwartz J. 1999. Control for seasonal variation and time trend in case-crossover studies of acute effects of environmental exposures. Epidemiology 10(4):539-544.

Examples

# Run a simulation for a continuous exposure (mean = 100, standard
# deviation after long-term and seasonal trends = 10) that increases
# risk of a count outcome by 0.1% per unit increase, where the average
# outcome count is 22 per day. The exposure has a seasonal trend,
# with higher values in the winter, while the outcome has no seasonal
# or long-term trends beyond those introduced through effects from the
# exposure. The simulated data are fit with a model defined by the `spline_mod`
# function (also in the `eesim` package), with its `df_year` argument set to 7.

sims <- eesim(n_reps = 3, n = 5 * 365, central = 100, sd = 10,
      exposure_type = "continuous", exposure_trend = "cos3",
      exposure_amp = .6, average_outcome = 22, rr = 1.001,
      custom_model = spline_mod, custom_model_args = list(df_year = 7))
names(sims)
sims[[2]]
sims[[3]]

eesim documentation built on May 2, 2019, 7:30 a.m.