generate_syn_data: Generate synthetic data for the CausalGPS package
In CausalGPS: Matching on Generalized Propensity Scores with Continuous Exposures

View source: R/generate_synthetic_data.R

generate_syn_data

R Documentation

Generate synthetic data for the CausalGPS package

Description

Generates a synthetic data set based on different GPS models and covariates.

Usage

generate_syn_data(
  sample_size = 1000,
  outcome_sd = 10,
  gps_spec = 1,
  cova_spec = 1,
  vectorized_y = FALSE
)

Arguments

`sample_size`	A positive integer giving the number of data samples.
`outcome_sd`	A positive double giving the standard deviation used to generate the outcome in the synthetic data set.
`gps_spec`	An integer value ranging from 1 to 7. It determines the complexity and form of the relationship between the covariates and the treatment variable. A concise definition of each value:
- `gps_spec = 1`: The treatment is generated using a normal distribution (`stats::rnorm`) and a linear function of the covariates (cf1 to cf6).
- `gps_spec = 2`: The treatment is generated using a Student's t-distribution (`stats::rt`) and a linear function of the covariates, and is truncated to a specific range (-5 to 25).
- `gps_spec = 3`: The treatment includes a quadratic term for the third covariate.
- `gps_spec = 4`: The treatment is calculated using an exponential function within a fraction, creating a logistic-like model.
- `gps_spec = 5`: The treatment also uses a logistic-like model, but with different parameters.
- `gps_spec = 6`: The treatment is calculated using the natural logarithm of the absolute value of a linear combination of the covariates.
- `gps_spec = 7`: The treatment is generated similarly to `gps_spec = 2`, but without truncation.
`cova_spec`	A numerical value (1 or 2) that determines how the covariates in the synthetic data set are transformed. If `cova_spec` equals 2, the function applies a non-linear transformation to the covariates, which can add complexity to the relationships between covariates and outcomes in the synthetic data. See the code for more details.
`vectorized_y`	A Boolean value indicating how Y is generated internally (default = `FALSE`). This parameter is introduced for backward compatibility. `vectorized_y = TRUE` performs better.
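
The difference between the normal, truncated t, and untruncated t treatment models described for `gps_spec` can be illustrated without the package. The following is a minimal sketch in base R, not the package's implementation: the covariates, the coefficients in `beta`, the intercept, the noise standard deviation, and the degrees of freedom are all made-up values. It also assumes that "truncated" means clamping values to the range -5 to 25; the package may handle this differently.

```r
set.seed(1)
n <- 1000

# Hypothetical covariates; the package generates its own cf1 to cf6.
cf <- matrix(stats::rnorm(n * 6), nrow = n, ncol = 6)

# Hypothetical linear function of the covariates (coefficients are invented).
beta <- c(0.5, -0.3, 0.2, 0.4, -0.1, 0.6)
lin_pred <- 10 + as.vector(cf %*% beta)

# Normal noise around the linear predictor (in the spirit of gps_spec = 1).
treat_normal <- lin_pred + stats::rnorm(n, mean = 0, sd = 5)

# Heavy-tailed t noise around the same predictor (in the spirit of gps_spec = 7).
treat_t <- lin_pred + stats::rt(n, df = 2)

# The same values clamped to [-5, 25] (in the spirit of gps_spec = 2).
treat_t_limited <- pmin(pmax(treat_t, -5), 25)

# Compare the ranges: the clamped version cannot leave [-5, 25].
range(treat_t)
range(treat_t_limited)
```

Clamping with `pmin` and `pmax` keeps every observation and moves out-of-range values to the boundary, so the sample size is unchanged.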

Value

synthetic_data: The function returns a data.frame containing the constructed synthetic data.

Examples


set.seed(298)
s_data <- generate_syn_data(sample_size = 100,
                            outcome_sd = 10,
                            gps_spec = 1,
                            cova_spec = 1)
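
A further sketch, assuming the CausalGPS package is installed: it requests the truncated t-distribution treatment model (`gps_spec = 2`) together with the non-linear covariate transformation (`cova_spec = 2`) and the vectorized outcome generation. The name `s_data2` is arbitrary. The checks at the end rely only on the documented return type (a data.frame) and on the assumption that the result has one row per sample.

```r
if (requireNamespace("CausalGPS", quietly = TRUE)) {
  set.seed(298)
  s_data2 <- CausalGPS::generate_syn_data(sample_size = 200,
                                          outcome_sd = 10,
                                          gps_spec = 2,
                                          cova_spec = 2,
                                          vectorized_y = TRUE)

  # Inspect the structure rather than assuming particular column names.
  str(s_data2)
  print(is.data.frame(s_data2))
  print(nrow(s_data2))
}
```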

CausalGPS documentation built on June 22, 2024, 9:31 a.m.