View source: R/get_simpsons_paradox_c.R
get_simpsons_paradox_c | R Documentation |
This function simulates Simpson's Paradox by transforming the data with Gaussian copulas, optimizing the transformation with simulated annealing, and comparing the results.
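As a rough illustration of the Gaussian copula step, the base-R sketch below imposes a chosen correlation on two samples while keeping their marginal distributions. It is not the package's implementation: the variable names, the toy marginals, and the use of quantile(type = 7) as the inverse CDF (by analogy with the "quantile_7" option) are assumptions made for this example.

```r
# Sketch only: Gaussian copula resampling with empirical inverse CDFs.
set.seed(1)
n   <- 500
x   <- rexp(n)     # arbitrary marginal for X (assumed for illustration)
y   <- runif(n)    # arbitrary marginal for Y (assumed for illustration)
rho <- -0.8        # correlation to impose on this group

# 1. Draw standard normal pairs with correlation rho.
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)

# 2. Map the normals to uniforms with the normal CDF (the copula sample).
u <- pnorm(z1)
v <- pnorm(z2)

# 3. Push the uniforms through empirical inverse CDFs of the original data.
x_new <- quantile(x, probs = u, type = 7, names = FALSE)
y_new <- quantile(y, probs = v, type = 7, names = FALSE)

# Rank correlation of the transformed pairs; should be strongly negative.
rank_cor <- cor(x_new, y_new, method = "spearman")
rank_cor
```

The transformed values stay inside the range of the original samples, so the marginals are roughly preserved while the dependence between the two variables changes.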
get_simpsons_paradox_c(
x,
y,
z,
corr_vector,
inv_cdf_type = "quantile_7",
sd_x = 0.05,
sd_y = 0.05,
lambda1 = 1,
lambda2 = 1,
lambda3 = 1,
lambda4 = 1,
max_iter = 1000,
initial_temp = 1,
cooling_rate = 0.99,
order_vec = NA,
degree = 5
)
x |
A numeric vector of data points for variable X. |
y |
A numeric vector of data points for variable Y. |
z |
A categorical variable representing groups (e.g., factor or character vector). |
corr_vector |
A numeric vector of correlations, one for each category of z. |
inv_cdf_type |
Type of inverse CDF transformation ("quantile_1", "quantile_4", "quantile_7", "quantile_8", "linear", "akima", "poly"). Default is "quantile_7". |
sd_x |
Standard deviation for perturbations on X (default is 0.05). |
sd_y |
Standard deviation for perturbations on Y (default is 0.05). |
lambda1 |
Regularization parameter for simulated annealing (default is 1). |
lambda2 |
Regularization parameter for simulated annealing (default is 1). |
lambda3 |
Regularization parameter for simulated annealing (default is 1). |
lambda4 |
Regularization parameter for simulated annealing (default is 1). |
max_iter |
Maximum iterations for simulated annealing (default is 1000). |
initial_temp |
Initial temperature for simulated annealing (default is 1.0). |
cooling_rate |
Cooling rate for simulated annealing (default is 0.99). |
order_vec |
Manual ordering of grids (default is NA, calculated automatically if not specified). |
degree |
Degree of polynomial used for polynomial inverse CDF (default is 5). |
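To show how max_iter, initial_temp, cooling_rate and a perturbation standard deviation typically interact, here is a generic simulated annealing loop in base R. It assumes geometric cooling and a Metropolis acceptance rule; the loss function minimized by get_simpsons_paradox_c and the role of lambda1 to lambda4 are not described on this page, so anneal() and toy_loss() are hypothetical stand-ins, not the package's code.

```r
# Sketch only: generic simulated annealing with geometric cooling.
anneal <- function(loss, init, sd = 0.05, max_iter = 1000,
                   initial_temp = 1, cooling_rate = 0.99) {
  current      <- init
  current_loss <- loss(current)
  best         <- current
  best_loss    <- current_loss
  temp         <- initial_temp
  for (i in seq_len(max_iter)) {
    # Propose a small Gaussian perturbation of the current state.
    candidate      <- current + rnorm(length(current), mean = 0, sd = sd)
    candidate_loss <- loss(candidate)
    delta          <- candidate_loss - current_loss
    # Metropolis rule: accept improvements, sometimes accept worse moves.
    if (delta < 0 || runif(1) < exp(-delta / temp)) {
      current      <- candidate
      current_loss <- candidate_loss
      if (current_loss < best_loss) {
        best      <- current
        best_loss <- current_loss
      }
    }
    temp <- temp * cooling_rate  # geometric cooling (assumed schedule)
  }
  list(par = best, loss = best_loss, final_temp = temp)
}

# Hypothetical toy problem: squared distance to the point (1, -2).
set.seed(42)
toy_loss <- function(p) sum((p - c(1, -2))^2)
fit <- anneal(toy_loss, init = c(0, 0))
fit$par
```

Under this assumed schedule, the defaults give a final temperature of 0.99^1000, roughly 4e-5, so late iterations accept almost no worsening moves.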
A list containing:
df_all |
The final dataset with original, transformed, and annealed data. |
df_res |
A simplified version with only the optimized data. |
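set.seed(123)
n <- 300
# Group labels drawn with unequal probabilities
z <- sample(c("A", "B", "C"), prob = c(0.3, 0.4, 0.3), size = n, replace = TRUE)
# x and y are positively related in the raw data
x <- rnorm(n, 10, sd = 5) + 5 * rbeta(n, 5, 3)
y <- 2 * x + rnorm(n, 5, sd = 4)
# One correlation per category of z (named to avoid masking base::t)
corr_vector <- c(-0.8, 0.8, -0.8)
res <- get_simpsons_paradox_c(x, y, z, corr_vector, sd_x = 0.07, sd_y = 0.07, lambda4 = 5)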
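One way to check whether a data set shows Simpson's Paradox is to compare the overall correlation with the correlations inside each group. The helper below is a hypothetical base-R sketch that takes plain vectors, since the column names of df_all and df_res are not listed on this page. The toy data are constructed for illustration and are not output from get_simpsons_paradox_c.

```r
# Sketch only: overall and within-group Pearson correlations of (x, y) by z.
group_correlations <- function(x, y, z) {
  within <- vapply(split(seq_along(z), z),
                   function(idx) cor(x[idx], y[idx]),
                   numeric(1))
  list(overall = cor(x, y), within = within)
}

# Toy data with a built-in reversal: negative slope inside each group,
# while the group centres rise together.
set.seed(1)
g  <- rep(c("A", "B", "C"), each = 100)
cx <- unname(c(A = 0, B = 5, C = 10)[g])   # group centre for each row
x0 <- cx + rnorm(300)
y0 <- cx - (x0 - cx) + rnorm(300, sd = 0.3)

chk <- group_correlations(x0, y0, g)
chk
```

Here the overall correlation is positive while every within-group correlation is negative. The same helper could be applied to the relevant columns of the returned data frames once their names are known.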