| simulate_data_fmrcc | R Documentation |
Generates synthetic in-control and out-of-control functional data for testing the Functional Mixture Regression Control Chart (FMRCC) framework. The function simulates a functional response Y influenced by a functional covariate X through a mixture of functional linear models (FLMs) with three distinct regression structures, as described in Section 3.1 of Capezza et al. (2025).
simulate_data_fmrcc(
n_obs = 3000,
mixing_prop = c(1/3, 1/3, 1/3),
len_grid = 500,
SNR = 4,
shift_coef = c(0, 0, 0, 0),
severity = 0,
ncompx = 20,
delta_1,
delta_2,
measurement_noise_sigma = 0,
fun_noise = "normal",
df = 3,
alphasn = 4
)
n_obs |
Integer. Total number of observations to generate. Default is 3000. |
mixing_prop |
Numeric vector of length 3. Mixing proportions for the three clusters (must sum to 1). Default is c(1/3, 1/3, 1/3). |
len_grid |
Integer. Number of grid points for evaluating functional data on domain [0,1]. Default is 500. |
SNR |
Numeric. Signal-to-noise ratio controlling the variance of the error term. Default is 4. |
shift_coef |
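Numeric vector of length 4 or character string. Controls the type and shape of the mean shift: a numeric vector gives the coefficients of a cubic polynomial shift, while 'low' or 'high' selects an RSW-specific shift (both are described below).
Default is c(0,0,0,0) (no shift). |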
severity |
Numeric. Multiplier controlling the magnitude of the shift. Higher values produce larger shifts. This corresponds to the "Severity Level (SL)" in the simulation study. Default is 0 (no shift). |
ncompx |
Integer. Number of functional principal components used to generate the functional covariate X. Default is 20. |
delta_1 |
Numeric in [0,1]. Controls dissimilarity between clusters in regression coefficient functions and functional intercepts: 0 collapses the three clusters into a single one, and 1 gives maximum dissimilarity. Required parameter with no default. |
delta_2 |
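Numeric in [0,1]. Controls the relative contribution of the functional intercept vs. the regression coefficient function (the weight \Delta_2 in the model equation below): the intercept is weighted by 1 - delta_2 and the regression term by delta_2, so with delta_2 = 1 the clusters differ only in their regression coefficients. Required parameter with no default. |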
measurement_noise_sigma |
Numeric. Standard deviation of Gaussian measurement error added to both X and Y. Default is 0 (no measurement error). |
fun_noise |
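Character. Distribution for the functional error term. Default is "normal"; 't' selects Student's t errors (see df), and a skew-normal option is also available (see alphasn). |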
df |
Numeric. Degrees of freedom for Student's t-distribution when fun_noise = 't'. Default is 3. |
alphasn |
Numeric. Skewness parameter for the skew-normal distribution, used when fun_noise selects skew-normal errors. Default is 4. |
The data generation follows Equation (18) in the paper:
Y(t) = (1 - \Delta_2)\beta^0_k(t) + \int_S \Delta_2(\beta^X_k(s,t))^T X(s)ds + \varepsilon(t)
The three clusters are characterized by:
Different functional intercepts \beta^0_k(t) (inspired by dynamic resistance
curves in RSW processes)
Different bivariate regression coefficient functions \beta^X_k(s,t)
Functional errors with variance adjusted to achieve the specified SNR
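The model equation above can be approximated on a grid by replacing the integral with a Riemann sum. The following is only a minimal sketch for a single cluster: the intercept, coefficient surface, and covariate curve are made-up placeholders (not the functions used by simulate_data_fmrcc), the error is simplified to pointwise white noise, and SNR is assumed to mean var(signal) / var(noise), which may differ from the definition used in the paper.

```r
# Discretized sketch of Y(t) = (1 - Delta_2) beta0(t) + Delta_2 * int beta(s,t) X(s) ds + eps(t)
# All concrete functions below are illustrative assumptions.
set.seed(1)
len_grid <- 100
s_grid <- seq(0, 1, length.out = len_grid)   # domain of X
t_grid <- seq(0, 1, length.out = len_grid)   # domain of Y
delta_2 <- 0.5
SNR <- 4

beta0 <- sin(2 * pi * t_grid)                          # hypothetical intercept
betaX <- outer(s_grid, t_grid, function(s, t) s * t)   # hypothetical beta(s, t): rows = s, cols = t
x <- s_grid^2                                          # one hypothetical covariate curve X(s)

# Riemann-sum approximation of the integral over s, one value per t
ds <- s_grid[2] - s_grid[1]
integral <- as.vector(crossprod(betaX, x)) * ds

signal <- (1 - delta_2) * beta0 + delta_2 * integral

# Assumed SNR definition: var(signal) / var(noise)
noise_sd <- sqrt(var(signal) / SNR)
eps <- rnorm(len_grid, sd = noise_sd)   # white noise, a simplification of a functional error
y <- signal + eps
```

With these placeholder choices the exact integral is t / 4, so the Riemann sum can be checked against a closed form before swapping in more realistic coefficient functions.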
Moreover, when severity != 0, the function applies a controlled shift to the functional response Y to
simulate out-of-control conditions. The shift types include:
Polynomial shifts: When shift_coef is numeric, a polynomial of degree 3 is
applied: Shift(t) = severity \times (a_3 t^3 + a_2 t^2 + a_1 t + a_0)
Linear shift example: shift_coef = c(0, 0, 1, 0) produces a linear shift
Quadratic shift example: shift_coef = c(0, 1, 0, 0) produces a quadratic shift
RSW-specific shifts: When shift_coef = 'low' or 'high', the function applies
shifts based on modifications to the dynamic resistance curve (DRC) parameters,
simulating realistic fault patterns in resistance spot welding processes.
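The polynomial shift described above can be sketched as a small helper. The name poly_shift is hypothetical and not part of the package, and the coefficient order (a_3, a_2, a_1, a_0) is an assumption inferred from the linear and quadratic examples given above. The RSW-specific 'low' and 'high' shifts are not reproduced here.

```r
# Hypothetical helper illustrating Shift(t) = severity * (a3 t^3 + a2 t^2 + a1 t + a0).
# Assumes shift_coef is ordered as c(a3, a2, a1, a0).
poly_shift <- function(t, shift_coef, severity) {
  stopifnot(is.numeric(shift_coef), length(shift_coef) == 4)
  a3 <- shift_coef[1]
  a2 <- shift_coef[2]
  a1 <- shift_coef[3]
  a0 <- shift_coef[4]
  severity * (a3 * t^3 + a2 * t^2 + a1 * t + a0)
}

t_grid <- seq(0, 1, length.out = 5)
linear    <- poly_shift(t_grid, c(0, 0, 1, 0), severity = 2)  # equals 2 * t
quadratic <- poly_shift(t_grid, c(0, 1, 0, 0), severity = 3)  # equals 3 * t^2
no_shift  <- poly_shift(t_grid, c(0, 0, 1, 0), severity = 0)  # severity 0 removes the shift
```

Under this reading, severity only rescales the curve, so the shape is fixed by shift_coef and a severity of 0 always yields in-control data.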
The functional covariate X is generated using functional principal component analysis
with standardized magnitudes (scaled by 1/5).
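One common way to generate curves from a fixed number of principal components is a truncated Karhunen-Loeve expansion. The sketch below is an illustration under several assumptions that are not taken from the package: a sine basis, eigenvalues decaying as 1/j^2, Gaussian scores, and a matrix layout of grid points by observations. Only ncompx and the 1/5 scaling come from the description above.

```r
# Illustrative truncated Karhunen-Loeve expansion; basis, eigenvalues and
# score distribution are assumptions, not the package's actual choices.
set.seed(1)
n_obs <- 50
len_grid <- 200
ncompx <- 20
s_grid <- seq(0, 1, length.out = len_grid)

# Assumed orthonormal basis on [0, 1]: sqrt(2) * sin(j * pi * s); len_grid x ncompx
phi <- sapply(seq_len(ncompx), function(j) sqrt(2) * sin(j * pi * s_grid))

# Assumed eigenvalue decay and standard normal scores; n_obs x ncompx
lambda <- 1 / seq_len(ncompx)^2
scores <- matrix(rnorm(n_obs * ncompx), n_obs, ncompx) %*% diag(sqrt(lambda))

# Curves evaluated on the grid, scaled by 1/5; len_grid x n_obs
X <- (phi %*% t(scores)) / 5
```

Each column of X is one simulated covariate curve; a call such as matplot(s_grid, X, type = "l") gives a quick visual check.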
A list containing:
X |
Matrix ( |
Y |
Matrix ( |
Eps_1, Eps_2, Eps_3 |
Matrices of functional error terms for each cluster. |
beta_matrix_1, beta_matrix_2, beta_matrix_3 |
Matrices ( |
Capezza, C., Centofanti, F., Forcina, D., Lepore, A., and Palumbo, B. (2025). Functional Mixture Regression Control Chart. Annals of Applied Statistics.
# Generate in-control data with three equally-sized clusters, maximum dissimilarity
data <- simulate_data_fmrcc(n_obs = 300, delta_1 = 1, delta_2 = 0.5, severity = 0)
# In-control single cluster case (delta_1 = 0)
data_single <- simulate_data_fmrcc(n_obs = 300, delta_1 = 0, delta_2 = 0.5, severity = 0)
# In-control clusters differing only in regression coefficients
data_beta_only <- simulate_data_fmrcc(n_obs = 300, delta_1 = 1, delta_2 = 1, severity = 0)
# Add measurement noise and use t-distributed errors
data_t_noise <- simulate_data_fmrcc(n_obs = 300, delta_1 = 1, delta_2 = 0.5, severity = 0,
                                    measurement_noise_sigma = 0.01,
                                    fun_noise = 't', df = 5)
# Generate out-of-control data with linear shift
data_oc <- simulate_data_fmrcc(n_obs = 300,
                               shift_coef = c(0, 0, 1, 0),
                               severity = 2,
                               delta_1 = 1,
                               delta_2 = 0.5)
# Generate OC data with quadratic shift
data_quad <- simulate_data_fmrcc(n_obs = 300,
                                 shift_coef = c(0, 1, 0, 0),
                                 severity = 3,
                                 delta_1 = 1,
                                 delta_2 = 0.5)
# Generate OC data with RSW-specific "low" shift pattern
data_rsw_low <- simulate_data_fmrcc(n_obs = 300,
                                    shift_coef = 'low',
                                    severity = 1.5,
                                    delta_1 = 1,
                                    delta_2 = 0.5)
# Generate OC data with RSW-specific "high" shift pattern
data_rsw_high <- simulate_data_fmrcc(n_obs = 300,
                                     shift_coef = 'high',
                                     severity = 2,
                                     delta_1 = 0.66,
                                     delta_2 = 0.5)