validate_glm_initialization_input    R Documentation

Validate Inputs for Catalytic Generalized Linear Models (GLMs) Initialization

Description

This function validates the input parameters required for initializing a catalytic Generalized Linear Model (GLM). It checks that the formula, family, data, and additional parameters have the expected structure and are compatible with one another before any further modeling takes place.

Usage

validate_glm_initialization_input(
  formula,
  family,
  data,
  syn_size,
  custom_variance,
  gaussian_known_variance,
  x_degree
)

Arguments

formula

A formula object specifying the stats::glm model to be fitted. It must not contain random effects or survival terms.

family

A character string or family object specifying the error distribution and link function. Valid values are "binomial" and "gaussian".

data

A data.frame containing the data to be used in the GLM.

syn_size

A positive integer specifying the sample size used for the synthetic data.

custom_variance

A positive numeric value for the custom variance used in the model (only applicable for Gaussian family).

gaussian_known_variance

A logical indicating whether the variance is known for the Gaussian family.

x_degree

A numeric vector specifying the degree of the predictors. Its length should match the number of predictors (excluding the response variable).
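
As a rough illustration of the argument descriptions above, the following sketch assembles one set of inputs in the expected shapes. The data set (mtcars), the chosen columns, and all values are assumptions made for this example only. Treating the "number of predictors" as the number of non-response columns in data, and leaving custom_variance as NULL when it is not supplied, are likewise assumptions rather than documented behavior.

```r
# Illustrative inputs only; the data set and values are assumptions.
toy_data <- mtcars[, c("mpg", "wt", "hp")]  # response plus two predictors

args <- list(
  formula = mpg ~ wt + hp,          # plain GLM formula, no random effects or Surv()
  family = "gaussian",              # one of the two documented valid values
  data = toy_data,                  # must be a data.frame
  syn_size = 100L,                  # positive integer
  custom_variance = NULL,           # assumed to mean "not supplied"
  gaussian_known_variance = FALSE,  # logical flag
  x_degree = rep(1, ncol(toy_data) - 1)  # one entry per predictor column
)

# The list names follow the order of the signature shown under Usage.
str(args, max.level = 1)
```

Because the list is named after the documented parameters, it could in principle be passed along with do.call(), provided the function is accessible in the installed package.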

Details

This function performs the following checks:

  • Ensures that syn_size, custom_variance, and x_degree are positive values.

  • Verifies that the provided formula is suitable for GLMs, ensuring no random effects or survival terms.

  • Checks that the provided data is a data.frame.

  • Confirms that the formula does not contain too many terms relative to the number of columns in data.

  • Ensures that the family is either "binomial" or "gaussian".

  • Validates that x_degree has the correct length relative to the number of predictors in data.

  • Warns if syn_size is too small relative to the number of columns in data.

  • Issues warnings if custom_variance or gaussian_known_variance are used with incompatible families.

If any of these conditions are not met, the function raises an error or warning to guide the user.
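
The checks listed above can be approximated in base R as follows. This is a minimal sketch, not the package's implementation: the function name sketch_validate_glm_inputs, the error messages, the text-based detection of random-effects and survival terms, the "4 * ncol(data)" threshold for the syn_size warning, and the handling of NULL as "not supplied" are all assumptions introduced for illustration.

```r
# Hypothetical stand-in for the documented checks; NOT the package's code.
sketch_validate_glm_inputs <- function(formula, family, data, syn_size,
                                       custom_variance = NULL,
                                       gaussian_known_variance = FALSE,
                                       x_degree = NULL) {
  is_positive <- function(x) {
    is.numeric(x) && length(x) > 0 && !anyNA(x) && all(x > 0)
  }

  # Positivity checks (NULL is treated as "not supplied" in this sketch).
  if (!is_positive(syn_size)) stop("syn_size must be positive")
  if (!is.null(custom_variance) && !is_positive(custom_variance)) {
    stop("custom_variance must be positive")
  }
  if (!is.null(x_degree) && !is_positive(x_degree)) {
    stop("x_degree must contain positive values")
  }

  # Formula checks: "|" marks random effects, Surv() marks survival terms.
  if (!inherits(formula, "formula")) stop("formula must be a formula object")
  formula_text <- paste(deparse(formula), collapse = " ")
  if (grepl("|", formula_text, fixed = TRUE) ||
      grepl("Surv(", formula_text, fixed = TRUE)) {
    stop("formula must not contain random effects or survival terms")
  }

  if (!is.data.frame(data)) stop("data must be a data.frame")

  # Assumption: every non-response column of data counts as a predictor.
  n_predictors <- ncol(data) - 1
  n_terms <- length(attr(stats::terms(formula, data = data), "term.labels"))
  if (n_terms > n_predictors) stop("formula has too many terms for data")

  # Accept either a family object or a character string.
  family_name <- if (inherits(family, "family")) family$family else family
  if (!is.character(family_name) || length(family_name) != 1 ||
      !family_name %in% c("binomial", "gaussian")) {
    stop("family must be \"binomial\" or \"gaussian\"")
  }

  if (!is.null(x_degree) && length(x_degree) != n_predictors) {
    stop("x_degree must have one entry per predictor")
  }

  # Assumed threshold; the documentation does not state the actual rule.
  if (syn_size < 4 * ncol(data)) {
    warning("syn_size may be too small relative to ncol(data)")
  }

  if (family_name != "gaussian" &&
      (!is.null(custom_variance) || isTRUE(gaussian_known_variance))) {
    warning("variance settings only apply to the gaussian family")
  }

  invisible(NULL)
}

toy_data <- mtcars[, c("mpg", "wt", "hp")]
result <- sketch_validate_glm_inputs(
  mpg ~ wt + hp, "gaussian", toy_data,
  syn_size = 100, x_degree = c(1, 1)
)
```

The sketch runs the hard failures (stop) before the soft ones (warning), so a structurally invalid input is reported before any advisory message is raised.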

Value

Returns nothing if all checks pass; otherwise, raises an error or warning.
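
Since the outcome is signaled through R conditions rather than a return value, calling code would typically wrap a validator of this kind in tryCatch(). The sketch below uses a small hypothetical stand-in (toy_check) so that it runs without the package; its messages and its warning rule are assumptions and do not reproduce the real function's output.

```r
# Hypothetical validator with the same contract: silent on success,
# error or warning otherwise.
toy_check <- function(data, syn_size) {
  if (!is.data.frame(data)) {
    stop("data must be a data.frame", call. = FALSE)
  }
  if (syn_size < ncol(data)) {
    warning("syn_size is small relative to ncol(data)", call. = FALSE)
  }
  invisible(NULL)
}

# Turn the three possible outcomes into a short status string.
run_check <- function(check) {
  tryCatch(
    {
      check()
      "passed"
    },
    warning = function(w) paste("warning:", conditionMessage(w)),
    error = function(e) paste("error:", conditionMessage(e))
  )
}

status_ok <- run_check(function() toy_check(mtcars, syn_size = 100))
status_warn <- run_check(function() toy_check(mtcars, syn_size = 5))
status_err <- run_check(function() toy_check(as.matrix(mtcars), syn_size = 100))
```

Here status_ok is "passed", while the other two carry the warning and error messages respectively. Note that a tryCatch() warning handler stops evaluation at the first warning; withCallingHandlers() would be the alternative if the remaining checks should still run.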


catalytic documentation built on April 4, 2025, 5:51 a.m.