| fit_cat | R Documentation |
Computes maximum likelihood estimates for the parameters of an AD(p) model for categorical longitudinal data. The model is parameterized by transition probabilities, and MLEs are obtained in closed form.
fit_cat(
y,
order = 1,
blocks = NULL,
homogeneous = TRUE,
n_categories = NULL,
na_action = c("fail", "complete", "marginalize", "em"),
em_max_iter = 100,
em_tol = 1e-06,
em_epsilon = 1e-08,
em_safeguard = TRUE,
em_verbose = FALSE
)
y |
Integer matrix with n_subjects rows and n_time columns. Each entry should be a category code from 1 to c, where c is the number of categories. |
order |
Antedependence order p. Must be 0, 1, or 2. Default is 1. |
blocks |
Optional integer vector of length n_subjects specifying group membership. If NULL, all subjects are treated as one group. |
homogeneous |
Logical. If TRUE (default), parameters are shared across all groups (blocks are ignored for estimation). If FALSE, separate transition probabilities are estimated for each group. |
n_categories |
Number of categories. If NULL (default), inferred from the maximum value in y. |
na_action |
Handling of missing values in |
em_max_iter |
Maximum EM iterations used when |
em_tol |
EM convergence tolerance used when |
em_epsilon |
Numerical smoothing constant used when |
em_safeguard |
Logical; if |
em_verbose |
Logical; print EM progress when |
For AD(p), the model decomposes as:
P(Y_1, \ldots, Y_n) = P(Y_1, \ldots, Y_p) \times \prod_{k=p+1}^{n} P(Y_k | Y_{k-p}, \ldots, Y_{k-1})
MLEs are computed as empirical proportions:
Marginal/joint probabilities: count / N
Transition probabilities: conditional count / marginal count
Empty cells receive probability 0 (if denominator is also 0).
When na_action = "em", fit_cat() dispatches to
em_cat. In that case, em_safeguard controls step-halving
protection against likelihood-decreasing updates, and returned
log_l/AIC/BIC/cell_counts are synchronized via a final E-step
under the returned parameters.
For order = 2, na_action = "em" is not available and errors
explicitly; use na_action = "marginalize".
A list of class "cat_fit" containing:
marginal |
List of marginal/joint probabilities for initial time points |
transition |
List of transition probability arrays for k = p+1 to n |
log_l |
Log-likelihood at MLE |
aic |
Akaike Information Criterion |
bic |
Bayesian Information Criterion |
n_params |
Number of free parameters |
cell_counts |
List of cell counts: observed counts for closed-form fits
( |
convergence |
Optimizer convergence code (0 for closed-form solutions) |
settings |
List of model settings |
Xie, Y. and Zimmerman, D. L. (2013). Antedependence models for nonstationary categorical longitudinal data with ignorable missingness: likelihood-based inference. Statistics in Medicine, 32, 3274-3289.
# Simulate binary AD(1) data
set.seed(123)
y <- simulate_cat(n_subjects = 100, n_time = 5, order = 1, n_categories = 2)
# Fit model
fit <- fit_cat(y, order = 1)
print(fit)
# Compare orders
fit0 <- fit_cat(y, order = 0)
fit1 <- fit_cat(y, order = 1)
fit2 <- fit_cat(y, order = 2)
c(AIC_0 = fit0$aic, AIC_1 = fit1$aic, AIC_2 = fit2$aic)
# EM fit with missing data
y_miss <- y
y_miss[sample(length(y_miss), size = round(0.15 * length(y_miss)))] <- NA
fit_em <- fit_cat(
y_miss,
order = 1,
na_action = "em",
em_max_iter = 80,
em_tol = 1e-6
)
fit_em$settings$n_iter
fit_em$settings$cell_counts_type
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.