lpmec_onerun: lpmec_onerun

View source: R/lpme_DoOneRun.R

lpmec_onerun    R Documentation

lpmec_onerun

Description

Implements analysis for latent variable models with measurement error correction

Usage

lpmec_onerun(
  Y,
  observables,
  observables_groupings = colnames(observables),
  make_observables_groupings = FALSE,
  estimation_method = "em",
  latent_estimation_fn = NULL,
  mcmc_control = list(backend = "pscl", n_samples_warmup = 500L, n_samples_mcmc = 1000L,
    batch_size = 512L, chain_method = "parallel", subsample_method = "full",
    n_thin_by = 1L, n_chains = 2L),
  ordinal = FALSE,
  conda_env = "lpmec",
  conda_env_required = FALSE
)

Arguments

Y

A vector of observed outcome values, one per observation

observables

A matrix of observable indicators used to estimate the latent variable

observables_groupings

A vector specifying groupings for the observable indicators. Default is column names of observables.

make_observables_groupings

Logical. If TRUE, creates dummy variables for each level of the observable indicators. Default is FALSE.
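
As a rough illustration of what dummy expansion involves, the base R sketch below turns each categorical indicator into one 0/1 column per level and records which original variable each dummy column came from. The toy data, the column naming scheme, and the `groupings` vector are assumptions made for illustration; the encoding lpmec_onerun builds internally is not shown on this page and may differ.

```r
# Toy categorical indicators (hypothetical data, for illustration only)
obs <- data.frame(
  q1 = c("a", "b", "c", "a"),
  q2 = c("x", "y", "x", "y")
)

# Expand every column into one 0/1 dummy column per level
dummies <- do.call(cbind, lapply(names(obs), function(v) {
  f <- factor(obs[[v]])
  m <- model.matrix(~ f - 1)                      # no intercept: one column per level
  colnames(m) <- paste(v, levels(f), sep = "_")   # e.g. "q1_a", "q1_b", ...
  m
}))

# Record the originating variable of each dummy column
groupings <- sub("_.*$", "", colnames(dummies))
```

Here `groupings` plays the role that observables_groupings plays in the signature: it ties the dummy columns of one indicator together so they can be treated as a unit.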

estimation_method

Character specifying the estimation approach. Options include:

  • "em" (default): Uses expectation-maximization via emIRT package. Supports both binary (via emIRT::binIRT) and ordinal (via emIRT::ordIRT) indicators.

  • "pca": First principal component of observables.

  • "averaging": Uses feature averaging.

  • "mcmc": Markov Chain Monte Carlo estimation using either pscl::ideal (R backend) or numpyro (Python backend).

  • "mcmc_joint": Joint Bayesian model that simultaneously estimates the latent variables and the outcome relationship using numpyro.

  • "mcmc_overimputation": Two-stage MCMC approach with measurement error correction via over-imputation.

  • "custom": Latent estimation is performed using latent_estimation_fn.

latent_estimation_fn

Custom function for estimating latent trait from observables if estimation_method="custom" (optional). The function should accept a matrix of observables (rows are observations) and return a numeric vector of length equal to the number of observations.
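
A minimal sketch of a function that satisfies this contract is shown below: it takes a matrix of observables with observations in rows and returns one numeric score per row. The name `standardized_row_mean` and the scoring rule (a standardized row mean) are illustrative assumptions, not something the package provides.

```r
# Hypothetical custom estimator: standardized mean of each observation's indicators
standardized_row_mean <- function(observables) {
  m <- rowMeans(as.matrix(observables), na.rm = TRUE)
  as.numeric((m - mean(m)) / sd(m))   # numeric vector, one value per row
}

# Quick check on simulated binary indicators
set.seed(42)
toy <- matrix(sample(c(0, 1), 200 * 6, replace = TRUE), ncol = 6)
scores <- standardized_row_mean(toy)

# Intended use (not run here, requires the lpmec package):
# results <- lpmec_onerun(Y = Y, observables = observables,
#                         estimation_method = "custom",
#                         latent_estimation_fn = standardized_row_mean)
```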

mcmc_control

A list of settings used when an MCMC-based estimation method is selected. Elements include:

backend

Character string indicating the MCMC engine to use. Valid options are "pscl" (default, uses the R-based pscl::ideal function) or "numpyro" (uses the Python numpyro package via reticulate).

n_samples_warmup

Integer specifying the number of warm-up (burn-in) iterations before samples are collected. Default is 500.

n_samples_mcmc

Integer specifying the number of post-warmup MCMC iterations to retain. Default is 1000.

chain_method

Character string passed to numpyro specifying how to run multiple chains. Options: "parallel" (default), "sequential", or "vectorized".

n_thin_by

Integer indicating the thinning factor for MCMC samples. Default is 1.

n_chains

Integer specifying the number of MCMC chains to run. Default is 2.
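
One way to change only a few of these settings is to start from the defaults in the Usage section and override selected entries. This page does not say whether a partially specified list is merged with the defaults, so the sketch below builds a complete list with base R's modifyList; the names `default_mcmc_control` and `my_control` are illustrative.

```r
# Defaults copied from the Usage section
default_mcmc_control <- list(
  backend = "pscl",
  n_samples_warmup = 500L,
  n_samples_mcmc = 1000L,
  batch_size = 512L,
  chain_method = "parallel",
  subsample_method = "full",
  n_thin_by = 1L,
  n_chains = 4L - 2L
)

# Override a few entries while keeping every other default in place
my_control <- modifyList(
  default_mcmc_control,
  list(n_samples_warmup = 1000L, n_samples_mcmc = 2000L, n_chains = 4L)
)

# Intended use (not run here, requires the lpmec package):
# results <- lpmec_onerun(Y = Y, observables = observables,
#                         estimation_method = "mcmc",
#                         mcmc_control = my_control)
```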

ordinal

Logical indicating whether the observable indicators are ordinal (TRUE) or binary (FALSE). Default is FALSE.

conda_env

A character string specifying the name of the conda environment to use via reticulate. Default is "lpmec".

conda_env_required

A logical indicating whether the specified conda environment is required. If TRUE, an error is thrown if the environment is not found. Default is FALSE.

Details

This function implements a latent variable analysis with measurement error correction. It splits the observable indicators into two sets, estimates latent variables using each set, and then applies various correction methods including OLS correction and instrumental variable approaches.
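
The split-sample idea can be sketched in base R on simulated data. This is an illustration under simplifying assumptions (a known latent trait, row-mean scoring, a single random split), not the package's implementation: the variable names echo the returned values for readability, and the simulation design is hypothetical.

```r
set.seed(1)
n <- 5000
x_true <- rnorm(n)                      # latent trait (known only in simulation)
Y <- 1 * x_true + rnorm(n)              # outcome with true slope 1
observables <- sapply(1:10, function(j) rbinom(n, 1, plogis(1.5 * x_true)))

# Randomly split the indicators into two disjoint sets
cols <- sample(ncol(observables))
set1 <- observables[, cols[1:5]]
set2 <- observables[, cols[6:10]]

# Estimate the latent trait separately from each set (standardized row means)
x_est1 <- as.numeric(scale(rowMeans(set1)))
x_est2 <- as.numeric(scale(rowMeans(set2)))

# Naive OLS: attenuated because x_est1 measures the trait with error
ols_coef <- unname(coef(lm(Y ~ x_est1))[2])

# IV: use the second estimate as an instrument for the first
iv_coef <- cov(Y, x_est2) / cov(x_est1, x_est2)

# Agreement between the two estimates, a rough gauge of reliability
split_cor <- cor(x_est1, x_est2)
```

Because the two estimates come from disjoint indicators, their measurement errors are plausibly independent, which is what lets one estimate serve as an instrument for the other. In this simulation the naive OLS slope is pulled toward zero relative to the IV slope.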

Value

A list containing various estimates and statistics:

  • ols_coef: Coefficient from naive OLS regression

  • ols_se: Standard error of naive OLS coefficient

  • ols_tstat: T-statistic of naive OLS coefficient

  • iv_coef_a: IV coefficient using first split as instrument

  • iv_coef_b: IV coefficient using second split as instrument

  • iv_coef: Averaged IV coefficient from both splits

  • iv_se: Standard error of IV regression coefficient

  • iv_tstat: T-statistic of IV regression coefficient

  • corrected_iv_coef_a: Corrected IV coefficient using first split as instrument

  • corrected_iv_coef_b: Corrected IV coefficient using second split as instrument

  • corrected_iv_coef: Averaged corrected IV coefficient from both splits

  • corrected_iv_se: Standard error of corrected IV coefficient

  • corrected_iv_tstat: T-statistic of corrected IV coefficient

  • corrected_ols_coef_a: Corrected OLS coefficient using first split

  • corrected_ols_coef_b: Corrected OLS coefficient using second split

  • corrected_ols_coef: Averaged corrected OLS coefficient from both splits

  • corrected_ols_se: Standard error of corrected OLS coefficient (currently NA)

  • corrected_ols_tstat: T-statistic of corrected OLS coefficient (currently NA)

  • corrected_ols_coef_alt: Alternative corrected OLS coefficient (currently NA)

  • var_est_split: Estimated variance of the measurement error

  • x_est1: First set of latent variable estimates

  • x_est2: Second set of latent variable estimates

Standard Errors

The following quantities are currently returned as NA because their analytical derivation is not yet implemented:

  • corrected_ols_se: Standard error for the corrected OLS coefficient

  • corrected_ols_tstat: T-statistic for the corrected OLS coefficient

  • corrected_ols_coef_alt: Alternative corrected OLS coefficient

For inference on these quantities, use the bootstrap approach via lpmec, which provides valid confidence intervals and standard errors through resampling.
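
To show what resampling-based inference involves, the sketch below bootstraps a simple split-sample IV estimate in base R. It does not call lpmec, whose arguments are not documented on this page; the helper `split_iv`, the simulated data, and the choice of 200 replicates are assumptions made for illustration.

```r
# Hypothetical helper: IV slope from a fixed half-split of the indicators.
# Row means are left unstandardized, so the slope is on the row-mean scale.
split_iv <- function(Y, observables) {
  half <- seq_len(ncol(observables) %/% 2)
  x1 <- rowMeans(observables[, half, drop = FALSE])
  x2 <- rowMeans(observables[, -half, drop = FALSE])
  cov(Y, x2) / cov(x1, x2)
}

set.seed(2)
n <- 500
x_true <- rnorm(n)
Y <- x_true + rnorm(n)
observables <- sapply(1:10, function(j) rbinom(n, 1, plogis(1.5 * x_true)))

# Nonparametric bootstrap: resample observations with replacement
boot_est <- replicate(200, {
  idx <- sample(n, replace = TRUE)
  split_iv(Y[idx], observables[idx, , drop = FALSE])
})

boot_se <- sd(boot_est)                                   # bootstrap standard error
boot_ci <- unname(quantile(boot_est, c(0.025, 0.975)))    # percentile interval
```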

Examples


# Generate some example data
set.seed(123)
Y <- rnorm(1000)
observables <- as.data.frame(matrix(sample(c(0,1), 1000*10, replace = TRUE), ncol = 10))

# Run the analysis
results <- lpmec_onerun(Y = Y,
                        observables = observables)

# View the corrected estimates
print(results)



lpmec documentation built on Feb. 9, 2026, 5:07 p.m.