logreg2ph: Sieve maximum likelihood estimator (SMLE) for two-phase...

View source: R/logreg2ph.R

logreg2ph R Documentation

Sieve maximum likelihood estimator (SMLE) for two-phase logistic regression problems

Description

This function returns the sieve maximum likelihood estimators (SMLE) for the logistic regression model from Lotspeich et al. (2021).

Usage

logreg2ph(
  Y_unval = NULL,
  Y_val = NULL,
  X_unval = NULL,
  X_val = NULL,
  C = NULL,
  Validated = NULL,
  Bspline = NULL,
  data,
  theta_pred = NULL,
  gamma_pred = NULL,
  initial_lr_params = "Zero",
  h_N_scale = 1,
  noSE = FALSE,
  TOL = 1e-04,
  MAX_ITER = 1000
)

Arguments

Y_unval

Column name with the unvalidated outcome. If Y_unval is NULL, the outcome is assumed to be error-free.

Y_val

Column name with the validated outcome.

X_unval

Column name(s) with the unvalidated predictors. If X_unval and X_val are NULL, all predictors are assumed to be error-free.

X_val

Column name(s) with the validated predictors. If X_unval and X_val are NULL, all predictors are assumed to be error-free.

C

(Optional) Column name(s) with additional error-free covariates.

Validated

Column name with the validation indicator. The validation indicator can be defined as Validated = 1 or TRUE if the subject was validated and Validated = 0 or FALSE otherwise.

Bspline

Vector of column names containing the B-spline basis functions.
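The basis columns named in Bspline must already exist in data. A minimal sketch of one way to build them with splines::bs follows; the variable name Xstar, the basis size of 8, and the bs1, bs2, ... column names are assumptions chosen for illustration, not requirements stated by this page.

```r
library(splines)

set.seed(1)
n <- 200

# Hypothetical error-prone predictor (the name Xstar is an assumption)
Xstar <- rnorm(n)

# Cubic B-spline basis with 8 basis functions evaluated at Xstar
basis <- bs(Xstar, df = 8, degree = 3, intercept = TRUE)

# Store the basis as a plain numeric matrix with simple column names
B <- matrix(as.vector(basis), nrow = n)
colnames(B) <- paste0("bs", seq_len(ncol(B)))

# Attach the basis columns to the data and keep their names for Bspline
dat <- data.frame(Xstar = Xstar, B)
Bspline_cols <- colnames(B)
```

Bspline_cols can then be passed as the Bspline argument, with dat (plus the outcome and validation columns) as data.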

data

A dataframe with one row per subject containing columns: Y_unval, Y_val, X_unval, X_val, C, Validated, and Bspline.

theta_pred

Vector of columns in data that pertain to the predictors in the analysis model.

gamma_pred

Vector of columns in data that pertain to the predictors in the outcome error model.

initial_lr_params

Initial values for parametric model parameters. Choices include (1) "Zero" (non-informative starting values) or (2) "Complete-data" (estimated based on validated subjects only).
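To illustrate what complete-data starting values mean conceptually, the sketch below fits an ordinary logistic regression to the validated subjects only. It is not taken from the package source; the simulated data and the column names Y, X, and V are assumptions.

```r
set.seed(2)
n <- 500

# Simulated true predictor and outcome (illustrative values only)
X <- rnorm(n)
Y <- rbinom(n, 1, plogis(-0.5 + 1 * X))

# Roughly 30% of subjects are validated; the rest have no validated values
V <- rbinom(n, 1, 0.3) == 1
dat <- data.frame(Y = ifelse(V, Y, NA), X = ifelse(V, X, NA), V = V)

# Complete-data fit: logistic regression restricted to validated rows
fit_cd <- glm(Y ~ X, family = binomial, data = dat[dat$V, ])
init_theta <- coef(fit_cd)
init_theta
```

Starting values of this kind usually sit closer to the final estimates than zeros, at the cost of one extra model fit.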

h_N_scale

Size of the perturbation used in estimating the standard errors via profile likelihood. If none is supplied, default is h_N_scale = 1.

noSE

Indicator for whether standard error estimation should be skipped. Defaults to noSE = FALSE, in which case standard errors are estimated.

TOL

Tolerance between iterations in the EM algorithm used to define convergence.

MAX_ITER

Maximum number of iterations allowed in the EM algorithm.
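The arguments above can be put together as in the following sketch for a setting with an error-prone predictor and an error-free outcome. The simulated data, the column names, and the use of NA for the validated predictor of unvalidated subjects are assumptions; the call itself is guarded so the code also runs when the package is not installed.

```r
library(splines)

set.seed(3)
n <- 400

# Simulated data (illustrative): true X, error-prone Xstar, binary outcome Y
X <- rnorm(n)
Xstar <- X + rnorm(n, sd = 0.5)
Y <- rbinom(n, 1, plogis(-0.5 + X))

# First 100 subjects are validated (assumed design, for illustration)
V <- seq_len(n) <= 100

# B-spline basis in the unvalidated predictor
basis <- bs(Xstar, df = 6, degree = 3, intercept = TRUE)
B <- matrix(as.vector(basis), nrow = n)
colnames(B) <- paste0("bs", seq_len(ncol(B)))

# One row per subject; validated X is set to NA when not validated (assumption)
dat <- data.frame(Y = Y, X = ifelse(V, X, NA), Xstar = Xstar,
                  V = as.integer(V), B)

# Y_unval is left at its NULL default, so the outcome is treated as error-free
args <- list(Y_val = "Y", X_unval = "Xstar", X_val = "X",
             Validated = "V", Bspline = colnames(B), data = dat,
             noSE = FALSE, TOL = 1e-4, MAX_ITER = 1000)

# Only call the function if the package is available
if (requireNamespace("logreg2ph", quietly = TRUE)) {
  fit <- do.call(logreg2ph::logreg2ph, args)
}
```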

Value

model_coeff

dataframe with final model coefficients and standard error estimates (where applicable) for the analysis model.

outcome_error_coeff

dataframe with final model coefficients for the outcome error model.

bspline_coeff

dataframe with B-spline coefficients for the covariate error model.

converged

indicator of EM algorithm convergence for parameter estimates.

se_converged

indicator of standard error estimate convergence.

converged_msg

(where applicable) description of non-convergence.

iterations

number of iterations completed by EM algorithm to find parameter estimates.
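Before interpreting the coefficients it is worth checking the convergence fields. The helper below is a hypothetical sketch (report_convergence is not part of the package), and mock_fit is a stand-in list that only mimics the element names documented above.

```r
# Hypothetical helper: summarise the convergence fields of a fitted object
report_convergence <- function(fit) {
  if (!isTRUE(fit$converged)) {
    return(paste("EM did not converge:", fit$converged_msg))
  }
  if (!isTRUE(fit$se_converged)) {
    return(sprintf("Estimates converged in %d iterations; standard errors did not.",
                   fit$iterations))
  }
  sprintf("Converged in %d iterations.", fit$iterations)
}

# Stand-in for a fitted object, using only the documented element names
mock_fit <- list(converged = TRUE, se_converged = TRUE,
                 converged_msg = "", iterations = 42L)

report_convergence(mock_fit)
```

If standard errors were not requested, se_converged may not be TRUE, so the second message should be read in light of the noSE setting.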


sarahlotspeich/logreg2ph_R_only documentation built on Jan. 20, 2025, 6:20 p.m.