iwls_logit: IWLS Solver for Binary Logistic Regression Model

View source: R/iwls_logit.R

iwls_logitR Documentation

IWLS Solver for Binary Logistic Regression Model

Description

Iterative weighted least squares solver for a logistic regression model. Used by foehnix.unreg.fit to estimate the concomitant model of the two-component foehnix mixture models.

Usage

iwls_logit(
  X,
  y,
  beta = NULL,
  lambda = NULL,
  standardize = TRUE,
  maxit = 100L,
  tol = 1e-08,
  verbose = FALSE,
  ...
)

## S3 method for class 'ccmodel'
coef(object, which = "coef", ...)

## S3 method for class 'ccmodel'
logLik(object, ...)

## S3 method for class 'ccmodel'
AIC(object, ...)

## S3 method for class 'ccmodel'
BIC(object, ...)

## S3 method for class 'ccmodel'
edf(object, ...)

## S3 method for class 'ccmodel'
summary(object, ...)

Arguments

X

model matrix including intercept (if required).

y

response, can be binary or probabilities (values in ]0,1[).

beta

initial regression coefficients. If not set (beta = NULL, default), all coefficients will be initialized with 0.

lambda

if set to NULL (default) no penalty is used during optimization. A float can be provided for regularization (ridge/L2 penalty).

standardize

logical. If set to TRUE (default) the model matrix containing the concomitant variables will be standardized.

maxit

integer, maximum number of iterations, default 100.

tol

float, tolerance for the improvement to check for convergence.

verbose

logical, if set to FALSE output is suppressed.

...

currently unused.

object

object of type ccmodel (S3 methods).

which

character, defines whether the coefficients should be returned on the original scale (which = "coef") or the standardized scale (which = "beta"; identical if standardize was FALSE).

Details

Iterative (re-)weighted least squares solver for logistic regression model. The basic call (iwls_solver(X, y)) solves the unregularized problem. Input matrix X is the design matrix containing the concomitant variables for the logistic regression model. Matrix is of dimension N x P where N is the number of of observations, P the number of concomitant variables (including the intercept, if required). If more than one column contains constant values the script will throw an error (solution no more identifiable). y is the binary response vector of length N containing 0s and 1s.

beta can be used to specify the initial regression parameters. If not set (beta = NULL; default) all parameters will be initialized with 0s.

If lambda is set (float) a ridge penalty will be added to shrink the regression parameters.

The logical option standardize controls whether or not the model matrix (covariates) should be standardized using Gaussian standardization ((x - mean(x)) / sd(x)) for all columns with non-constant data. It is recommended to use standardization (standardize = TRUE) to avoid numerical problems.

maxit and tol allow to control the iterations of the IWLS solver. maxit is the maximum number of iterations allowed. tol is used to check the log-likelihood improvements. If the improvements compared with the previous iteration falls below this tolerance the optimizer converged. If maxit is reached the solver will stop, even if not converged.

Value

Returns an object of class ccmodel (concomitant model). The object contains the following information:

  • lambda value used (or NULL).

  • edf effective degrees of freedom. Equal to P (ncol(X)) if no regularization is used.

  • loglik final log-likelihood sum of the model.

  • AIC Akaike information criteria.

  • BIC Bayesian information criteria.

  • converged logical flag whether or not the algorithm converged (see maxit, tol).

  • beta matrix of dimension P x 1 containing the estimated and possibly standardized regression coefficients (see input standardize). If input standardize = FALSE beta == coef.

  • coef matrix of dimension P x 1 containing the destandardized regression coefficients.

Author(s)

Reto Stauffer

See Also

destandardize_coefficients, standardize, is.standardized

Examples

# Example data set
data("airquality")
airquality <- na.omit(airquality)
airquality$Ozone <- as.numeric(airquality$Ozone > 50)

# glm model
m1 <- glm(Ozone ~ ., data = airquality, family = binomial(link = "logit"))

# Setting up model.frame, response, and model matrix
mf <- model.frame(Ozone ~ ., data = airquality)
X  <- model.matrix(Ozone ~ ., data = airquality)
y  <- model.response(mf)

# Default call
m2 <- iwls_logit(X, y)
# With standardized coefficients
m3 <- iwls_logit(X, y, standardize = TRUE)
# No early stop, stop when maxit = 100 is reached. Will through
m4 <- iwls_logit(X, y, standardize = TRUE, tol = -Inf, maxit = 100)

# Comparing coefficients
print(cbind(coef(m1), m2$coef, m3$coef, m4$coef))


retostauffer/Rfoehnix documentation built on June 5, 2023, 11:39 p.m.