equation_binary: Set up a system of equations

View source: R/equation_binary.R

equation_binary    R Documentation

Set up a system of equations

Description

This function sets up a system of four equations for binary regression. The solution to the system characterizes the asymptotic bias and variance of the M-estimator, as well as the Hessian of the loss function (in the case of the MLE, the loss function is the negative log-likelihood).

Usage

equation_binary(
  rho_prime,
  f_prime1,
  f_prime0,
  kappa,
  gamma,
  beta0,
  intercept = TRUE
)
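
As a hedged illustration of the call signature above, the sketch below writes out logistic-model versions of the three function arguments. The helper names (rho_prime_logistic, f_prime1_logistic, f_prime0_logistic) and the numeric values are assumptions chosen for illustration, not taken from the package. The call itself is guarded so that it runs only when a function named equation_binary is available in the session.

```r
# Logistic-model ingredients written out explicitly (assumed forms,
# hypothetical helper names).
rho_prime_logistic <- function(t) 1 / (1 + exp(-t))  # P(Y = 1 | X'beta = t)
f_prime1_logistic  <- function(t) -1 / (1 + exp(t))  # d/dt of log(1 + exp(-t))
f_prime0_logistic  <- function(t) 1 / (1 + exp(-t))  # d/dt of log(1 + exp(t))

# Build the system only if equation_binary() has been loaded.
if (exists("equation_binary", mode = "function")) {
  eqs <- equation_binary(
    rho_prime = rho_prime_logistic,
    f_prime1  = f_prime1_logistic,
    f_prime0  = f_prime0_logistic,
    kappa     = 0.1,   # illustrative value of p/n
    gamma     = 1,     # illustrative signal strength
    beta0     = 0,     # illustrative intercept
    intercept = TRUE
  )
  # eqs is a function of the unknown parameters; see Value.
}
```

The two loss derivatives are obtained by differentiating the logistic negative log-likelihood in t for each label, so they can be swapped for the derivatives of another loss without changing the rest of the call.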

Arguments

rho_prime

A function that computes the success probability \rho'(t) = \mathrm{P}(Y=1 | X^\top \beta = t), where \beta is the coefficient vector. The default is the logistic model.

f_prime1

A function. Derivative of the loss function when Y = 1. The default is the derivative of the negative log-likelihood of logistic regression when Y = 1.

f_prime0

A function. Derivative of the loss function when Y = -1. The default is the derivative of the negative log-likelihood of logistic regression when Y = -1.

kappa

Numeric. The problem dimension \kappa = p/n.

gamma

Numeric. Signal strength \gamma = \sqrt{\mathrm{Var}(X^\top \beta)}.

beta0

Numeric. The intercept \beta_0 of the model.

intercept

Logical. If TRUE, the GLM contains an intercept term. The default is intercept = TRUE.
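
To make the definitions of kappa and gamma concrete, here is a minimal sketch on simulated data. It assumes the rows of a hypothetical design matrix X are i.i.d. standard normal, in which case Var(X^\top \beta) equals the squared norm of \beta. The names n, p, beta and X, and all numeric values, are illustrative assumptions.

```r
set.seed(1)

n <- 1000                 # number of observations (illustrative)
p <- 200                  # number of variables (illustrative)
kappa <- p / n            # problem dimension: 0.2

# Hypothetical coefficient vector with 16 nonzero entries of size 0.5.
beta <- c(rep(0.5, 16), rep(0, p - 16))

# With identity covariance, Var(X'beta) = sum(beta^2), so gamma = 2 here.
gamma <- sqrt(sum(beta^2))

# Empirical check: the sample standard deviation of the linear predictor
# should be close to gamma.
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
gamma_hat <- sd(as.vector(X %*% beta))
```

For a general covariance matrix \Sigma of the variables, the same quantity would be \sqrt{\beta^\top \Sigma \beta}.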

Details

The four equations are:

\begin{dcases}
\sigma^2\kappa^2 & = \E{\rho'(S_1)(\lambda\rho'(\mathrm{prox}_{\lambda\rho}(-S_2)))^2+\rho'(-S_1)(\lambda\rho'(\mathrm{prox}_{\lambda\rho}(S_2)))^2}\\
\sigma \sqrt{\kappa}(1-\kappa) & = \E{\rho'(S_1)Z_2\mathrm{prox}_{\lambda\rho}(\lambda+S_2) + \rho'(-S_1)Z_2\mathrm{prox}_{\lambda\rho}(S_2)}\\
\gamma_0 \alpha & = \E{\rho'(S_1)Z_1\mathrm{prox}_{\lambda\rho}(\lambda+S_2) + \rho'(-S_1)Z_1\mathrm{prox}_{\lambda\rho}(S_2)}\\
0 & = \E{-\rho'(S_1)\rho'(\mathrm{prox}_{\lambda\rho}(-S_2)) + \rho'(-S_1)\rho'(\mathrm{prox}_{\lambda\rho}(S_2))}
\end{dcases}

where (Z_1, Z_2)\sim\mathcal{N}(0, I_2) and

S_1 = \gamma_0 Z_1 + \beta_0, \quad S_2 = \alpha \gamma_0 Z_1 + \sigma\sqrt{\kappa} Z_2 + b_0.

When the fitted GLM does not include an intercept term (intercept = FALSE), b_0 = 0. If the model does not have an intercept, then \beta_0 = 0.
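
The quantities entering the expectations above can be sketched numerically. The block below assumes the logistic model, so \rho'(t) = 1/(1 + e^{-t}), and uses the standard definition of the proximal operator, \mathrm{prox}_{\lambda\rho}(z) = \arg\min_t \{\lambda\rho(t) + (t - z)^2/2\}, whose minimizer solves t + \lambda\rho'(t) = z. The function name prox_logistic and all parameter values are hypothetical; this is an illustration of the ingredients, not the package's implementation.

```r
rho_prime <- function(t) 1 / (1 + exp(-t))

# prox_{lambda * rho}(z): root of t + lambda * rho'(t) - z = 0.
# Because 0 < rho'(t) < 1, the root lies in [z - lambda, z].
prox_logistic <- function(z, lambda) {
  stopifnot(lambda > 0)
  g <- function(t) t + lambda * rho_prime(t) - z
  uniroot(g, lower = z - lambda, upper = z, tol = 1e-10)$root
}

# Illustrative parameter values (assumptions, not solutions of the system).
gamma0 <- 1; beta0 <- 0
alpha <- 1.2; sigma <- 2; kappa <- 0.1; b0 <- 0
lambda <- 1.5

# Draw (Z_1, Z_2) ~ N(0, I_2) and form S_1 and S_2 as defined above.
set.seed(1)
Z1 <- rnorm(5)
Z2 <- rnorm(5)
S1 <- gamma0 * Z1 + beta0
S2 <- alpha * gamma0 * Z1 + sigma * sqrt(kappa) * Z2 + b0

# Apply the proximal operator to each draw of S_2.
prox_S2 <- vapply(S2, prox_logistic, numeric(1), lambda = lambda)
```

Averaging functions of such draws over many samples gives a Monte Carlo approximation of an expectation; numerical integration over (Z_1, Z_2) is an alternative when higher accuracy is needed.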

Value

A function that takes as input the parameters (\alpha,\lambda,\sigma,b) and returns a vector of length 4 containing the values of the four equations. When the model contains no intercept term (intercept = FALSE), the function returns a system of three equations. In the special case with neither signal (gamma = 0) nor an intercept, it returns a system of two equations.

References

A modern maximum-likelihood theory for high-dimensional logistic regression, Pragya Sur and Emmanuel J. Candes, Proceedings of the National Academy of Sciences Jul 2019, 116 (29) 14516-14525


zq00/glmhd documentation built on April 7, 2023, 7:45 a.m.