summary.mdyplFit: Summary method for "mdyplFit" objects

View source: R/mdyplFit.R

Description

Summary method for "mdyplFit" objects

Usage

## S3 method for class 'mdyplFit'
summary(
  object,
  hd_correction = FALSE,
  se_start,
  gh = NULL,
  prox_tol = 1e-10,
  transform = TRUE,
  init_iter = 50,
  init_method = "Nelder-Mead",
  ...
)

## S3 method for class 'summary.mdyplFit'
print(
  x,
  digits = max(3L, getOption("digits") - 3L),
  symbolic.cor = x$symbolic.cor,
  signif.stars = getOption("show.signif.stars"),
  ...
)

Arguments

object

an object of class "mdyplFit", typically the result of a call to glm() with method = "mdyplFit".

hd_correction

if FALSE (default), the summary corresponding to standard asymptotics is computed. If TRUE, the high-dimensionality corrections in Sterzinger & Kosmidis (2024) are employed to update the estimates, estimated standard errors, and z-statistics. See Details.

se_start

a vector of starting values for the state evolution equations. See the start argument in solve_se().

gh

A list with the Gauss-Hermite quadrature nodes and weights, as returned from statmod::gauss.quad() with kind = "hermite". Default is NULL, in which case gh is set to statmod::gauss.quad(200, kind = "hermite").

prox_tol

tolerance for the computation of the proximal operator; default is 1e-10.

transform

if TRUE (default), the optimization is with respect to log(mu), log(b), log(sigma) (and iota if intercept is numeric). If FALSE, it is over mu, b, sigma (and iota if intercept is numeric). The solution is returned in terms of the latter set, regardless of how the optimization took place.

init_iter

the number of optim() iterations used to compute starting values for nleqslv::nleqslv(). Default is 50; it can also be 0, in which case start is passed directly to nleqslv::nleqslv(). init_iter = "only" results in only optim() being used. See Details.

init_method

The method to be passed to optim(). Default is "Nelder-Mead".

...

further arguments to be passed to summary.glm().

x

an object of class "summary.mdyplFit", usually a result of a call to summary.mdyplFit().

digits

the number of significant digits to use when printing.

symbolic.cor

logical. If TRUE, print the correlations in a symbolic form (see symnum) rather than as numbers.

signif.stars

logical. If TRUE, ‘significance stars’ are printed for each coefficient.

Details

If hd_correction = TRUE, the square root of the corrupted signal strength is estimated from object using the sloe() estimator, as are the conditional variances of each covariate given the others (excluding the intercept). The latter are estimated using the residual sums of squares from the linear regression of each covariate on all the others, as proposed in Zhao et al. (2022, Section 5.1). Then the appropriate state evolution equations are solved using solve_se() with corrupted = TRUE, and the resulting constants are used to rescale the estimates and to adjust the estimated standard errors and z-statistics as in Sterzinger & Kosmidis (2024).
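The conditional-variance step above can be sketched in a few lines of base R. This is an illustrative reconstruction of the recipe (regress each covariate on all the others and use the residual sum of squares), not brglm2's internal code; in particular, the divisor n is an assumption, and the internal implementation may scale the residual sums of squares differently.

```r
## Illustrative sketch: estimate the conditional variance of each
## covariate given the others from the residual sums of squares of
## column-wise linear regressions (the divisor n is an assumption).
set.seed(1)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
cond_vars <- vapply(seq_len(p), function(j) {
    ## regress covariate j on an intercept and all other covariates
    fit <- lm.fit(x = cbind(1, X[, -j, drop = FALSE]), y = X[, j])
    sum(fit$residuals^2) / n
}, numeric(1))
cond_vars  # one estimated conditional variance per covariate
```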

The key assumptions under which the rescaled estimates and corrected standard errors and z-statistics are asymptotically valid are that the covariates have sub-Gaussian distributions, and that the signal strength, which is the limit

\gamma^2

of var(X \beta), is finite as p / n \to \kappa \in (0, 1). See Sterzinger & Kosmidis (2024).

If hd_correction = TRUE, and the model has an intercept, then the result provides only a corrected estimate of the intercept with no accompanying standard error, z-statistic, and p-value. Also, vcov(summary(object, hd_correction = TRUE)) is always NULL. Populating those objects with appropriate estimates is the subject of current work.

Value

A list with objects as in the result of stats::summary.glm(), with extra component se_parameters, which is the vector of the solution to the state evolution equations with extra attributes (see solve_se()).
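A minimal usage sketch for inspecting the extra component (assuming fit_mdypl is a fitted "mdyplFit" object, as constructed in the Examples):

```r
## Not run:
## hd_summary <- summary(fit_mdypl, hd_correction = TRUE)
## hd_summary$se_parameters              # solution of the state evolution equations
## attributes(hd_summary$se_parameters)  # extra attributes from solve_se()
## End(Not run)
```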

Author(s)

Ioannis Kosmidis [aut, cre] ioannis.kosmidis@warwick.ac.uk

References

Zhao Q, Sur P, Candès E J (2022). The asymptotic distribution of the MLE in high-dimensional logistic models: Arbitrary covariance. Bernoulli, 28, 1835–1861. doi:10.3150/21-BEJ1401.

Sterzinger P, Kosmidis I (2024). Diaconis-Ylvisaker prior penalized likelihood for p/n \to \kappa \in (0,1) logistic regression. arXiv:2311.07419v2, https://arxiv.org/abs/2311.07419.

See Also

mdyplFit(), solve_se()

Examples


## Not run: 

set.seed(123)
n <- 2000
p <- 400
betas <- c(rnorm(p / 2, mean = 7, sd = 1), rep(0, p / 2))
X <- matrix(rnorm(n * p, 0, 1/sqrt(n)), nrow = n, ncol = p)
probs <- plogis(drop(X %*% betas))
y <- rbinom(n, 1, probs)
fit_mdypl <- glm(y ~ -1 + X, family = binomial(), method = "mdyplFit")

st_summary <- summary(fit_mdypl)
hd_summary <- summary(fit_mdypl, hd_correction = TRUE)

cols <- hcl.colors(3, alpha = 0.2)
par(mfrow = c(1, 2))
plot(betas, type = "l", ylim = c(-3, 14),
     main = "MDYPL estimates",
     xlab = "Parameter index", ylab = NA)
points(coef(st_summary)[, "Estimate"], col = NA, bg = cols[1], pch = 21)

plot(betas, type = "l", ylim = c(-3, 14),
     main = "rescaled MDYPL estimates",
     xlab = "Parameter index", ylab = NA)
points(coef(hd_summary)[, "Estimate"], col = NA, bg = cols[2], pch = 21)

## z-statistics
z_mdypl <- coef(st_summary)[betas == 0, "z value"]
qqnorm(z_mdypl, col = NA, bg = cols[1], pch = 21, main = "z value")
abline(0, 1, lty = 2)
z_c_mdypl <- coef(hd_summary)[betas == 0, "z value"]
qqnorm(z_c_mdypl, col = NA, bg = cols[2], pch = 21, main = "corrected z value")
abline(0, 1, lty = 2)


## End(Not run)


brglm2 documentation built on Aug. 29, 2025, 5:25 p.m.