knitr::opts_chunk$set(echo = TRUE, eval = F) library(tidyverse) if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools") library(devtools) if (!requireNamespace("BhGLM", quietly = TRUE)) install_github("boyiguo1/BhGLM@spline") library(BhGLM) library(BHAM)
When I was studying for the problem of estimating effect degree of freedom, I dug down to the rabit hole and tried to understand more about the /regularization/shrinkage/penalty from a Bayesian hierarchical model, specifically spike-and-slab mixture normal distribution.
To my understanding, we can write the mixture normal prior to a $l_2$ norm assuming the density (bernoulli($\pi$)) of the indicator variable and the scale parameter (s_0, s_1) are given. Specifically,
$$
\lambda_{l_2} = \text{[TODO: add equation]},
$$
where the regularization parameter $\lambda$ is in an inverse relationship with the scale parameter of (mixture) normal distribution. In the bglm function, the returning vector prior.scale
should match to this penalty parameter. The function returns the regularization parameter $\lambda$ as well as prior.scale
as an starting value for Bayesian hierarchical counterparts.
N = 1000 K = 100 x = sim.x(n=N, m=K, corr=0.6) # simulate correlated continuous variables h = rep(0.1, 4) # assign four non-zero main effects to have the assumed heritabilty nz = as.integer(seq(5, K, by=K/length(h))); nz yy = sim.y(x=x[, nz], mu=0, herit=h, p.neg=0.5, sigma=1.6) # simulate responses yy$coefs y = yy$y.normal; fam = "gaussian"; y = scale(y) # y = yy$y.ordinal; fam = "binomial" # y = yy$y.surv; fam = "cox" f1 = glmNet(x, y, family = fam, ncv = 1) c(f1$lambda, f1$prior.scale)
K = 1000 x = sim.x(n=N, m=K, corr=0.6) # simulate correlated continuous variables h = rep(0.1, 4) # assign four non-zero main effects to have the assumed heritabilty nz = as.integer(seq(5, K, by=K/length(h))); nz yy = sim.y(x=x[, nz], mu=0, herit=h, p.neg=0.5, sigma=1.6) # simulate responses yy$coefs y = yy$y.normal; fam = "gaussian"; y = scale(y) # y = yy$y.ordinal; fam = "binomial" # y = yy$y.surv; fam = "cox" f2 = glmNet(x, y, family = fam, ncv = 1) c(f2$lambda, f2$prior.scale)
Via the example, we can see that lambda and prior.scale increase and decrease respectivelly when the number of predictors increase from 100 to 1000 while the number of effective predictors remains the same.
To further explain prior.scale
, when the effect is small, the prior.scale is small, and hence the shrinkage impose on the variable are large. On the contrary, prior.scale is large when the effect is large, and hence the shrinkage is small.
bglm
In this part, I would like to analyze if grouping each individual predictors will change the prediction. I am hypothesizing it will not change by adding the grouping arguments
N = 1000 K = 100 x = sim.x(n=N, m=K, corr=0.6) # simulate correlated continuous variables h = rep(0.1, 4) # assign four non-zero main effects to have the assumed heritabilty nz = as.integer(seq(5, K, by=K/length(h))); nz yy = sim.y(x=x[, nz], mu=0, herit=h, p.neg=0.5, sigma=1.6) # simulate responses yy$coefs y = yy$y.normal; fam = "gaussian"; y = scale(y) # y = yy$y.ordinal; fam = "binomial" # y = yy$y.surv; fam = "cox" f2 <- bglm(y ~ ., data= x, family = fam, prior = mt(df=Inf)) f3 <- bglm(y ~ ., data= x, family = fam, prior = mt(df=Inf), group = 1)
head(coef(f2)) head(coef(f3))
head(data.frame(f2$p, f3$p))
head(data.frame(f2$ptheta, f3$ptheta))
head(data.frame(f2$prior.scale, f3$prior.scale))
head(data.frame(f2$prior.sd, f3$prior.sd))
In this scenario, I would like to study the behaviour when some of the covariates are included in the model, but not included in a group argument. The main quesiton of mine is that if these covariates of adjustment will be penalized or not.
N = 1000 K = 100 x = sim.x(n=N, m=K, corr=0.6) # simulate correlated continuous variables h = rep(0.1, 4) # assign four non-zero main effects to have the assumed heritabilty nz = as.integer(seq(5, K, by=K/length(h))); nz yy = sim.y(x=x[, nz], mu=0, herit=h, p.neg=0.5, sigma=1.6) # simulate responses yy$coefs y = yy$y.normal; fam = "gaussian"; y = scale(y) # y = yy$y.ordinal; fam = "binomial" # y = yy$y.surv; fam = "cox" ## Screening # feature_name <- # gam_screening_res <- x %>% # names() %>% # map_dfr(#feature_name[1:2], # .f = function(name, y, .dat){ # # name <- feature_name[1] # # y <- .dat %>% pull(death3yr)D # x <- .dat %>% pull({{name}}) # mgcv::gam(y ~ s(x, bs = "cr", k = 10), family = gaussian(), # data = data.frame(x, y)) %>% # broom::tidy() %>% # filter(term == "s(x)") %>% # select(p.value) %>% # mutate(var = name, # p.value) # # }, # y = y, # .dat = x) # # data.frame( # gam_screening_res, # p.adj = p.adjust(gam_screening_res$p.value, "BH") # ) %>% arrange(p.adj) f2 <- bglm(y ~ ., data= x, family = fam, prior = mt(df=Inf)) f3 <- bglm(y ~ ., data= x, family = fam, prior = mt(df=Inf), group = purrr::map(5:K, ~.x))
What is wrong in this case? Why the warning messages?
head(data.frame(f2$p, f3$p))
head(data.frame(f2$ptheta, f3$ptheta))
head(data.frame(f2$prior.scale, f3$prior.scale))
head(data.frame(f2$prior.sd, f3$prior.sd))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.