psqn_hess | R Documentation |
Computes the Hessian using numerical differentiation with Richardson extrapolation.
psqn_hess( val, fn, n_ele_func, n_threads = 1L, env = NULL, eps = 0.001, scale = 2, tol = 1e-09, order = 6L ) psqn_generic_hess( val, fn, n_ele_func, n_threads = 1L, env = NULL, eps = 0.001, scale = 2, tol = 1e-09, order = 6L )
val |
Where to evaluate the function at. |
fn |
Function to compute the element functions and their derivatives.
See |
n_ele_func |
Number of element functions. |
n_threads |
Number of threads to use. |
env |
Environment to evaluate |
eps |
Determines the step size. See the details. |
scale |
Scaling factor in the Richardson extrapolation. See the details. |
tol |
Relative convergence criteria. See the details. |
order |
Maximum number of iteration of the Richardson extrapolation. |
The function computes the Hessian using numerical differentiation with centered differences and subsequent use of Richardson extrapolation to refine the estimate.
The additional arguments are as follows: The numerical differentiation
is applied for each argument with a step size of
s = max(eps, |x| * eps)
.
The Richardson extrapolation at iteration i
uses a step size of
s * scale^(-i)
. The convergence threshold for each comportment of
the gradient is max(tol, |gr(x)[j]| * tol)
.
The numerical differentiation is done on each element function and thus much more efficient then doing it on the whole gradient.
# assign model parameters, number of random effects, and fixed effects q <- 2 # number of private parameters per cluster p <- 1 # number of global parameters beta <- sqrt((1:p) / sum(1:p)) Sigma <- diag(q) # simulate a data set set.seed(66608927) n_clusters <- 20L # number of clusters sim_dat <- replicate(n_clusters, { n_members <- sample.int(8L, 1L) + 2L X <- matrix(runif(p * n_members, -sqrt(6 / 2), sqrt(6 / 2)), p) u <- drop(rnorm(q) %*% chol(Sigma)) Z <- matrix(runif(q * n_members, -sqrt(6 / 2 / q), sqrt(6 / 2 / q)), q) eta <- drop(beta %*% X + u %*% Z) y <- as.numeric((1 + exp(-eta))^(-1) > runif(n_members)) list(X = X, Z = Z, y = y, u = u, Sigma_inv = solve(Sigma)) }, simplify = FALSE) # evaluates the negative log integrand. # # Args: # i cluster/element function index. # par the global and private parameter for this cluster. It has length # zero if the number of parameters is requested. That is, a 2D integer # vector the number of global parameters as the first element and the # number of private parameters as the second element. # comp_grad logical for whether to compute the gradient. r_func <- function(i, par, comp_grad){ dat <- sim_dat[[i]] X <- dat$X Z <- dat$Z if(length(par) < 1) # requested the dimension of the parameter return(c(global_dim = NROW(dat$X), private_dim = NROW(dat$Z))) y <- dat$y Sigma_inv <- dat$Sigma_inv beta <- par[1:p] uhat <- par[1:q + p] eta <- drop(beta %*% X + uhat %*% Z) exp_eta <- exp(eta) out <- -sum(y * eta) + sum(log(1 + exp_eta)) + sum(uhat * (Sigma_inv %*% uhat)) / 2 if(comp_grad){ d_eta <- -y + exp_eta / (1 + exp_eta) grad <- c(X %*% d_eta, Z %*% d_eta + dat$Sigma_inv %*% uhat) attr(out, "grad") <- grad } out } # compute the hessian set.seed(1) par <- runif(p + q * n_clusters, -1) hess <- psqn_hess(val = par, fn = r_func, n_ele_func = n_clusters) # compare with numerical differentiation from R if(require(numDeriv)){ hess_num <- jacobian(function(x){ out <- numeric(length(x)) for(i in seq_len(n_clusters)){ out_i <- r_func(i, x[c(1:p, 1:q + (i - 1L) * q + p)], TRUE) out[1:p] <- out[1:p] + attr(out_i, "grad")[1:p] out[1:q + (i - 1L) * q + p] <- attr(out_i, "grad")[1:q + p] } out }, par) cat("Output of all.equal\n") print(all.equal(Matrix(hess_num, sparse = TRUE), hess)) }
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.