pedmod_profile_prop: Computes Profile Likelihood Based Confidence Intervals for...

pedmod_profile_propR Documentation

Computes Profile Likelihood Based Confidence Intervals for the Proportion of Variance

Description

Constructs a likelihood ratio based confidence intervals for the proportion of variance for one of the effects.

Usage

pedmod_profile_prop(
  ptr,
  par,
  maxvls,
  minvls = -1L,
  alpha = 0.05,
  abs_eps,
  rel_eps,
  which_prof,
  indices = NULL,
  maxvls_start = max(100L, as.integer(ceiling(maxvls/5))),
  minvls_start = if (minvls < 0) minvls else minvls/5,
  do_reorder = TRUE,
  use_aprx = FALSE,
  n_threads = 1L,
  cluster_weights = NULL,
  method = 0L,
  seed = 1L,
  verbose = FALSE,
  max_step = 15L,
  opt_func = NULL,
  use_tilting = FALSE,
  vls_scales = NULL,
  bound = c(0.01, 0.99),
  ...
)

Arguments

ptr

object from pedigree_ll_terms.

par

numeric vector with the maximum likelihood estimator e.g. from pedmod_opt.

maxvls

maximum number of samples in the approximation for each marginal likelihood term.

minvls

minimum number of samples for each marginal likelihood term. Negative values provides a default which depends on the dimension of the integration.

alpha

numeric scalar with the confidence level required.

abs_eps

absolute convergence threshold for eval_pedigree_ll and eval_pedigree_grad.

rel_eps

rel_eps convergence threshold for eval_pedigree_ll and eval_pedigree_grad.

which_prof

the index of the random effect which proportion of variance should be profiled.

indices

zero-based vector with indices of which log marginal likelihood terms to include. Use NULL if all indices should be used.

maxvls_start, minvls_start

number of samples to use when finding the initial values for the optimization.

do_reorder

TRUE if a heuristic variable reordering should be used. TRUE is likely the best value.

use_aprx

TRUE if a less precise approximation of pnorm and qnorm should be used. This may reduce the computation time while not affecting the result much.

n_threads

number of threads to use.

cluster_weights

numeric vector with weights for each cluster. Use NULL if all clusters have weight one.

method

integer with the method to use. Zero yields randomized Korobov lattice rules while one yields scrambled Sobol sequences.

seed

seed to pass to set.seed before each gradient and function evaluation. Use NULL if the seed should not be fixed.

verbose

logical for whether output should be printed to the console during the estimation of the profile likelihood curve.

max_step

integer scalar with the maximum number of steps to take in either directions.

opt_func

function to perform minimization with arguments like optim. BFGS is used with optim if this argument is NULL.

use_tilting

TRUE if the minimax tilting method suggested by Botev (2017) should be used. See doi: 10.1111/rssb.12162.

vls_scales

can be a numeric vector with a positive scalar for each cluster. Then vls_scales[i] * minvls and vls_scales[i] * maxvls is used for cluster i rather than minvls and maxvls. Set vls_scales = NULL if the latter should be used.

bound

boundaries for the limits of the proportion. Has to be in between (0,1). This is useful particularly if the optimization fails to work on the default values.

...

arguments passed to opt_func.

Details

The function is only useful when there is more than one type of random effect. If not, then pedmod_profile can be used because of the scale invariance of the likelihood ratio.

Value

A list like pedmod_profile.

See Also

pedmod_opt, pedmod_sqn, pedmod_profile, and pedmod_profile_nleq.

Examples


# we simulate outcomes with an additive genetic effect and a childhood
# environment effect. The kinship matrix is the same for all families and
# given by
K <- matrix(c(
  0.5  , 0    , 0.25 , 0   , 0.25 , 0   , 0.125 , 0.125 , 0.125 , 0.125 ,
  0    , 0.5  , 0.25 , 0   , 0.25 , 0   , 0.125 , 0.125 , 0.125 , 0.125 ,
  0.25 , 0.25 , 0.5  , 0   , 0.25 , 0   , 0.25  , 0.25  , 0.125 , 0.125 ,
  0    , 0    , 0    , 0.5 , 0    , 0   , 0.25  , 0.25  , 0     , 0     ,
  0.25 , 0.25 , 0.25 , 0   , 0.5  , 0   , 0.125 , 0.125 , 0.25  , 0.25  ,
  0    , 0    , 0    , 0   , 0    , 0.5 , 0     , 0     , 0.25  , 0.25  ,
  0.125, 0.125, 0.25 , 0.25, 0.125, 0   , 0.5   , 0.25  , 0.0625, 0.0625,
  0.125, 0.125, 0.25 , 0.25, 0.125, 0   , 0.25  , 0.5   , 0.0625, 0.0625,
  0.125, 0.125, 0.125, 0   , 0.25 , 0.25, 0.0625, 0.0625, 0.5   , 0.25  ,
  0.125, 0.125, 0.125, 0   , 0.25 , 0.25, 0.0625, 0.0625, 0.25  , 0.5
), 10)

# the scale matrix for the childhood environment effect is also the same and
# given by
C <- matrix(c(
  1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 1, 0, 1, 0, 0, 0, 0, 0,
  0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
  0, 0, 1, 0, 1, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
  0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
  0, 0, 0, 0, 0, 0, 0, 0, 1, 1
), 10L)

# simulates a data set.
#
# Args:
#   n_fams: number of families.
#   beta: the fixed effect coefficients.
#   sig_sq: the scale parameters.
sim_dat <- function(n_fams, beta = c(-1, 1, 2), sig_sq = c(3, 1)){
  # setup before the simulations
  Cmat <- 2 * K
  n_obs <- NROW(K)
  Sig <- diag(n_obs) + sig_sq[1] * Cmat + sig_sq[2] * C
  Sig_chol <- chol(Sig)

  # simulate the data
  out <- replicate(
    n_fams, {
      # simulate covariates
      X <- cbind(`(Intercept)` = 1, Continuous = rnorm(n_obs),
                 Binary = runif(n_obs) > .5)

      # assign the linear predictor + noise
      eta <- drop(X %*% beta) + drop(rnorm(n_obs) %*% Sig_chol)

      # return the list in the format needed for the package
      list(y = as.numeric(eta > 0), X = X,
           scale_mats = list(genetic = Cmat, environment = C))
    }, simplify = FALSE)

  # add attributes with the true values and return
  attributes(out) <- list(beta = beta, sig_sq = sig_sq)
  out
}

# simulate the data
set.seed(1)
dat <- sim_dat(200L)

# fit the model
ptr <- pedigree_ll_terms(dat, max_threads = 1L)
start <- pedmod_start(ptr = ptr, data = dat, n_threads = 1L)
fit <- pedmod_opt(ptr = ptr, par = start$par, n_threads = 1L, use_aprx = TRUE,
                  maxvls = 5000L, minvls = 1000L, abs_eps = 0, rel_eps = 1e-3)
fit$par # the estimate

# 90% likelihood ratio based confidence interval for the proportion of variance
# of the genetic effect
prof_out <- pedmod_profile_prop(
  ptr = ptr, fit$par, maxvls = 5000L, minvls = 1000L, alpha = .1,
  which_prof = 1L, abs_eps = 0, rel_eps = 1e-3, verbose = TRUE)
prof_out$confs # the confidence interval for the proportion

# plot the log profile likelihood
keep <- c(-1L, -length(prof_out$xs))
plot(prof_out$xs[keep], prof_out$p_log_Lik[keep], pch = 16,
     xlab = "proportion of variance", ylab = "log profile likelihood")
abline(v = prof_out$confs, lty = 2)



pedmod documentation built on Sept. 11, 2022, 5:05 p.m.