pedmod_profile_prop: Computes Profile Likelihood Based Confidence Intervals for...
In pedmod: Pedigree Models

pedmod_profile_prop

R Documentation

Computes Profile Likelihood Based Confidence Intervals for the Proportion of Variance

Description

Constructs a likelihood ratio based confidence intervals for the proportion of variance for one of the effects.

Usage

pedmod_profile_prop(
  ptr,
  par,
  maxvls,
  minvls = -1L,
  alpha = 0.05,
  abs_eps,
  rel_eps,
  which_prof,
  indices = NULL,
  maxvls_start = max(100L, as.integer(ceiling(maxvls/5))),
  minvls_start = if (minvls < 0) minvls else minvls/5,
  do_reorder = TRUE,
  use_aprx = FALSE,
  n_threads = 1L,
  cluster_weights = NULL,
  method = 0L,
  seed = 1L,
  verbose = FALSE,
  max_step = 15L,
  opt_func = NULL,
  use_tilting = FALSE,
  vls_scales = NULL,
  bound = c(0.01, 0.99),
  ...
)

Arguments

`ptr`	object from `pedigree_ll_terms`.
`par`	numeric vector with the maximum likelihood estimator e.g. from `pedmod_opt`.
`maxvls`	maximum number of samples in the approximation for each marginal likelihood term.
`minvls`	minimum number of samples for each marginal likelihood term. Negative values provides a default which depends on the dimension of the integration.
`alpha`	numeric scalar with the confidence level required.
`abs_eps`	absolute convergence threshold for `eval_pedigree_ll` and `eval_pedigree_grad`.
`rel_eps`	rel_eps convergence threshold for `eval_pedigree_ll` and `eval_pedigree_grad`.
`which_prof`	the index of the random effect which proportion of variance should be profiled.
`indices`	zero-based vector with indices of which log marginal likelihood terms to include. Use `NULL` if all indices should be used.
`maxvls_start, minvls_start`	number of samples to use when finding the initial values for the optimization.
`do_reorder`	`TRUE` if a heuristic variable reordering should be used. `TRUE` is likely the best value.
`use_aprx`	`TRUE` if a less precise approximation of `pnorm` and `qnorm` should be used. This may reduce the computation time while not affecting the result much.
`n_threads`	number of threads to use.
`cluster_weights`	numeric vector with weights for each cluster. Use `NULL` if all clusters have weight one.
`method`	integer with the method to use. Zero yields randomized Korobov lattice rules while one yields scrambled Sobol sequences.
`seed`	seed to pass to `set.seed` before each gradient and function evaluation. Use `NULL` if the seed should not be fixed.
`verbose`	logical for whether output should be printed to the console during the estimation of the profile likelihood curve.
`max_step`	integer scalar with the maximum number of steps to take in either directions.
`opt_func`	function to perform minimization with arguments like `optim`. BFGS is used with `optim` if this argument is `NULL`.
`use_tilting`	`TRUE` if the minimax tilting method suggested by Botev (2017) should be used. See doi: 10.1111/rssb.12162.
`vls_scales`	can be a numeric vector with a positive scalar for each cluster. Then `vls_scales[i] * minvls` and `vls_scales[i] * maxvls` is used for cluster `i` rather than `minvls` and `maxvls`. Set `vls_scales = NULL` if the latter should be used.
`bound`	boundaries for the limits of the proportion. Has to be in between (0,1). This is useful particularly if the optimization fails to work on the default values.
`...`	arguments passed to `opt_func`.

Details

The function is only useful when there is more than one type of random effect. If not, then pedmod_profile can be used because of the scale invariance of the likelihood ratio.

Value

A list like pedmod_profile.

Examples


# we simulate outcomes with an additive genetic effect and a childhood
# environment effect. The kinship matrix is the same for all families and
# given by
K <- matrix(c(
  0.5  , 0    , 0.25 , 0   , 0.25 , 0   , 0.125 , 0.125 , 0.125 , 0.125 ,
  0    , 0.5  , 0.25 , 0   , 0.25 , 0   , 0.125 , 0.125 , 0.125 , 0.125 ,
  0.25 , 0.25 , 0.5  , 0   , 0.25 , 0   , 0.25  , 0.25  , 0.125 , 0.125 ,
  0    , 0    , 0    , 0.5 , 0    , 0   , 0.25  , 0.25  , 0     , 0     ,
  0.25 , 0.25 , 0.25 , 0   , 0.5  , 0   , 0.125 , 0.125 , 0.25  , 0.25  ,
  0    , 0    , 0    , 0   , 0    , 0.5 , 0     , 0     , 0.25  , 0.25  ,
  0.125, 0.125, 0.25 , 0.25, 0.125, 0   , 0.5   , 0.25  , 0.0625, 0.0625,
  0.125, 0.125, 0.25 , 0.25, 0.125, 0   , 0.25  , 0.5   , 0.0625, 0.0625,
  0.125, 0.125, 0.125, 0   , 0.25 , 0.25, 0.0625, 0.0625, 0.5   , 0.25  ,
  0.125, 0.125, 0.125, 0   , 0.25 , 0.25, 0.0625, 0.0625, 0.25  , 0.5
), 10)

# the scale matrix for the childhood environment effect is also the same and
# given by
C <- matrix(c(
  1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 1, 0, 1, 0, 0, 0, 0, 0,
  0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
  0, 0, 1, 0, 1, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
  0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
  0, 0, 0, 0, 0, 0, 0, 0, 1, 1
), 10L)

# simulates a data set.
#
# Args:
#   n_fams: number of families.
#   beta: the fixed effect coefficients.
#   sig_sq: the scale parameters.
sim_dat <- function(n_fams, beta = c(-1, 1, 2), sig_sq = c(3, 1)){
  # setup before the simulations
  Cmat <- 2 * K
  n_obs <- NROW(K)
  Sig <- diag(n_obs) + sig_sq[1] * Cmat + sig_sq[2] * C
  Sig_chol <- chol(Sig)

  # simulate the data
  out <- replicate(
    n_fams, {
      # simulate covariates
      X <- cbind(`(Intercept)` = 1, Continuous = rnorm(n_obs),
                 Binary = runif(n_obs) > .5)

      # assign the linear predictor + noise
      eta <- drop(X %*% beta) + drop(rnorm(n_obs) %*% Sig_chol)

      # return the list in the format needed for the package
      list(y = as.numeric(eta > 0), X = X,
           scale_mats = list(genetic = Cmat, environment = C))
    }, simplify = FALSE)

  # add attributes with the true values and return
  attributes(out) <- list(beta = beta, sig_sq = sig_sq)
  out
}

# simulate the data
set.seed(1)
dat <- sim_dat(200L)

# fit the model
ptr <- pedigree_ll_terms(dat, max_threads = 1L)
start <- pedmod_start(ptr = ptr, data = dat, n_threads = 1L)
fit <- pedmod_opt(ptr = ptr, par = start$par, n_threads = 1L, use_aprx = TRUE,
                  maxvls = 5000L, minvls = 1000L, abs_eps = 0, rel_eps = 1e-3)
fit$par # the estimate

# 90% likelihood ratio based confidence interval for the proportion of variance
# of the genetic effect
prof_out <- pedmod_profile_prop(
  ptr = ptr, fit$par, maxvls = 5000L, minvls = 1000L, alpha = .1,
  which_prof = 1L, abs_eps = 0, rel_eps = 1e-3, verbose = TRUE)
prof_out$confs # the confidence interval for the proportion

# plot the log profile likelihood
keep <- c(-1L, -length(prof_out$xs))
plot(prof_out$xs[keep], prof_out$p_log_Lik[keep], pch = 16,
     xlab = "proportion of variance", ylab = "log profile likelihood")
abline(v = prof_out$confs, lty = 2)

pedmod documentation built on Sept. 11, 2022, 5:05 p.m.