LDpred2: LDpred2

snp_ldpred2_inf    R Documentation



LDpred2. Tutorial at https://privefl.github.io/bigsnpr/articles/LDpred2.html.


snp_ldpred2_inf(corr, df_beta, h2)

snp_ldpred2_grid(
  corr,
  df_beta,
  grid_param,
  burn_in = 50,
  num_iter = 100,
  ncores = 1,
  return_sampling_betas = FALSE,
  ind.corr = cols_along(corr)
)

snp_ldpred2_auto(
  corr,
  df_beta,
  h2_init,
  vec_p_init = 0.1,
  burn_in = 500,
  num_iter = 200,
  sparse = FALSE,
  verbose = FALSE,
  report_step = num_iter + 1L,
  allow_jump_sign = TRUE,
  shrink_corr = 1,
  use_MLE = TRUE,
  alpha_bounds = c(-1.5, 0.5),
  ind.corr = cols_along(corr),
  ncores = 1
)



corr: Sparse correlation matrix as an SFBM. If corr is a dsCMatrix or a dgCMatrix, you can use as_SFBM(corr).


df_beta: A data frame with 3 columns:

  • $beta: effect size estimates

  • $beta_se: standard errors of effect size estimates

  • $n_eff: sample size when estimating beta (in the case of binary traits, this is 4 / (1 / n_control + 1 / n_case))
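As a quick check of the binary-trait formula above, here is a minimal sketch in base R. The case and control counts are made-up illustration values, not taken from any study.

```r
# Effective sample size for a binary trait, using the formula from the text:
# n_eff = 4 / (1 / n_control + 1 / n_case).
# The counts below are hypothetical.
n_case    <- 5000
n_control <- 45000

n_eff <- 4 / (1 / n_control + 1 / n_case)
n_eff

# With a balanced design (n_case == n_control), the formula reduces to the
# total sample size.
n_eff_balanced <- 4 / (1 / 10000 + 1 / 10000)
n_eff_balanced
```

For an unbalanced design, n_eff is smaller than the total number of individuals, which is why it should be used in place of the raw sample size.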


h2: Heritability estimate.


grid_param: A data frame with 3 columns as a grid of hyper-parameters:

  • $p: proportion of causal variants

  • $h2: heritability (captured by the variants used)

  • $sparse: boolean, whether a sparse model is sought

They can be run in parallel by changing ncores.
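One way to build such a grid is with base R's expand.grid(). This is only a sketch: the heritability estimate h2_est and the particular sequences of p and h2 values are assumptions chosen for illustration, not recommended settings.

```r
# Hypothetical heritability estimate (e.g. obtained beforehand from another method).
h2_est <- 0.3

# Candidate proportions of causal variants, evenly spaced on a log scale.
p_seq <- signif(10^seq(-4, 0, length.out = 9), 2)

# Candidate heritabilities around the (assumed) estimate.
h2_seq <- round(h2_est * c(0.3, 0.7, 1, 1.4), 4)

# One row per combination of p, h2 and sparse.
grid_param <- expand.grid(p = p_seq, h2 = h2_seq, sparse = c(FALSE, TRUE))
dim(grid_param)
head(grid_param)
```

Each row of this data frame corresponds to one model, so the size of the grid directly determines how many models are fitted.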


burn_in: Number of burn-in iterations.


num_iter: Number of iterations after burn-in.


ncores: Number of cores used. The default (1) does not use parallelism. You may use nb_cores().


return_sampling_betas: Whether to return all sampling betas (after burn-in). This is useful for assessing the uncertainty of the PRS at the individual level (see doi:10.1101/2020.11.30.403188). Default is FALSE (only the averaged final vectors of betas are returned). If TRUE, only one set of parameters is allowed.


ind.corr: Indices to "subset" corr, as if this was run with corr[ind.corr, ind.corr] instead. No subsetting by default.


h2_init: Heritability estimate for initialization.


vec_p_init: Vector of initial values for p. Default is 0.1.


sparse: In LDpred2-auto, whether to also report a sparse solution by running LDpred2-grid with the estimates of p and h2 from LDpred2-auto, and sparsity enabled. Default is FALSE.


verbose: Whether to print "p // h2" estimates at each iteration. Disabled when parallelism is used.


report_step: Step to report sampling betas (after burn-in and before unscaling). Nothing is reported by default. If using num_iter = 200 and report_step = 20, then 10 vectors of sampling betas are reported (as a sparse matrix with 10 columns).
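The arithmetic behind this example can be sketched in base R. The use of integer division here is an assumption that reproduces the documented counts (10 vectors for a step of 20, none for the default step); it is not a description of the package internals.

```r
num_iter    <- 200
report_step <- 20

# Number of sampling vectors reported when one is kept every `report_step`
# iterations (assumed reading of the documented example).
n_reported <- num_iter %/% report_step
n_reported

# With the default step of num_iter + 1, the step is never reached,
# so nothing is reported.
n_reported_default <- num_iter %/% (num_iter + 1L)
n_reported_default
```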


allow_jump_sign: Whether to allow effect sizes to change sign in consecutive iterations. Default is TRUE (normal sampling). Use FALSE to force effects to go through 0 first before changing sign. Setting this parameter to FALSE can help prevent instability (oscillation and ultimately divergence) of the Gibbs sampler, and can also accelerate convergence of chains with a large initial value for p.


shrink_corr: Shrinkage multiplicative coefficient to apply to off-diagonal elements of the correlation matrix. Default is 1 (unchanged). You can use e.g. 0.95 to add a bit of regularization.
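To illustrate what this shrinkage means, the sketch below applies a coefficient of 0.95 to the off-diagonal elements of a small dense matrix in base R. The 3 x 3 matrix is a made-up toy example; in practice the coefficient is passed as an argument and applied to the SFBM by the function itself.

```r
# Toy correlation matrix (hypothetical values).
corr_dense <- matrix(c(1.0, 0.6, 0.2,
                       0.6, 1.0, 0.4,
                       0.2, 0.4, 1.0), nrow = 3, byrow = TRUE)

shrink_coef <- 0.95

# Multiply everything by the coefficient, then restore the diagonal,
# so that only off-diagonal elements are shrunk.
corr_shrunk <- corr_dense * shrink_coef
diag(corr_shrunk) <- diag(corr_dense)
corr_shrunk
```

The diagonal stays at 1 while every off-diagonal correlation moves slightly toward 0.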


use_MLE: Whether to use maximum likelihood estimation (MLE) to estimate alpha and the variance component (since v1.11.4), or to assume that alpha is -1 and estimate the variance of (scaled) effects as h2/(m*p), as was done in earlier versions of LDpred2-auto (e.g. in v1.10.8). Default is TRUE, which should provide a better model fit, but might also be less robust.
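For a sense of scale, the variance h2/(m*p) used in the non-MLE setting can be computed directly. The values of h2, m and p below are hypothetical.

```r
h2 <- 0.3    # assumed heritability
m  <- 1e6    # assumed number of variants
p  <- 0.01   # assumed proportion of causal variants

# Variance of (scaled) effects for causal variants: the heritability
# is spread over the m * p variants assumed to be causal.
var_scaled_effect <- h2 / (m * p)
var_scaled_effect
```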


alpha_bounds: Boundaries for the estimates of alpha. Default is c(-1.5, 0.5). You can use the same value twice to fix alpha.


For reproducibility, set.seed() can be used to ensure that two runs of LDpred2 give the exact same results (since v1.10).


snp_ldpred2_inf: A vector of effects, assuming an infinitesimal model.

snp_ldpred2_grid: A matrix of effect sizes, one vector (column) for each row of grid_param. Missing values are returned when strong divergence is detected. If using return_sampling_betas, each column corresponds to one iteration instead (after burn-in).

snp_ldpred2_auto: A list (over vec_p_init) of lists with:

  • $beta_est: vector of effect sizes (on the allele scale); note that missing values are returned when strong divergence is detected

  • $beta_est_sparse (only when sparse = TRUE): sparse vector of effect sizes

  • $postp_est: vector of posterior probabilities of being causal

  • $corr_est: the "imputed" correlations between variants and phenotypes, which can be used for post-QCing variants by comparing those to with(df_beta, beta / sqrt(n_eff * beta_se^2 + beta^2))

  • $sample_beta: sparse matrix of sampling betas (see parameter report_step), not on the allele scale, for which you need to multiply by with(df_beta, sqrt(n_eff * beta_se^2 + beta^2))

  • $path_p_est: full path of p estimates (including burn-in); useful to check convergence of the iterative algorithm

  • $path_h2_est: full path of h2 estimates (including burn-in); useful to check convergence of the iterative algorithm

  • $path_alpha_est: full path of alpha estimates (including burn-in); useful to check convergence of the iterative algorithm

  • $h2_est: estimate of the (SNP) heritability (also see coef_to_liab)

  • $p_est: estimate of p, the proportion of causal variants

  • $alpha_est: estimate of alpha, the parameter controlling the relationship between allele frequencies and expected effect sizes

  • $h2_init and $p_init: input parameters, for convenience
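The two with(df_beta, ...) expressions in the list above can be tried on toy data in base R. In this sketch, the summary statistics and the small matrix standing in for $sample_beta are made up for illustration; a real $sample_beta is a sparse matrix with one row per variant.

```r
# Hypothetical summary statistics for 3 variants.
df_beta <- data.frame(
  beta    = c(0.02, -0.05, 0.01),
  beta_se = c(0.01, 0.01, 0.02),
  n_eff   = c(10000, 10000, 10000)
)

# Per-variant scaling factor used in both expressions.
scale_fct <- with(df_beta, sqrt(n_eff * beta_se^2 + beta^2))

# Correlations implied by the summary statistics, to compare against $corr_est.
corr_gwas <- with(df_beta, beta / sqrt(n_eff * beta_se^2 + beta^2))
corr_gwas

# Stand-in for $sample_beta: 3 variants (rows) x 2 reported iterations (columns).
sample_beta_scaled <- matrix(c(0.01, -0.03, 0.000,
                               0.02, -0.04, 0.005), nrow = 3)

# Multiply each variant's row by its scaling factor to get to the allele scale.
sample_beta_allele <- sweep(sample_beta_scaled, 1, scale_fct, "*")
sample_beta_allele
```

Variants for which the imputed and implied correlations disagree strongly are candidates for removal in post-QC.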

bigsnpr documentation built on March 31, 2023, 10:37 p.m.