cpbayes_cor: Run correlated version of CPBayes.


View source: R/FuncUserCor.R


Description

Run the correlated version of CPBayes when the main genetic effect (beta/log(odds ratio)) estimates across studies/traits are correlated.


Usage

cpbayes_cor(BetaHat, SE, Corln, Phenotypes, Variant, UpdateSlabVar = TRUE,
  MinSlabVar = 0.6, MaxSlabVar = 1, MCMCiter = 20000, Burnin = 5000)



Arguments

BetaHat: A numeric vector of length K, where K is the number of phenotypes. It contains the beta-hat values across studies/traits. No default is specified.


SE: A numeric vector of the same length as BetaHat providing the standard errors corresponding to BetaHat. Every element of SE must be positive. No default is specified.


Corln: A numeric square matrix of order K by K providing the correlation matrix of BetaHat. The number of rows and columns of Corln must equal the length of BetaHat. No default is specified. See estimate_corln.


Phenotypes: A character vector of the same length as BetaHat providing the names of the phenotypes. Default is trait1, trait2, ..., traitK. Note that BetaHat, SE, Corln, and Phenotypes must be in the same order.


Variant: A character vector of length one providing the name of the genetic variant. Default is ‘Variant’.


UpdateSlabVar: A logical vector of length one. If TRUE, the variance of the slab distribution, which represents the prior distribution of non-null effects, is updated at each MCMC iteration within the range MinSlabVar to MaxSlabVar (see the next two arguments). If FALSE, it is fixed at (MinSlabVar + MaxSlabVar)/2. Default is TRUE.


MinSlabVar: A numeric value greater than 0.01 providing the minimum value of the variance of the slab distribution. Default is 0.6.


MaxSlabVar: A numeric value smaller than 10.0 providing the maximum value of the variance of the slab distribution. Default is 1.0. Note that a smaller slab variance increases the sensitivity of CPBayes when selecting the optimal subset of associated traits, but at the expense of lower specificity. Hence the slab variance parameter in CPBayes is inversely related to the level of false discovery rate (FDR) in a frequentist FDR-controlling procedure. For a specific dataset, a user can experiment with different choices of these three arguments: UpdateSlabVar, MinSlabVar, and MaxSlabVar.


MCMCiter: A positive integer greater than or equal to 7,000 providing the total number of iterations in the MCMC. Default is 20,000.


Burnin: A positive integer greater than or equal to 2,000 providing the burn-in period of the MCMC. Default is 5,000. Note that the MCMC sample size (MCMCiter - Burnin) must be at least 5,000.
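The constraints listed for the arguments can be checked before launching a long MCMC run. The following is a minimal sketch in base R, not code from CPBayes: the helper name check_cpbayes_cor_args and the toy inputs are hypothetical, and the checks encode only the limits stated above.

```r
# Hypothetical helper (not part of CPBayes): checks inputs against the
# constraints documented for cpbayes_cor and reports derived quantities.
check_cpbayes_cor_args <- function(BetaHat, SE, Corln,
                                   MinSlabVar = 0.6, MaxSlabVar = 1,
                                   MCMCiter = 20000, Burnin = 5000) {
  K <- length(BetaHat)
  stopifnot(
    is.numeric(BetaHat), is.numeric(SE),
    length(SE) == K, all(SE > 0),              # SE matches BetaHat, all positive
    is.matrix(Corln), is.numeric(Corln),
    all(dim(Corln) == c(K, K)),                # Corln is K by K
    MinSlabVar > 0.01, MaxSlabVar < 10,        # documented slab variance limits
    MCMCiter >= 7000, Burnin >= 2000,
    MCMCiter - Burnin >= 5000                  # minimum MCMC sample size
  )
  list(
    K = K,
    sample_size = MCMCiter - Burnin,
    # Value the slab variance is fixed at when UpdateSlabVar = FALSE
    fixed_slab_var = (MinSlabVar + MaxSlabVar) / 2
  )
}

# Toy inputs for illustration only (three traits, made-up numbers)
BetaHat <- c(0.10, -0.05, 0.20)
SE <- c(0.02, 0.03, 0.05)
Corln <- diag(3)
Corln[1, 2] <- Corln[2, 1] <- 0.3

chk <- check_cpbayes_cor_args(BetaHat, SE, Corln)
print(chk$sample_size)     # 15000 with the default MCMCiter and Burnin
print(chk$fixed_slab_var)  # 0.8 with the default MinSlabVar and MaxSlabVar
```

With the defaults, the MCMC sample size is 20000 - 5000 = 15000, and a fixed slab variance would be (0.6 + 1)/2 = 0.8. A call such as MCMCiter = 7000 with Burnin = 5000 fails the last check because only 2,000 samples would remain.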


Value

The output produced by cpbayes_cor is a list with the following components.


variantName: The name of the genetic variant provided by the user. If not specified by the user, the default name is ‘Variant’.


log10_BF: The log10(Bayes factor) produced by CPBayes, which measures the evidence for the overall pleiotropic association.


locFDR: The local false discovery rate (posterior probability of null association) produced by CPBayes, which measures the evidence for aggregate-level pleiotropic association. The Bayes factor is adjusted for the prior odds, whereas locFDR is solely a function of the posterior odds. locFDR can sometimes be small, indicating an association, while log10_BF does not indicate one. Hence, always check both log10_BF and locFDR.


subset: The optimal subset of associated/non-null traits selected by CPBayes. It is NULL if no phenotype is selected.


important_traits: The traits that yield a trait-specific posterior probability of association (PPAj) > 20%, given as the names of the important traits together with their PPAj values. Even if a phenotype is not selected in the optimal subset of non-null traits, it can produce a non-negligible value of PPAj. Note that ‘important_traits’ is expected to include the traits already contained in ‘subset’. Always check ‘important_traits’, even if ‘subset’ contains a single trait; it helps to better explain an observed pleiotropic signal.


auxi_data: Supplementary data, including the MCMC data, which is used later by post_summaries and forest_cpbayes:

  1. traitNames: Name of all the phenotypes.

  2. K: Total number of phenotypes.

  3. mcmc.samplesize: MCMC sample size.

  4. PPAj: Trait-specific posterior probability of association for all the traits.

  5. Z.data: MCMC data on the latent association status of all the traits (Z).

  6. sim.beta: MCMC data on the unknown true genetic effect (beta) on each trait.

  7. betahat: The beta-hat vector provided by the user which will be used by forest_cpbayes.

  8. se: The standard error vector provided by the user which will be used by forest_cpbayes.


uncor_use: 'Yes' or 'No', indicating whether the combined strategy of CPBayes (implemented for correlated summary statistics) used the uncorrelated version.


runtime: The runtime (in seconds) taken by cpbayes_cor. It helps the user plan the whole analysis.
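To see why log10_BF and locFDR can disagree, it helps to work through the odds form of Bayes' rule, in which the Bayes factor is the posterior odds divided by the prior odds. The sketch below uses made-up numbers only: the prior probability of the null and the Bayes factor are assumptions for illustration, not values used or produced by CPBayes.

```r
# Hypothetical inputs for illustration only
prior_null   <- 0.05   # assumed prior probability of no association
bayes_factor <- 2      # assumed Bayes factor in favour of association

prior_odds     <- (1 - prior_null) / prior_null  # prior odds of association
posterior_odds <- bayes_factor * prior_odds      # Bayes' rule in odds form

# The posterior probability of the null depends only on the posterior odds
locFDR   <- 1 / (1 + posterior_odds)
log10_BF <- log10(bayes_factor)

print(log10_BF)  # about 0.30: weak evidence from the data alone
print(locFDR)    # about 0.026: small, driven mostly by the prior odds
```

In this toy setting the prior already favours association (odds of 19 to 1), so the posterior probability of the null is small even though the Bayes factor is modest. This is the kind of situation in which looking at only one of the two quantities could mislead.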


References

Majumdar A, Haldar T, Bhattacharya S, Witte JS (2018) An efficient Bayesian meta analysis approach for studying cross-phenotype genetic associations. PLoS Genet 14(2): e1007139.

See Also

analytic_locFDR_BF_cor, estimate_corln, post_summaries, forest_cpbayes, analytic_locFDR_BF_uncor, cpbayes_uncor


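Examples

library(CPBayes)
# Summary statistics and their correlation matrix from the package's example data
BetaHat <- ExampleDataCor$BetaHat
SE <- ExampleDataCor$SE
Corln <- ExampleDataCor$cor
traitNames <- paste("Disease", 1:10, sep = "")
SNP1 <- "rs1234"
# Run the correlated version of CPBayes for this variant
result <- cpbayes_cor(BetaHat, SE, Corln, Phenotypes = traitNames, Variant = SNP1)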
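After a run, the returned list can be inspected and passed on to the helper functions named under See Also. The following is a hedged sketch that only executes if CPBayes is installed. It assumes that the list components carry the names used in this page's text (log10_BF, locFDR, subset, important_traits) and that post_summaries and forest_cpbayes accept the whole output list with their default settings; check the help pages of those functions before relying on this.

```r
# Sketch only: the body runs when the CPBayes package is available.
ran <- FALSE
if (requireNamespace("CPBayes", quietly = TRUE)) {
  library(CPBayes)

  res <- cpbayes_cor(ExampleDataCor$BetaHat, ExampleDataCor$SE, ExampleDataCor$cor,
                     Phenotypes = paste0("Disease", 1:10), Variant = "rs1234")

  # Check both measures of overall pleiotropic association
  print(res$log10_BF)
  print(res$locFDR)

  # Optimal subset, and traits with a non-negligible PPAj
  print(res$subset)
  print(res$important_traits)

  # Assumed usage: both functions take the cpbayes_cor output with defaults
  summ <- post_summaries(res)
  forest_cpbayes(res)

  ran <- TRUE
}
```

Because the MCMC is stochastic, the printed values can differ slightly between runs unless a seed is set with set.seed() beforehand.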

ArunabhaCodes/CPBayes documentation built on May 5, 2019, 8:12 a.m.