knitr::opts_chunk$set( collapse = TRUE, comment = "#>", dpi = 80 )
This vignette demonstrates how to interpret the output of ColocBoost, specifically how to obtain the colocalization summary and how to focus only on strong colocalization events.
library(colocboost)
The dataset features two causal variants with indices 194 and 589.
# Load the dataset
data(Ind_5traits)
# Run colocboost
res <- colocboost(X = Ind_5traits$X, Y = Ind_5traits$Y)
cos_summary <- res$cos_summary
names(cos_summary)
The cos_summary object contains the colocalization summary for all colocalization events, with each row representing a single colocalization event.
The summary includes the following columns:
FALSE if no focal outcome exists.

To obtain a colocalization summary focused on specific traits of interest, you can use the get_cos_summary function; see the detailed usage of this function in link.
This function filters the colocalization summary by the outcomes of interest, making it easier to interpret the results for specific traits.
For example, if you are interested in the colocalization events involving the traits Y1 and Y2, you can use the following code:
# Get summary table of colocalization
cos_interest_outcome <- get_cos_summary(res, interest_outcome = c("Y1", "Y2"))
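To make the filtering idea concrete, here is a minimal base-R sketch on a toy table. It is not the implementation of get_cos_summary: the toy data frame, the colocalized_outcomes column with its "; "-separated format, and the helper involves_any are all assumptions made for illustration. The sketch keeps only the rows whose traits include at least one outcome of interest.

```r
# Toy summary table; the column names and the "; " separator are assumptions
toy_summary <- data.frame(
  cos_id = c("cos1", "cos2", "cos3"),
  colocalized_outcomes = c("Y1; Y2; Y3", "Y3; Y4", "Y2; Y5"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if a row involves at least one outcome of interest
involves_any <- function(outcomes, interest) {
  traits <- strsplit(outcomes, ";\\s*")
  vapply(traits, function(x) any(x %in% interest), logical(1))
}

interest <- c("Y1", "Y2")
toy_interest <- toy_summary[involves_any(toy_summary$colocalized_outcomes, interest), ]
toy_interest$cos_id
```

In this toy table, cos1 and cos3 are kept because each involves Y1 or Y2, while cos2 is dropped.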
In cos_summary, for each 95% CoS, the cos_npc column provides a normalized probability of colocalization, and the min_npc_outcome column provides the minimum normalized probability among the colocalized traits.
These two metrics serve as empirical evidence of colocalization at both the CoS level and the trait level.
The best minimal colocalization configuration can be defined by using both cos_npc and npc_outcome with get_robust_colocalization.
See the detailed usage of this function in link.
filter_res <- get_robust_colocalization(res, cos_npc_cutoff = 0.5, npc_outcome_cutoff = 0.2)
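The effect of the two cutoffs can be sketched on a toy table in base R. This is only a table-level analogue under stated assumptions: the toy values are invented, and the real get_robust_colocalization works on the whole colocboost object and may treat weak traits differently (for example, by removing a single trait rather than a whole CoS).

```r
# Toy summary table using the two metric columns described above
toy_summary <- data.frame(
  cos_id = c("cos1", "cos2", "cos3"),
  cos_npc = c(0.95, 0.42, 0.81),
  min_npc_outcome = c(0.60, 0.35, 0.10),
  stringsAsFactors = FALSE
)

cos_npc_cutoff <- 0.5
npc_outcome_cutoff <- 0.2

# Keep a CoS only if it passes both the CoS-level and the trait-level cutoff
keep <- toy_summary$cos_npc >= cos_npc_cutoff &
  toy_summary$min_npc_outcome >= npc_outcome_cutoff
toy_robust <- toy_summary[keep, ]
toy_robust$cos_id
```

Here only cos1 passes: cos2 fails the CoS-level cutoff and cos3 fails the trait-level cutoff. Raising either cutoff makes the filter stricter.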
The output of get_robust_colocalization has the same structure as the output of colocboost, so it can be used directly in any post-inference analysis and visualization.
A cutoff of npc=0.5 or npc_outcome = 0.2 maintains robust colocalization signals when many traits are evaluated.
Higher thresholds can be specified if users want to focus only on strong colocalization events.

The entire colocalization output from colocboost is stored in the colocboost object, which contains several components:
cos_summary: a summary table of colocalization events (see Section 1 above for details).
vcp: the variable colocalized probability for each variable.
cos_details: an object with all information on the colocalization results.
data_info: an object with detailed information about the input data.
model_info: an object with detailed information about the colocboost model.

This section explains these components in detail, using a mixed dataset to take a closer look at the ColocBoost results.
# Load example data
data(Ind_5traits)
data(Sumstat_5traits)
# Create a mixed dataset
X <- Ind_5traits$X[1:4]
Y <- Ind_5traits$Y[1:4]
sumstat <- Sumstat_5traits$sumstat[5]
LD <- get_cormat(Ind_5traits$X[[1]])
# Run colocboost
res <- colocboost(X = X, Y = Y, sumstat = sumstat, LD = LD)
vcp

vcp is the probability that a variant is colocalized with at least one trait; it serves as an analog of the posterior inclusion probability (PIP) in single-trait fine-mapping.
To plot the VCP for the variants within at least one CoS, use the colocboost_plot function with the y argument set to "vcp".
colocboost_plot(res, y = "vcp")
Please visit our documentation portal at Visualization of ColocBoost Results for more details on the colocboost_plot function.
data_info

n_variables: number of variants included.
variables: vector of variant names across all traits included in the colocalization analysis.
coef: regression coefficients estimated from the colocboost model for each trait.
z: z-scores from marginal associations for each trait.
n_outcomes: number of traits included in the colocalization analysis.
outcome_info: information about the analyzed data, including sample size and data type.
res$data_info$outcome_info
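As a reminder of what a marginal association z-score measures, the following base-R sketch computes one per variant on simulated data. The data, the helper marginal_z, and the use of the simple-regression t-statistic (which is close to a z-score at this sample size) are assumptions for illustration; ColocBoost may compute the values stored in z differently.

```r
set.seed(42)
n <- 500
# Toy genotype-like matrix with 3 variants and one trait driven by variant v2
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("v1", "v2", "v3")))
y <- 0.3 * X[, "v2"] + rnorm(n)

# Marginal statistic for one variant: slope / standard error from y ~ x
marginal_z <- function(x, y) {
  fit <- summary(lm(y ~ x))
  fit$coefficients["x", "t value"]
}
z <- apply(X, 2, marginal_z, y = y)

# Equivalent closed form based on the sample correlation r
r <- as.vector(cor(X, y))
z_closed <- r * sqrt((n - 2) / (1 - r^2))
round(cbind(z, z_closed), 3)
```

The two columns agree because, in a regression with a single predictor, the t-statistic of the slope is a function of the sample correlation alone.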
cos_details

cos_details provides detailed information on the colocalization events identified by colocboost.
This section explains the components of cos_details.
names(res$cos_details)
cos

cos: a list with detailed information on the colocalized variants for each CoS.
cos_index: indices of the colocalized variables, with a unique identifier for each CoS.
cos_variables: names of the colocalized variables, with a unique identifier for each CoS.
res$cos_details$cos
cos_outcomes

cos_outcomes: a list with detailed information on the colocalized traits for each CoS.
outcome_index: indices of the colocalized traits, with a unique identifier for each CoS.
outcome_name: names of the colocalized traits, with a unique identifier for each CoS.
res$cos_details$cos_outcomes
cos_npc: normalized probability of colocalization for each CoS, providing empirical evidence in favor of colocalization over a trait-specific configuration.
cos_outcomes_npc: normalized probability for each colocalized trait, ordered by strength of evidence.
res$cos_details$cos_npc
res$cos_details$cos_outcomes_npc
cos_purity: includes three lists, each containing an $S \times S$ matrix, where $S$ is the number of CoS.
min_abs_cor: the minimum absolute correlation of variants within a CoS (diagonal) or between different CoS (off-diagonal).
median_abs_cor: the median absolute correlation of variants within a CoS (diagonal) or between different CoS (off-diagonal).
max_abs_cor: the maximum absolute correlation of variants within a CoS (diagonal) or between different CoS (off-diagonal).
res$cos_details$cos_purity
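The layout of these purity matrices can be illustrated with a small base-R sketch on simulated data. Everything here is an assumption for illustration: the toy genotype matrix, the two variant sets standing in for CoS, the helper names, and the choice to use only distinct variant pairs within a set. It is not the package's own computation.

```r
set.seed(1)
n <- 200
base1 <- rnorm(n)
base2 <- rnorm(n)
# Toy genotype matrix: variants v1-v3 track base1, variants v4-v6 track base2
X <- cbind(
  v1 = base1 + rnorm(n, sd = 0.3), v2 = base1 + rnorm(n, sd = 0.3),
  v3 = base1 + rnorm(n, sd = 0.3), v4 = base2 + rnorm(n, sd = 0.3),
  v5 = base2 + rnorm(n, sd = 0.3), v6 = base2 + rnorm(n, sd = 0.3)
)
# Two hypothetical sets of variant indices standing in for two CoS
sets <- list(cos1 = 1:3, cos2 = 4:6)
abs_cor <- abs(cor(X))

# Absolute correlations between set a and set b;
# within a set, only distinct variant pairs are used (an assumption)
pair_values <- function(a, b) {
  block <- abs_cor[sets[[a]], sets[[b]], drop = FALSE]
  if (a == b) block[upper.tri(block)] else as.vector(block)
}

# Build an S x S matrix by applying a summary statistic to every pair of sets
purity_matrix <- function(stat) {
  S <- length(sets)
  out <- matrix(NA_real_, S, S, dimnames = list(names(sets), names(sets)))
  for (a in seq_len(S)) for (b in seq_len(S)) out[a, b] <- stat(pair_values(a, b))
  out
}

toy_purity <- list(
  min_abs_cor = purity_matrix(min),
  median_abs_cor = purity_matrix(median),
  max_abs_cor = purity_matrix(max)
)
toy_purity$min_abs_cor
```

In this toy setup the diagonal entries are high, because variants within a set are tightly correlated, and the off-diagonal entries are low, because the two sets are simulated independently. That is the pattern expected for two well-separated CoS.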
cos_top_variables: indices and names of the top variant for each CoS, which is the variant with the highest VCP.
res$cos_details$cos_top_variables
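The selection rule for the top variant, highest VCP within each CoS, can be sketched in base R as follows. The VCP values, the CoS membership, and the output column names are hypothetical and are not taken from the package.

```r
# Toy VCP vector over 8 variants (hypothetical values)
vcp <- c(v1 = 0.01, v2 = 0.62, v3 = 0.30, v4 = 0.00,
         v5 = 0.05, v6 = 0.88, v7 = 0.07, v8 = 0.00)
# Hypothetical CoS, each given as a vector of variant indices
cos_index <- list(cos1 = c(2, 3), cos2 = c(5, 6, 7))

# For each CoS, pick the member with the highest VCP
top_idx <- vapply(cos_index, function(idx) idx[which.max(vcp[idx])], numeric(1))
top_variables <- data.frame(
  cos = names(cos_index),
  top_index = unname(top_idx),
  top_name = names(vcp)[top_idx],
  top_vcp = unname(vcp[top_idx]),
  stringsAsFactors = FALSE
)
top_variables
```

In this toy example the top variants are v2 for cos1 and v6 for cos2.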
cos_weights: the integrative weights for each colocalized trait in the CoS. These are used to recalibrate the CoS when some traits are filtered out.
cos_vcp: the single-effect VCP for each CoS.

model_info

model_coveraged: whether the model has converged.
outcome_model_coveraged: whether each trait-specific model has converged.
n_updates: number of boosting rounds.
outcome_n_updates: number of boosting rounds for each trait.
jk_update: indices of the variants being updated in the model at each boosting round.
# Pick arbitrary SEC updates, see entire update in advance
res$model_info$jk_star[c(5:10, 36:38), ]
profile_loglik: joint profile log-likelihood changes over boosting rounds.
outcome_profile_loglik: trait-specific profile log-likelihood changes over boosting rounds.
# Plotting joint profile log-likelihood (blue) and trait-specific profile log-likelihood (red).
par(mfrow = c(2, 3), mar = c(4, 4, 2, 1))
plot(res$model_info$profile_loglik, type = "p", col = "#3366CC", lwd = 2,
     xlab = "", ylab = "Joint Profile")
for (i in 1:5) {
  plot(res$model_info$outcome_profile_loglik[[i]], type = "p", col = "#CC3333", lwd = 2,
       xlab = "", ylab = paste0("Profile (Trait ", i, ")"))
}
outcome_proximity_obj: trait-specific proximity smoothed objective for each trait.
outcome_coupled_best_update_obj: objective at the (coupled) best update variant for each outcome.
# Save to restore default options
oldpar <- par(no.readonly = TRUE)
# Plotting trait-specific proximity objective
par(mfrow = c(2, 3), mar = c(4, 4, 2, 1))
for (i in 1:5) {
  plot(res$model_info$outcome_proximity_obj[[i]], type = "p", col = "#3366CC", lwd = 2,
       xlab = "", ylab = "Trait-specific Objective", main = paste0("Trait ", i))
}
par(oldpar)
# Save to restore default options
oldpar <- par(no.readonly = TRUE)
# Plotting trait-specific objective at the best update variant
par(mfrow = c(2, 3), mar = c(4, 4, 2, 1))
for (i in 1:5) {
  plot(res$model_info$outcome_coupled_best_update_obj[[i]], type = "p", col = "#CC3333", lwd = 2,
       xlab = "", ylab = "Objective at best update variant", main = paste0("Trait ", i))
}
par(oldpar)
ucos_details

When output_level = 2 is set, the ColocBoost output also includes ucos_details, which holds the trait-specific (uncolocalized) information from the single-effect learner (SEL).
# Create a mixed dataset
data(Ind_5traits)
data(Heterogeneous_Effect)
X <- Ind_5traits$X[1:3]
Y <- Ind_5traits$Y[1:3]
X1 <- Heterogeneous_Effect$X
Y1 <- Heterogeneous_Effect$Y[, 1, drop = FALSE]
res <- colocboost(X = c(X, list(X1)), Y = c(Y, list(Y1)), output_level = 2)
names(res$ucos_details)
ucos

ucos: a list containing detailed information about the trait-specific (uncolocalized) variants for each uCoS.
ucos_index: indices of the trait-specific (uncolocalized) variants.
ucos_variables: names of the trait-specific (uncolocalized) variants.
res$ucos_details$ucos
ucos_outcomes

ucos_outcomes: a list with detailed information about the trait-specific (uncolocalized) outcomes for each uCoS.
outcome_index: indices of the trait-specific (uncolocalized) outcomes.
outcome_name: names of the trait-specific (uncolocalized) outcomes.
res$ucos_details$ucos_outcomes
cos_ucos_purity

cos_ucos_purity: includes three lists, each containing an $S \times uS$ matrix, where $S$ is the number of CoS and $uS$ is the number of uCoS:
min_abs_cor: minimum absolute correlation of variables across each pair of CoS and uCoS.
median_abs_cor: median absolute correlation of variables across each pair of CoS and uCoS.
max_abs_cor: maximum absolute correlation of variables across each pair of CoS and uCoS.
res$ucos_details$cos_ucos_purity
ucos_weight: integrative weights for each trait-specific (uncolocalized) trait, used to recalibrate the uCoS when traits are filtered out.
ucos_top_variables: indices and names of the top variable for each uCoS, which is the variable with the highest VCP.
ucos_purity: includes three lists, each containing a $uS \times uS$ matrix, where $uS$ is the number of uCoS:
min_abs_cor: minimum absolute correlation of variables within a uCoS (diagonal) or between different uCoS (off-diagonal).
median_abs_cor: median absolute correlation of variables within or between uCoS.
max_abs_cor: maximum absolute correlation of variables within or between uCoS.

By analyzing these components, you can gain a deeper understanding of the trait-specific (uncolocalized) effects, which provides additional insight into the data.
diagnostic_details

When output_level = 3 is set, the ColocBoost output also includes diagnostic_details:
# Load the dataset
data(Ind_5traits)
X <- Ind_5traits$X
Y <- Ind_5traits$Y
res <- colocboost(X = X, Y = Y, output_level = 3)
cb_model: trait-specific proximity gradient boosting model, including the proximity weight at each iteration, the residual after gradient boosting, etc.
weights_paths: individual trait-specific weights for each iteration.
names(res$diagnostic_details$cb_model)
names(res$diagnostic_details$cb_model$ind_outcome_1)
cb_model_para: parameters used in fitting the ColocBoost model.
names(res$diagnostic_details$cb_model_para)