pathWAS_predictR: Predict MR pathway models in pathway PRS

View source: R/pathWAS_predictR.R

pathWAS_predictR    R Documentation

Predict MR pathway models in pathway PRS

Description

Creates a prediction statistic for your pathway model based on pathway PRS.

Usage

pathWAS_predictR(
  predict_PRS,
  path_qtl_ovgenes,
  path_select,
  mr_lasso_res,
  endpoint_omics,
  end_point,
  run_sig_MR = FALSE,
  sig_mr_genelist = NULL
)

Arguments

predict_PRS

data frame. 1st column must be titled "iid". Every subsequent column is a PRS for a single gene.

path_qtl_ovgenes

list. List of all genes overlapping between your pathway and the QTLs available. These names must be in the same format as the column names for your PRS.

path_select

character. Name of the pathway for the analysis.

mr_lasso_res

LASSO model. Output [1] of the pathWAS_MR function.

endpoint_omics

data frame. 2 columns. The first MUST be titled "iid" and the second is the omics measurement of your end-point. These IIDs must be the same as for the PRS data frame.

end_point

character. Name of the omics end-point measured and used as the proxy for pathway functionality.

run_sig_MR

logical. Also run a prediction on only significant exposures (genes) from the MR. Default is FALSE.

sig_mr_genelist

list. If run_sig_MR == TRUE, you must include this. Output [2] from the pathWAS_MR function.

Details

pathWAS_predictR is the final step in PathWAS. It requires the output from pathWAS_MR, which contains the MR-derived gene-exposure values. It also requires two further inputs to build the final PathWAS model.

1. End-point omics measures. The end-point selected at the start of PathWAS must be supplied as a data frame of iids and omics measures. This MUST have 2 columns: iid and the metabolite/protein measurement, where the second column is named in the format "XXX_omic" ("XXX" can be anything).

2. Polygenic risk scores (PRS) for every gene in your pathway. These must be created in the same cohort in which you have your omics measurements, i.e. you must take the QTLs used in the previous stages of PathWAS and use those SNPs to create weights from the genotypes of your cohort. For this step we recommend the snp_ldpred2_auto() function from LDpred2 to create polygenic weights: https://privefl.github.io/bigsnpr/articles/LDpred2.html. We then recommend using PRSice-2 to create a PRS from each set of polygenic weights. You should then have a data frame of PRS for each gene in your pathway in the format: iid, GENE1_PRS, GENE2_PRS, GENE3_PRS (where the 1st column MUST be "iid" and every subsequent column must be the name of a gene).

You will also need to input the list of genes for your pathway, in the same format as the column names of your PRS data frame.

Lastly, you can run the prediction on all of the genes from your pathway, or additionally on only those genes that were significant in the MR. Setting the run_sig_MR argument to TRUE outputs a model for all available genes in the pathway and a second model for the genes that were significant in the MR analysis, giving you two separate models. If it is set to TRUE, you must also supply sig_mr_genelist, which is output by pathWAS_MR.
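The input formats described above can be checked before calling the function. The following is a minimal sketch, not part of PathWAS: the gene names, IDs, values, and the helper check_predict_inputs() are all hypothetical, and plain gene names are assumed for the PRS columns (whichever naming you use, the gene list and the column names must agree). The final call is left commented out because it needs real output from pathWAS_MR, whose arguments are not shown here; indexing that output with [[1]] and [[2]] is an assumption based on the "Output [1]" and "Output [2]" wording in the Arguments section.

```r
# Hypothetical example data; gene names, IDs, and values are made up.
set.seed(1)
n <- 5
genes <- c("GENE1", "GENE2", "GENE3")

# PRS data frame: first column "iid", then one PRS column per gene.
predict_PRS <- data.frame(
  iid = paste0("ID", seq_len(n)),
  GENE1 = rnorm(n),
  GENE2 = rnorm(n),
  GENE3 = rnorm(n)
)

# End-point data frame: "iid" plus one omics column named "XXX_omic".
endpoint_omics <- data.frame(
  iid = paste0("ID", seq_len(n)),
  PROT1_omic = rnorm(n)
)

# Genes overlapping the pathway and the available QTLs, as a list.
path_qtl_ovgenes <- as.list(genes)

# Hypothetical helper (not a PathWAS function): stops with an error
# if any of the documented format requirements is violated.
check_predict_inputs <- function(prs, omics, gene_list) {
  stopifnot(
    is.data.frame(prs), is.data.frame(omics),
    names(prs)[1] == "iid",
    ncol(omics) == 2, names(omics)[1] == "iid",
    grepl("_omic$", names(omics)[2]),
    all(unlist(gene_list) %in% names(prs)[-1]),
    setequal(prs$iid, omics$iid)
  )
  invisible(TRUE)
}

inputs_ok <- check_predict_inputs(predict_PRS, endpoint_omics, path_qtl_ovgenes)

# With real pathWAS_MR output in `mr_out` (assumed to be a list), the call
# would look roughly like this; path_select and end_point are placeholders.
# res <- pathWAS_predictR(
#   predict_PRS      = predict_PRS,
#   path_qtl_ovgenes = path_qtl_ovgenes,
#   path_select      = "my_pathway",
#   mr_lasso_res     = mr_out[[1]],
#   endpoint_omics   = endpoint_omics,
#   end_point        = "PROT1",
#   run_sig_MR       = TRUE,
#   sig_mr_genelist  = mr_out[[2]]
# )
```

Running the check first surfaces a misnamed "iid" column or mismatched gene names as an immediate error rather than a failure inside the prediction step.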


Sabor117/PathWAS documentation built on Nov. 29, 2024, 7:44 a.m.