create_prs | R Documentation |
Create a polygenic risk score based on summary statistics from prior GWAS/pQTL discovery studies. Set conditional = TRUE to decorrelate included variants that are in high LD, which prevents their shared signal from being counted twice.
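The double-counting problem can be illustrated with a minimal base R sketch. It assumes standardized genotypes, so that marginal effects equal the LD correlation matrix times the joint effects. The matrix, the effect sizes, and the helper marginal_to_joint() are hypothetical, and this is not necessarily how marg2con() is implemented.

```r
# Hypothetical LD (correlation) matrix: variants 1 and 2 are in high LD,
# variant 3 is independent of both.
R <- matrix(c(1.0, 0.9, 0.0,
              0.9, 1.0, 0.0,
              0.0, 0.0, 1.0), nrow = 3, byrow = TRUE)

# Hypothetical joint (conditional) effects: variant 2 has no effect of its own.
b_joint_true <- c(0.10, 0.00, 0.05)

# Marginal effects as a GWAS would report them (standardized genotypes assumed):
# variant 2 inherits most of the signal of variant 1 through LD.
b_marginal <- as.vector(R %*% b_joint_true)

# Illustrative helper (not part of any package): recover joint effects from
# marginal effects, optionally with a ridge penalty lambda on the LD matrix.
marginal_to_joint <- function(b_marginal, R, lambda = 0) {
  as.vector(solve(R + lambda * diag(nrow(R)), b_marginal))
}

b_joint <- marginal_to_joint(b_marginal, R)                # no penalty
b_ridge <- marginal_to_joint(b_marginal, R, lambda = 0.1)  # shrunken estimates
```

Using the marginal effects as weights would give variants 1 and 2 a combined weight of 0.19 for a signal whose joint effect is 0.10. The ridge penalty stabilizes the inversion when the LD matrix is close to singular, at the cost of shrinking the estimates.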
create_prs(
variant_data,
gwas_info,
remove_indels = FALSE,
imp_threshold = 0.8,
binary_outcome = TRUE,
exclude_extreme_associations = TRUE,
maf_filter = 0.001,
LDplot = FALSE,
pruning_threshold = 0.75,
pruning_filter = "p",
pval_threshold = 5e-08,
conditional = FALSE,
cond_window = 35000,
cond_N = 60000,
cond_stepwise = TRUE,
ridge = FALSE,
lambda = 0,
scale = FALSE,
flowchart = TRUE
)
variant_data |
An object of format output by extract_variants(). |
gwas_info |
An object generated by get_trait_variants() or get_pQTLs(). |
remove_indels |
If TRUE, removes indels. |
imp_threshold |
Imputation quality threshold, based on R^2. Any variant with lower imputation R^2 is removed. |
binary_outcome |
Set to TRUE for binary traits, and FALSE for continuous outcomes (including pQTLs). |
exclude_extreme_associations |
If TRUE, removes variants with an odds ratio > 5 or < 1/5. |
maf_filter |
Variants with a MAF below the specified threshold are filtered. |
LDplot |
If TRUE, plots the LD matrix (squared correlation matrix of variants). |
pruning_threshold |
When two variants are in LD >= pruning_threshold, one variant of the pair is removed (see pruning_filter). |
pruning_filter |
The criterion used to decide which variant of a high-LD pair is kept. The default ("p") keeps the variant with the lower p-value; keeping the variant with the higher MAF is also possible. |
pval_threshold |
Variants with GWAS p-values > pval_threshold are discarded. Set to 1 to turn off. |
conditional |
If TRUE, uses marg2con() to decorrelate variants in LD if these are within a given bp distance on the same chromosome. |
cond_window |
The genomic distance, in base pairs, within which variants in LD are decorrelated. |
cond_N |
The sample size of the original GWAS from which the marginal estimates were derived, or an approximation of it. |
cond_stepwise |
If TRUE, iteratively applies conditional analysis conditioning on the top variant within the specified window, each time removing variants for which p > pval_threshold, until only conditionally independent variants remain. When set to FALSE, conditional joint analysis (i.e. of several variants jointly) is applied within the specified window. |
ridge |
If TRUE, applies a ridge penalty. This only applies when conditional is set to TRUE. |
lambda |
The parameter controlling the degree of ridge regularization. |
scale |
If TRUE, centers and standardizes the polygenic risk score. |
flowchart |
If TRUE, plots a flowchart describing the creation of the polygenic risk score. |
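The filtering arguments above can be pictured with a small base R sketch. The data frame, its column names, the LD values, and the helper prune_ld() are all hypothetical, and the greedy pruning order is an assumption rather than the documented behavior of create_prs().

```r
# Hypothetical summary statistics; column names are illustrative only.
gwas <- data.frame(
  variant = c("v1", "v2", "v3", "v4", "v5"),
  or      = c(1.20, 1.15, 6.00, 0.90, 1.10),
  maf     = c(0.20, 0.25, 0.05, 0.0005, 0.30),
  p       = c(1e-12, 1e-9, 1e-10, 1e-9, 1e-3),
  stringsAsFactors = FALSE
)

# Apply the three filters with the default thresholds from the usage section.
keep <- gwas$p <= 5e-08 &            # pval_threshold
  gwas$maf >= 0.001 &                # maf_filter
  gwas$or <= 5 & gwas$or >= 1 / 5    # exclude_extreme_associations
filtered <- gwas[keep, ]

# Hypothetical squared-correlation (LD) matrix for the remaining variants.
r2 <- matrix(c(1.0, 0.8,
               0.8, 1.0), nrow = 2, byrow = TRUE,
             dimnames = list(filtered$variant, filtered$variant))

# Illustrative greedy pruning: walk through variants from lowest to highest
# p-value and keep a variant only if its LD with every kept variant is
# below the threshold.
prune_ld <- function(stats, r2, threshold = 0.75) {
  ordered_ids <- stats$variant[order(stats$p)]
  kept <- character(0)
  for (v in ordered_ids) {
    if (all(r2[v, kept] < threshold)) kept <- c(kept, v)
  }
  stats[stats$variant %in% kept, ]
}

pruned <- prune_ld(filtered, r2)
```

In this toy example v3 is dropped for its extreme odds ratio, v4 for its low MAF, and v5 for its p-value; v2 is then pruned because it is in LD 0.8 with the more significant v1.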
A list containing several data.frames with all relevant information. The risk score is stored in element 'prs'.
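A polygenic risk score is conventionally a weighted sum of effect-allele dosages, with log odds ratios as weights for binary traits. The base R sketch below shows that calculation and the kind of centering and standardization that scale = TRUE describes. The dosage matrix and odds ratios are made up, and the exact computation inside create_prs() may differ.

```r
# Hypothetical allele dosages: 4 individuals (rows) x 3 variants (columns).
dosages <- matrix(c(0, 1, 2,
                    1, 1, 0,
                    2, 0, 1,
                    0, 2, 2), nrow = 4, byrow = TRUE,
                  dimnames = list(paste0("id", 1:4), c("v1", "v2", "v3")))

# Hypothetical per-allele odds ratios from a discovery GWAS.
odds_ratios <- c(v1 = 1.2, v2 = 0.9, v3 = 1.1)

# For a binary trait the weights are log odds ratios.
weights <- log(odds_ratios)

# Raw score: weighted sum of dosages for each individual.
prs_raw <- as.vector(dosages %*% weights)

# Centered and standardized score (mean 0, standard deviation 1).
prs_scaled <- as.vector(scale(prs_raw))
```

A standardized score is convenient when effect estimates are to be reported per standard deviation of the score.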
# vte_prs <- create_prs(vte_extracted_variants, vte_gwas_info)
# hist(vte_prs$prs$prs)