View source: R/Sliding_Window.R
Sliding_Window | R Documentation |
The Sliding_Window function takes in the chromosome, starting location, ending location, sliding window length,
the object of an opened annotated GDS file, and the object from fitting the null model, and analyzes the association between a
quantitative/dichotomous phenotype (including imbalanced case-control designs) and variants in a genetic region using the STAAR procedure.
For each sliding window, the STAAR-O p-value is a p-value from an omnibus test
that aggregates SKAT(1,25), SKAT(1,1), Burden(1,25), Burden(1,1), ACAT-V(1,25),
and ACAT-V(1,1), together with the p-values of each test weighted by each annotation,
using the Cauchy method. For the imbalanced case-control setting, the results correspond to the STAAR-B p-value, which is a p-value from
an omnibus test that aggregates Burden(1,25) and Burden(1,1), together with the p-values of each test weighted by each annotation, using the Cauchy method.
For multiple phenotype analysis (obj_nullmodel$n.pheno > 1),
the results correspond to multi-trait association p-values (e.g. MultiSTAAR-O) by leveraging
the correlation structure between multiple phenotypes.
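The Cauchy method referred to in the description can be illustrated with a minimal sketch. The helper name cauchy_combine and the example p-values are hypothetical and only show the general idea of combining several test p-values into one omnibus p-value; the package's own implementation may differ, for example in how it handles extremely small p-values.

```r
# Minimal sketch of a Cauchy combination of p-values (not the package's code).
# `cauchy_combine` is a hypothetical helper; weights default to equal weights.
cauchy_combine <- function(pvals, weights = rep(1, length(pvals))) {
  stopifnot(
    length(pvals) == length(weights),
    all(pvals > 0 & pvals < 1),
    all(weights >= 0),
    sum(weights) > 0
  )
  w <- weights / sum(weights)
  # Map each p-value to a Cauchy-distributed quantity and take the weighted sum.
  stat <- sum(w * tan((0.5 - pvals) * pi))
  # Upper-tail probability of the standard Cauchy distribution.
  pcauchy(stat, lower.tail = FALSE)
}

# Six made-up p-values standing in for the six component tests of one window.
p_tests <- c(0.012, 0.030, 0.200, 0.450, 0.008, 0.060)
p_omnibus <- cauchy_combine(p_tests)
```

With a single p-value the combination returns that p-value unchanged, which is a quick sanity check on the transformation.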
Sliding_Window(
  chr,
  start_loc,
  end_loc,
  sliding_window_length = 2000,
  type = c("single", "multiple"),
  genofile,
  obj_nullmodel,
  rare_maf_cutoff = 0.01,
  rv_num_cutoff = 2,
  rv_num_cutoff_max = 1e+09,
  rv_num_cutoff_max_prefilter = 1e+09,
  QC_label = "annotation/filter",
  variant_type = c("SNV", "Indel", "variant"),
  geno_missing_imputation = c("mean", "minor"),
  Annotation_dir = "annotation/info/FunctionalAnnotation",
  Annotation_name_catalog,
  Use_annotation_weights = c(TRUE, FALSE),
  Annotation_name = NULL,
  SPA_p_filter = TRUE,
  p_filter_cutoff = 0.05,
  silent = FALSE
)
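A call following the usage above might look like the sketch below. Several things here are assumptions rather than facts from this page: that the function is exported by a package named STAARpipeline, that the aGDS file is opened with SeqArray::seqOpen, that the null model was saved with saveRDS, and that the annotation catalog is stored as a CSV file. The wrapper name run_sliding_window_example, the file paths, and the region coordinates are all hypothetical. The wrapper returns NULL when the packages or files are not available, so it can be sourced safely.

```r
# Hedged usage sketch; all paths and coordinates are placeholders.
run_sliding_window_example <- function(agds_path, nullmodel_path, catalog_path) {
  # Skip quietly when the required packages or input files are missing.
  have_pkgs <- requireNamespace("STAARpipeline", quietly = TRUE) &&
    requireNamespace("SeqArray", quietly = TRUE)
  have_files <- all(file.exists(agds_path, nullmodel_path, catalog_path))
  if (!have_pkgs || !have_files) return(NULL)

  # Open the annotated GDS file and make sure it is closed again on exit.
  genofile <- SeqArray::seqOpen(agds_path)
  on.exit(SeqArray::seqClose(genofile), add = TRUE)

  # Assumption: the fitted null model was stored with saveRDS().
  obj_nullmodel <- readRDS(nullmodel_path)
  # Assumption: the annotation catalog data frame is stored as CSV.
  catalog <- read.csv(catalog_path, stringsAsFactors = FALSE)

  STAARpipeline::Sliding_Window(
    chr = 1,
    start_loc = 1000000,
    end_loc = 1100000,
    sliding_window_length = 2000,
    type = "multiple",
    genofile = genofile,
    obj_nullmodel = obj_nullmodel,
    rare_maf_cutoff = 0.01,
    rv_num_cutoff = 2,
    QC_label = "annotation/filter",
    variant_type = "SNV",
    geno_missing_imputation = "mean",
    Annotation_dir = "annotation/info/FunctionalAnnotation",
    Annotation_name_catalog = catalog,
    Use_annotation_weights = TRUE,
    Annotation_name = NULL
  )
}

# Returns NULL here because the placeholder files do not exist.
res <- run_sliding_window_example("no_such.gds", "no_such.rds", "no_such.csv")
```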
chr |
chromosome. |
start_loc |
starting location (position) of the genetic region to be analyzed using the STAAR procedure. |
end_loc |
ending location (position) of the genetic region to be analyzed using the STAAR procedure. |
sliding_window_length |
the (fixed) length of the sliding window to be analyzed using the STAAR procedure. |
type |
the type of sliding window to be analyzed using the STAAR procedure. Choices include "single" or "multiple" (default = "single"). |
genofile |
an object of opened annotated GDS (aGDS) file. |
obj_nullmodel |
an object from fitting the null model, which is either the output from |
rare_maf_cutoff |
the cutoff of maximum minor allele frequency in defining rare variants (default = 0.01). |
rv_num_cutoff |
the minimum number of variants required to analyze a given variant-set (default = 2). |
rv_num_cutoff_max |
the maximum number of variants allowed when analyzing a given variant-set (default = 1e+09). |
rv_num_cutoff_max_prefilter |
the cutoff of maximum number of variants before extracting the genotype matrix (default = 1e+09). |
QC_label |
channel name of the QC label in the GDS/aGDS file (default = "annotation/filter"). |
variant_type |
type of variant included in the analysis. Choices include "SNV", "Indel", or "variant" (default = "SNV"). |
geno_missing_imputation |
method of handling missing genotypes. Either "mean" or "minor" (default = "mean"). |
Annotation_dir |
channel name of the annotations in the aGDS file (default = "annotation/info/FunctionalAnnotation"). |
Annotation_name_catalog |
a data frame containing the name and the corresponding channel name in the aGDS file. |
Use_annotation_weights |
use annotations as weights or not (default = TRUE). |
Annotation_name |
a vector of annotation names used in STAAR (default = NULL). |
SPA_p_filter |
logical: should the SPA method be used to recalculate the p-value only for variants whose normal-approximation p-value is smaller than a pre-specified threshold; only used for the imbalanced case-control setting (default = TRUE). |
p_filter_cutoff |
threshold for the p-value recalculation using the SPA method, only used for imbalanced case-control setting (default = 0.05). |
silent |
logical: should the report of error messages be suppressed (default = FALSE). |
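To make the roles of rare_maf_cutoff, rv_num_cutoff, and the "mean" option of geno_missing_imputation more concrete, here is a small illustrative sketch on a toy dosage matrix. It is not the package's code: the helpers impute_mean and select_rare are hypothetical, the strict inequality at the cutoff and the exclusion of monomorphic variants are assumptions, and the "minor" imputation option is not shown.

```r
# Toy dosage matrix: rows = samples, columns = variants, entries 0/1/2 or NA.
G <- cbind(
  v1 = c(0, 0, 1, 0, NA, 0, 0, 0, 0, 0),
  v2 = c(1, 2, 1, 0, 1, 2, 1, 0, 1, 1),
  v3 = c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0),
  v4 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
)

# Hypothetical helper: replace missing genotypes by the variant's mean dosage.
impute_mean <- function(G) {
  for (j in seq_len(ncol(G))) {
    miss <- is.na(G[, j])
    if (any(miss)) G[miss, j] <- mean(G[!miss, j])
  }
  G
}

# Hypothetical helper: keep polymorphic variants with MAF below the cutoff and
# return NULL when fewer than `rv_num_cutoff` variants remain.
select_rare <- function(G, rare_maf_cutoff = 0.01, rv_num_cutoff = 2) {
  G <- impute_mean(G)
  af <- colMeans(G) / 2          # allele frequency of the counted allele
  maf <- pmin(af, 1 - af)        # minor allele frequency
  keep <- maf > 0 & maf < rare_maf_cutoff
  if (sum(keep) < rv_num_cutoff) return(NULL)
  G[, keep, drop = FALSE]
}

# With only 10 samples a larger cutoff is used so that some variants qualify.
G_rare <- select_rare(G, rare_maf_cutoff = 0.1, rv_num_cutoff = 2)
```

In this toy example v2 is too common and v4 is monomorphic, so only v1 and v3 are retained; with the default cutoff of 0.01 no variant qualifies and the helper returns NULL.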
A data frame containing the STAAR p-values (including STAAR-O or STAAR-B in imbalanced case-control setting) corresponding to each sliding window in the given genetic region.
Li, Z., Li, X., et al. (2022). A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nature Methods, 19(12), 1599-1611.
Li, X., Li, Z., et al. (2020). Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nature Genetics, 52(9), 969-983.