Sliding_Window_cond_spa: Genetic region conditional analysis of sliding windows using...

View source: R/Sliding_Window_cond_spa.R

Sliding_Window_cond_spa    R Documentation

Genetic region conditional analysis of sliding windows using STAAR procedure for imbalanced case-control setting

Description

The Sliding_Window_cond_spa function takes in the chromosome, starting location, ending location, the object of the opened annotated GDS file, the object from fitting the null model, and the set of known variants to be adjusted for in conditional analysis. It analyzes the conditional association between an imbalanced case-control phenotype and variants in a genetic region using the STAAR procedure. For each sliding window, the conditional STAAR-B p-value is a p-value from an omnibus test that aggregates conditional Burden(1,25) and Burden(1,1), together with the conditional p-values of each test weighted by each annotation, using the Cauchy method.
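To make the aggregation step more concrete, the following is a minimal base R sketch of the Cauchy combination idea under simple assumptions. The helper name cauchy_combine is hypothetical and is not the package's own routine; it omits the numerical safeguards a production implementation needs for p-values extremely close to 0 or 1.

```r
# Minimal sketch of the Cauchy combination of p-values.
# `cauchy_combine` is a hypothetical helper, not a STAARpipeline function.
cauchy_combine <- function(pvals, weights = NULL) {
  # The tangent transform is only defined for p-values strictly inside (0, 1).
  stopifnot(all(pvals > 0 & pvals < 1))
  if (is.null(weights)) {
    weights <- rep(1, length(pvals))  # equal weights by default
  }
  weights <- weights / sum(weights)   # normalize so the weights sum to 1
  # Each p-value is mapped to a standard Cauchy variate; the weighted sum
  # is again standard Cauchy under the null hypothesis.
  stat <- sum(weights * tan((0.5 - pvals) * pi))
  # Upper-tail probability of the standard Cauchy distribution.
  0.5 - atan(stat) / pi
}

# Combining the two burden-type p-values of one window (made-up numbers).
p_window <- cauchy_combine(c(0.012, 0.30))
```

Because the transformed statistic is heavy-tailed, a single small p-value dominates the combination, which is why the approach suits omnibus tests over several weighting schemes.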

Usage

Sliding_Window_cond_spa(
  chr,
  start_loc,
  end_loc,
  genofile,
  obj_nullmodel,
  known_loci = NULL,
  rare_maf_cutoff = 0.01,
  rv_num_cutoff = 2,
  rv_num_cutoff_max = 1e+09,
  rv_num_cutoff_max_prefilter = 1e+09,
  QC_label = "annotation/filter",
  variant_type = c("SNV", "Indel", "variant"),
  geno_missing_imputation = c("mean", "minor"),
  Annotation_dir = "annotation/info/FunctionalAnnotation",
  Annotation_name_catalog,
  Use_annotation_weights = c(TRUE, FALSE),
  Annotation_name = NULL,
  SPA_p_filter = FALSE,
  p_filter_cutoff = 0.05
)
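Several defaults in the usage above are vectors of choices, such as variant_type = c("SNV", "Indel", "variant"). In R this pattern is commonly resolved with match.arg, which selects the first element when the argument is omitted; whether the package resolves every such default this way is an assumption here. The function pick_variant_type below is a hypothetical stand-in used only to illustrate the idiom.

```r
# Hypothetical stand-in illustrating the match.arg idiom; not a package function.
pick_variant_type <- function(variant_type = c("SNV", "Indel", "variant")) {
  # With the argument missing, match.arg returns the first choice.
  match.arg(variant_type)
}

default_type <- pick_variant_type()         # falls back to the first choice
indel_type   <- pick_variant_type("Indel")  # an explicit choice is kept
```

Passing a single explicit value for such arguments (including Use_annotation_weights) avoids depending on how a length-two default is handled.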

Arguments

chr

chromosome.

start_loc

starting location (position) of the sliding window to be analyzed using the STAAR procedure.

end_loc

ending location (position) of the sliding window to be analyzed using the STAAR procedure.

genofile

an object of opened annotated GDS (aGDS) file.

obj_nullmodel

an object from fitting the null model, which is either the output of the fit_nullmodel function, or the output of the fitNullModel function in the GENESIS package after transformation with the genesis2staar_nullmodel function.

known_loci

a data frame of variants to be adjusted for in conditional analysis. It should contain 4 columns in the following order: chromosome (CHR), position (POS), reference allele (REF), and alternative allele (ALT) (default = NULL).

rare_maf_cutoff

the cutoff of maximum minor allele frequency in defining rare variants (default = 0.01).

rv_num_cutoff

the minimum number of variants required to analyze a given variant-set (default = 2).

rv_num_cutoff_max

the maximum number of variants allowed when analyzing a given variant-set (default = 1e+09).

rv_num_cutoff_max_prefilter

the maximum number of variants allowed before extracting the genotype matrix (default = 1e+09).

QC_label

channel name of the QC label in the GDS/aGDS file (default = "annotation/filter").

variant_type

type of variant included in the analysis. Choices include "SNV", "Indel", or "variant" (default = "SNV").

geno_missing_imputation

method of handling missing genotypes. Either "mean" or "minor" (default = "mean").

Annotation_dir

channel name of the annotations in the aGDS file (default = "annotation/info/FunctionalAnnotation").

Annotation_name_catalog

a data frame containing the annotation name and the corresponding channel name in the aGDS file.

Use_annotation_weights

use annotations as weights or not (default = TRUE).

Annotation_name

a vector of annotation names used in STAAR (default = NULL).

SPA_p_filter

logical: whether the SPA method is used to recalculate the p-value only for variants whose normal-approximation-based p-value is smaller than a pre-specified threshold; only used for the imbalanced case-control setting (default = FALSE).

p_filter_cutoff

threshold on the normal-approximation-based p-value below which the SPA method is used to recalculate the p-value; only used for the imbalanced case-control setting (default = 0.05).
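The interplay of SPA_p_filter and p_filter_cutoff can be illustrated with a small base R sketch. Everything here is hypothetical: recalc_with_filter and fake_spa are illustrative names, fake_spa merely returns made-up numbers in place of a real saddlepoint approximation, and the sketch assumes that SPA is applied to every variant when SPA_p_filter is FALSE.

```r
# Sketch of the filtering logic only; this is not the package's implementation.
# `spa_pvalue` is a hypothetical function mapping variant indices to SPA p-values.
recalc_with_filter <- function(p_normal, spa_pvalue,
                               SPA_p_filter = FALSE, p_filter_cutoff = 0.05) {
  if (SPA_p_filter) {
    # Recalculate only where the normal-approximation p-value is below the cutoff.
    needs_spa <- p_normal < p_filter_cutoff
  } else {
    # Assumption: without the filter, every variant is recalculated.
    needs_spa <- rep(TRUE, length(p_normal))
  }
  p_out <- p_normal
  p_out[needs_spa] <- spa_pvalue(which(needs_spa))
  p_out
}

# Made-up p-values for four variants and a placeholder SPA routine.
p_normal <- c(0.50, 0.03, 0.20, 0.001)
fake_spa <- function(idx) c(0.55, 0.04, 0.25, 0.002)[idx]

filtered   <- recalc_with_filter(p_normal, fake_spa, SPA_p_filter = TRUE)
unfiltered <- recalc_with_filter(p_normal, fake_spa)
```

With the filter on, only the second and fourth variants are recalculated, which saves computation when most variants are clearly non-significant.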

Value

A data frame containing the conditional STAAR p-values (including STAAR-B) corresponding to the sliding window in the given genetic region.
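A call might be assembled roughly as follows. This is a sketch under stated assumptions, not a tested workflow: the variants in known_loci are made up, the wrapper run_conditional_window and its agds_path argument are hypothetical, and the sketch assumes that the SeqArray and STAARpipeline packages are installed and that Sliding_Window_cond_spa is exported. Defining the wrapper does not touch any file; it only runs the analysis when called with a real aGDS path and null model object.

```r
# Known variants to adjust for, with the four required columns in order.
# The positions and alleles below are invented for illustration.
known_loci <- data.frame(
  CHR = c(19, 19),
  POS = c(11089231, 11105436),
  REF = c("G", "C"),
  ALT = c("A", "T"),
  stringsAsFactors = FALSE
)

# Hypothetical wrapper: opens the aGDS file, runs one window, closes the file.
run_conditional_window <- function(agds_path, obj_nullmodel, known_loci,
                                   Annotation_name_catalog,
                                   chr, start_loc, end_loc,
                                   Annotation_name = NULL) {
  genofile <- SeqArray::seqOpen(agds_path)
  on.exit(SeqArray::seqClose(genofile), add = TRUE)  # always release the file
  STAARpipeline::Sliding_Window_cond_spa(
    chr = chr,
    start_loc = start_loc,
    end_loc = end_loc,
    genofile = genofile,
    obj_nullmodel = obj_nullmodel,
    known_loci = known_loci,
    rare_maf_cutoff = 0.01,
    variant_type = "SNV",
    geno_missing_imputation = "mean",
    Annotation_name_catalog = Annotation_name_catalog,
    Use_annotation_weights = TRUE,
    Annotation_name = Annotation_name,
    SPA_p_filter = TRUE,
    p_filter_cutoff = 0.05
  )
}
```

Using on.exit to close the file keeps the aGDS handle from leaking if the analysis stops with an error.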

References

Li, Z., Li, X., et al. (2022). A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nature Methods, 19(12), 1599-1611.

Li, X., Li, Z., et al. (2020). Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nature Genetics, 52(9), 969-983.

Sofer, T., et al. (2019). A fully adjusted two-stage procedure for rank-normalization in genetic association studies. Genetic Epidemiology, 43(3), 263-275.


xihaoli/STAARpipeline documentation built on Feb. 9, 2025, 12:39 a.m.