Dynamic_Window_SCANG: Genetic region analysis of dynamic windows using SCANG-STAAR...

View source: R/Dynamic_Window_SCANG.R

Dynamic_Window_SCANG    R Documentation

Genetic region analysis of dynamic windows using SCANG-STAAR procedure

Description

The Dynamic_Window_SCANG function takes the chromosome, starting location, ending location, the object of an opened annotated GDS file, and the object from fitting the null model, and analyzes the association between a quantitative or dichotomous phenotype and the variants in a genetic region using the SCANG-STAAR procedure. For each dynamic window, the scan statistic of SCANG-STAAR-O is the set-based p-value of an omnibus test that aggregates p-values across different types of multiple annotation-weighted variant-set tests, SKAT(1,1), SKAT(1,25), Burden(1,1) and Burden(1,25), using the ACAT method. The scan statistic of SCANG-STAAR-S is the set-based p-value of STAAR-S, an omnibus test that aggregates p-values across the multiple annotation-weighted variant-set tests SKAT(1,1) and SKAT(1,25) using the ACAT method. The scan statistic of SCANG-STAAR-B is the set-based p-value of STAAR-B, an omnibus test that aggregates p-values across the multiple annotation-weighted variant-set tests Burden(1,1) and Burden(1,25) using the ACAT method.
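As a rough illustration of how the ACAT method combines several p-values into one omnibus p-value, the following minimal sketch applies the Cauchy combination formula with equal weights. The helper name acat_combine and the four input p-values are hypothetical, and the sketch omits the numerical safeguards for extremely small p-values that a production implementation would need; it is not the code used inside SCANG-STAAR.

```r
# Simplified Cauchy combination (ACAT) of p-values; equal weights by default.
# acat_combine is a hypothetical helper, not a function of the package.
acat_combine <- function(pvals, weights = rep(1, length(pvals))) {
  stopifnot(length(pvals) == length(weights), all(pvals > 0 & pvals < 1))
  w <- weights / sum(weights)
  # Map each p-value to a standard Cauchy variate and take the weighted sum.
  t_stat <- sum(w * tan((0.5 - pvals) * pi))
  # The upper tail of the standard Cauchy distribution is the combined p-value.
  pcauchy(t_stat, lower.tail = FALSE)
}

# Made-up p-values standing in for SKAT(1,1), SKAT(1,25), Burden(1,1), Burden(1,25)
p_tests <- c(1e-4, 3e-3, 0.2, 0.6)
p_omnibus <- acat_combine(p_tests)
```

In this sketch the combined p-value is dominated by the smallest inputs, which is the behavior that makes the combination useful when only some of the component tests carry signal.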

Usage

Dynamic_Window_SCANG(
  chr,
  start_loc,
  end_loc,
  genofile,
  obj_nullmodel,
  Lmin = 40,
  Lmax = 300,
  steplength = 10,
  rare_maf_cutoff = 0.01,
  p_filter = 1e-08,
  f = 0,
  alpha = 0.1,
  QC_label = "annotation/filter",
  variant_type = c("SNV", "Indel", "variant"),
  geno_missing_imputation = c("mean", "minor"),
  Annotation_dir = "annotation/info/FunctionalAnnotation",
  Annotation_name_catalog,
  Use_annotation_weights = c(TRUE, FALSE),
  Annotation_name = NULL,
  silent = FALSE
)

Arguments

chr

chromosome.

start_loc

starting location (position) of the genetic region to be analyzed using the SCANG-STAAR procedure.

end_loc

ending location (position) of the genetic region to be analyzed using the SCANG-STAAR procedure.

genofile

an object of an opened annotated GDS (aGDS) file.

obj_nullmodel

an object from fitting the null model, which is the output of the fit_nullmodel function after transformation with the staar2scang_nullmodel function.

Lmin

minimum number of variants in searching windows (default = 40).

Lmax

maximum number of variants in searching windows (default = 300).

steplength

difference in the number of variants between successive searching window sizes, that is, the numbers of variants in the searching windows are Lmin, Lmin+steplength, Lmin+2*steplength, ..., Lmax (default = 10).

rare_maf_cutoff

the maximum minor allele frequency cutoff used to define rare variants (default = 0.01).

p_filter

the filtering threshold of the screening method for SKAT in SCANG-STAAR. SKAT p-values are calculated for regions whose p-value is possibly smaller than the filtering threshold (default = 1e-08).

f

an overlap fraction, which controls the overlapping proportion of the detected regions. For example, when f=0, the detected regions do not overlap with each other, and when f=1, every susceptive region is kept as a detected region (default = 0).

alpha

family-wise/genome-wide significance level (default = 0.1).

QC_label

channel name of the QC label in the GDS/aGDS file (default = "annotation/filter").

variant_type

type of variant included in the analysis. Choices include "SNV", "Indel", or "variant" (default = "SNV").

geno_missing_imputation

method of handling missing genotypes. Either "mean" or "minor" (default = "mean").

Annotation_dir

channel name of the annotations in the aGDS file
(default = "annotation/info/FunctionalAnnotation").

Annotation_name_catalog

a data frame containing the name and the corresponding channel name in the aGDS file.

Use_annotation_weights

use annotations as weights or not (default = TRUE).

Annotation_name

a vector of annotation names used in SCANG-STAAR (default = NULL).

silent

logical: should the report of error messages be suppressed (default = FALSE).
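To make the relationship between Lmin, Lmax and steplength concrete, the following sketch enumerates the searching window sizes implied by the defaults. It only illustrates the arithmetic in the argument descriptions; how the function itself handles a range that is not an exact multiple of steplength is not stated here and is not assumed.

```r
# Window sizes (in number of variants) implied by the default arguments.
Lmin <- 40
Lmax <- 300
steplength <- 10

# Lmin, Lmin + steplength, Lmin + 2 * steplength, ..., Lmax
window_sizes <- seq(Lmin, Lmax, by = steplength)
n_window_sizes <- length(window_sizes)
```

With the defaults, window_sizes runs from 40 to 300 in steps of 10, so n_window_sizes is 27.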

Value

The function returns a list with the following members:

SCANG_O_res: A matrix that summarizes the significant regions detected by SCANG-STAAR-O, including the negative log transformation of the SCANG-STAAR-O p-value ("-logp"), chromosome ("chr"), start position ("start_pos"), end position ("end_pos"), family-wise/genome-wide error rate ("GWER") and the number of variants ("SNV_num").

SCANG_O_top1: A vector of length 4 which summarizes the top 1 region detected by SCANG-STAAR-O, including the negative log transformation of the SCANG-STAAR-O p-value ("-logp"), chromosome ("chr"), start position ("start_pos"), end position ("end_pos"), family-wise/genome-wide error rate ("GWER") and the number of variants ("SNV_num").

SCANG_O_emthr: A vector of Monte Carlo simulation samples used to generate the empirical threshold. The 1-alpha quantile of this vector is the empirical threshold.

SCANG_S_res, SCANG_S_top1, SCANG_S_emthr: Analysis results using SCANG-STAAR-S. For details, see the corresponding SCANG-STAAR-O members.

SCANG_B_res, SCANG_B_top1, SCANG_B_emthr: Analysis results using SCANG-STAAR-B. For details, see the corresponding SCANG-STAAR-O members.
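The empirical threshold described for SCANG_O_emthr can be recovered from the returned vector with a quantile call. The sketch below uses simulated stand-in values rather than real output, so the numbers themselves carry no meaning; emthr, region_logp and is_significant are hypothetical names, and the comparison assumes that the threshold is on the same scale as the "-logp" column.

```r
# Stand-in for a returned SCANG_O_emthr vector (simulated, not real output).
set.seed(1)
emthr <- rexp(2000, rate = 1) + 5
alpha <- 0.1

# The empirical threshold is the (1 - alpha) quantile of the Monte Carlo samples.
threshold <- quantile(emthr, probs = 1 - alpha, names = FALSE)

# Hypothetical "-logp" value of one region, compared against the threshold
# (assumes both are on the same scale).
region_logp <- 12
is_significant <- region_logp > threshold
```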

References

Li, Z., Li, X., et al. (2022). A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nature Methods, 19(12), 1599-1611.

Li, Z., Li, X., et al. (2019). Dynamic scan procedure for detecting rare-variant association regions in whole-genome sequencing studies. The American Journal of Human Genetics, 104(5), 802-814.

Li, X., Li, Z., et al. (2020). Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nature Genetics, 52(9), 969-983.

Liu, Y., et al. (2019). ACAT: A fast and powerful p value combination method for rare-variant analysis in sequencing studies. The American Journal of Human Genetics, 104(3), 410-421.


xihaoli/STAARpipeline documentation built on Feb. 9, 2025, 12:39 a.m.