Gene_Centric_Coding_cond: Gene-centric conditional analysis of coding functional...

View source: R/Gene_Centric_Coding_cond.R

Gene_Centric_Coding_cond    R Documentation

Gene-centric conditional analysis of coding functional categories using STAAR procedure

Description

The Gene_Centric_Coding_cond function takes the chromosome, gene name, functional category, an opened annotated GDS file object, the fitted null model object, and the set of known variants to adjust for, and tests the conditional association between a quantitative or dichotomous phenotype and the coding functional categories of a gene using the STAAR procedure. For each coding functional category, the conditional STAAR-O p-value is an omnibus p-value that uses the Cauchy method to combine the conditional SKAT(1,25), SKAT(1,1), Burden(1,25), Burden(1,1), ACAT-V(1,25), and ACAT-V(1,1) p-values together with the conditional p-values of each test weighted by each annotation. For multiple-phenotype analysis (obj_nullmodel$n.pheno > 1), the results are multi-trait conditional p-values (e.g., conditional MultiSTAAR-O) that leverage the correlation structure between the phenotypes.
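The Cauchy method mentioned above can be illustrated with a small base R sketch. This is not the package's implementation: the helper name cauchy_combine, the equal default weights, and the six p-values are assumptions made for illustration, and the sketch does not include the numerical safeguards a production version would need for extremely small p-values.

```r
# Cauchy combination of p-values, written with base R only.
# Each p-value is mapped to a standard Cauchy variate, the variates are
# averaged with weights that sum to 1, and the average is mapped back.
cauchy_combine <- function(pvals, weights = rep(1, length(pvals))) {
  stopifnot(length(pvals) == length(weights),
            all(pvals > 0 & pvals < 1),
            all(weights >= 0), sum(weights) > 0)
  weights <- weights / sum(weights)               # normalise the weights
  stat <- sum(weights * tan((0.5 - pvals) * pi))  # weighted Cauchy statistic
  0.5 - atan(stat) / pi                           # upper tail of standard Cauchy
}

# Made-up conditional p-values for the six component tests (illustration only).
p_tests <- c(skat_1_25 = 0.012, skat_1_1 = 0.048,
             burden_1_25 = 0.003, burden_1_1 = 0.020,
             acatv_1_25 = 0.009, acatv_1_1 = 0.031)

p_omnibus <- cauchy_combine(p_tests)
print(p_omnibus)
```

Because the weights sum to 1, the combined p-value always lies between the smallest and largest input p-values, and identical inputs are returned unchanged.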

Usage

Gene_Centric_Coding_cond(
  chr,
  gene_name,
  category = c("plof", "plof_ds", "missense", "disruptive_missense", "synonymous", "ptv",
    "ptv_ds"),
  genofile,
  obj_nullmodel,
  known_loci = NULL,
  rare_maf_cutoff = 0.01,
  rv_num_cutoff = 2,
  rv_num_cutoff_max = 1e+09,
  rv_num_cutoff_max_prefilter = 1e+09,
  method_cond = c("optimal", "naive"),
  QC_label = "annotation/filter",
  variant_type = c("SNV", "Indel", "variant"),
  geno_missing_imputation = c("mean", "minor"),
  Annotation_dir = "annotation/info/FunctionalAnnotation",
  Annotation_name_catalog,
  Use_annotation_weights = c(TRUE, FALSE),
  Annotation_name = NULL
)

Arguments

chr

chromosome.

gene_name

name of the gene to be analyzed using the STAAR procedure.

category

the coding functional category to be analyzed using the STAAR procedure. Choices include "plof", "plof_ds", "missense", "disruptive_missense", "synonymous", "ptv", and "ptv_ds" (default = "plof").

genofile

an opened annotated GDS (aGDS) file object.

obj_nullmodel

an object from fitting the null model, which is either the output of the fit_nullmodel function, or the output of the fitNullModel function in the GENESIS package after transformation with the genesis2staar_nullmodel function.

known_loci

a data frame of the variants to be adjusted for in conditional analysis. It should contain 4 columns in the following order: chromosome (CHR), position (POS), reference allele (REF), and alternative allele (ALT) (default = NULL).
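As an illustration of the column layout described for known_loci, the following sketch builds such a data frame and checks its shape. The variants are invented, the column types (numeric CHR and POS, character REF and ALT) are an assumption, and check_known_loci is a hypothetical helper that is not part of the package.

```r
# Hypothetical known variants; positions and alleles are made up.
known_loci <- data.frame(
  CHR = c(19, 19),
  POS = c(1000100, 1000250),
  REF = c("C", "T"),
  ALT = c("T", "G"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: confirm the four columns are present in the
# documented order before passing the data frame on.
check_known_loci <- function(x) {
  is.data.frame(x) &&
    ncol(x) == 4 &&
    identical(names(x), c("CHR", "POS", "REF", "ALT"))
}

print(check_known_loci(known_loci))
```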

rare_maf_cutoff

the maximum minor allele frequency cutoff for defining rare variants (default = 0.01).

rv_num_cutoff

the minimum number of variants required to analyze a given variant-set (default = 2).

rv_num_cutoff_max

the maximum number of variants allowed when analyzing a given variant-set (default = 1e+09).

rv_num_cutoff_max_prefilter

the maximum number of variants allowed before extracting the genotype matrix (default = 1e+09).
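How rare_maf_cutoff, rv_num_cutoff, and rv_num_cutoff_max might interact can be sketched on a toy genotype matrix in base R. The matrix is invented, and two details are assumptions rather than documented behaviour: monomorphic variants (MAF = 0) are excluded, and both count cutoffs are treated as inclusive bounds. The prefilter cutoff is not shown because it applies before genotypes are extracted.

```r
rare_maf_cutoff   <- 0.01
rv_num_cutoff     <- 2
rv_num_cutoff_max <- 1e+09

# Toy genotypes: rows are samples, columns are variants,
# entries are alternative-allele counts (0, 1, or 2).
n <- 100
geno <- cbind(
  v1 = c(1, rep(0, n - 1)),            # one carrier: rare
  v2 = c(rep(1, 40), rep(0, n - 40)),  # common variant
  v3 = rep(0, n),                      # monomorphic
  v4 = c(0, 1, rep(0, n - 2)),         # one carrier: rare
  v5 = c(1, rep(2, n - 1))             # alternative allele nearly fixed: rare
)

# Minor allele frequency is the smaller of the two allele frequencies.
alt_freq <- colMeans(geno, na.rm = TRUE) / 2
maf <- pmin(alt_freq, 1 - alt_freq)

# Assumed rule: rare means 0 < MAF < rare_maf_cutoff.
is_rare <- maf > 0 & maf < rare_maf_cutoff
n_rare  <- sum(is_rare)

# Assumed rule: analyze the set only if the rare-variant count is in range.
analyze_set <- n_rare >= rv_num_cutoff && n_rare <= rv_num_cutoff_max

print(names(which(is_rare)))
print(analyze_set)
```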

method_cond

a character value indicating the method for conditional analysis. "optimal" regresses the residuals from the null model on known_loci together with all covariates used in fitting the null model (fully adjusted) and takes the resulting residuals; "naive" regresses the residuals from the null model on known_loci only and takes the resulting residuals (default = "optimal").
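The difference between the two method_cond options can be sketched with ordinary linear regression in base R. This is a simplified illustration, not the package's code: it assumes unrelated samples, a quantitative trait, and a single known variant, and all data are simulated. The variable names are hypothetical.

```r
set.seed(42)
n  <- 500
x1 <- rnorm(n)                                  # continuous covariate
x2 <- rbinom(n, 1, 0.5)                         # binary covariate
g_known <- rbinom(n, 2, plogis(-1 + 0.8 * x1))  # known variant, correlated with x1
y  <- 0.5 * x1 + 0.3 * x2 + 0.4 * g_known + rnorm(n)

# Null model: phenotype on covariates only.
null_resid <- resid(lm(y ~ x1 + x2))

# "naive": regress null-model residuals on the known variant only.
res_naive <- resid(lm(null_resid ~ g_known))

# "optimal" (fully adjusted): regress null-model residuals on the known
# variant and on all covariates from the null model.
res_optimal <- resid(lm(null_resid ~ g_known + x1 + x2))

# In this linear setting the fully adjusted residuals coincide with those
# from a single joint regression of y on the variant and the covariates.
res_joint <- resid(lm(y ~ g_known + x1 + x2))
print(all.equal(unname(res_optimal), unname(res_joint)))

# The fully adjusted residuals are orthogonal to the covariates;
# the naive residuals generally are not.
print(c(optimal = sum(res_optimal * x1), naive = sum(res_naive * x1)))
```

In this sketch the naive residuals remain correlated with x1 because the known variant is itself correlated with that covariate, which is the situation the fully adjusted option is meant to handle.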

QC_label

channel name of the QC label in the GDS/aGDS file (default = "annotation/filter").

variant_type

type of variant included in the analysis. Choices include "SNV", "Indel", or "variant" (default = "SNV").

geno_missing_imputation

method of handling missing genotypes. Either "mean" or "minor" (default = "mean").

Annotation_dir

channel name of the annotations in the aGDS file (default = "annotation/info/FunctionalAnnotation").

Annotation_name_catalog

a data frame containing each annotation name and its corresponding channel name in the aGDS file.

Use_annotation_weights

whether to use annotations as weights (default = TRUE).

Annotation_name

a vector of annotation names used in STAAR (default = NULL).

Value

A data frame containing the conditional STAAR p-values (including STAAR-O) corresponding to each coding functional category of the given gene.

References

Li, Z., Li, X., et al. (2022). A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nature Methods, 19(12), 1599-1611.

Li, X., Li, Z., et al. (2020). Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nature Genetics, 52(9), 969-983.

Sofer, T., et al. (2019). A fully adjusted two-stage procedure for rank-normalization in genetic association studies. Genetic Epidemiology, 43(3), 263-275.


xihaoli/STAARpipeline documentation built on Feb. 9, 2025, 12:39 a.m.