Gene_Centric_Noncoding_Info: Functionally annotate rare variants in a noncoding mask

Gene_Centric_Noncoding_Info    R Documentation

Functionally annotate rare variants in a noncoding mask

Description

The Gene_Centric_Noncoding_Info function takes a noncoding mask of a gene and functionally annotates the rare variants in that mask.

Usage

Gene_Centric_Noncoding_Info(
  category = c("downstream", "upstream", "UTR", "promoter_CAGE", "promoter_DHS",
    "enhancer_CAGE", "enhancer_DHS", "ncRNA"),
  chr,
  genofile,
  obj_nullmodel,
  gene_name,
  known_loci = NULL,
  rare_maf_cutoff = 0.01,
  method_cond = c("optimal", "naive"),
  QC_label = "annotation/filter",
  variant_type = c("SNV", "Indel", "variant"),
  geno_missing_imputation = c("mean", "minor"),
  Annotation_dir = "annotation/info/FunctionalAnnotation",
  Annotation_name_catalog,
  Annotation_name
)

Arguments

category

the noncoding functional category to be functionally annotated. Choices include downstream, upstream, UTR, promoter_CAGE, promoter_DHS, enhancer_CAGE, enhancer_DHS, ncRNA (default = downstream).

chr

chromosome.

genofile

an object of an opened annotated GDS (aGDS) file.

obj_nullmodel

an object from fitting the null model, which is either the output from the fit_nullmodel function in the STAARpipeline package, or the output from the fitNullModel function in the GENESIS package, transformed using the genesis2staar_nullmodel function in the STAARpipeline package.

gene_name

name of the gene to be annotated.

known_loci

a data frame of variants to be adjusted for in conditional analysis. It should contain 4 columns in the following order: chromosome (CHR), position (POS), reference allele (REF), and alternative allele (ALT) (default = NULL).
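
As an illustration of the four-column layout described above, the following is a minimal sketch of building a known_loci data frame in base R. The positions and alleles are placeholder values invented for this example, and the subsetting step by chromosome is an assumption about how such a table might be prepared, not a description of what the function does internally.

```r
# Hypothetical variants to condition on; all values are placeholders.
known_loci <- data.frame(
  CHR = c(1L, 1L, 2L),
  POS = c(1000000L, 2000000L, 3000000L),
  REF = c("G", "C", "A"),
  ALT = c("T", "A", "G"),
  stringsAsFactors = FALSE
)

# Optionally keep only the loci on the chromosome being analysed
# (chr = 1 is an assumed value for this sketch).
chr <- 1
known_loci_chr <- known_loci[known_loci$CHR == chr, , drop = FALSE]
```

The column order matters here because the argument is documented by position (CHR, POS, REF, ALT), so it is safest to construct the data frame in exactly that order.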

rare_maf_cutoff

the cutoff of maximum minor allele frequency in defining rare variants (default = 0.01).
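
To make the role of the cutoff concrete, here is a small sketch that computes minor allele frequencies from a toy genotype matrix and flags rare variants. It assumes that a variant counts as rare when its MAF is strictly below the cutoff and strictly above zero; whether the package uses exactly these inequalities is an assumption. The cutoff of 0.25 is chosen only because the toy sample is tiny.

```r
# Toy genotype matrix: rows are samples, columns are variants,
# entries are counts of the alternative allele (0, 1 or 2).
G <- matrix(
  c(0, 0, 1, 0,
    0, 1, 2, 0,
    0, 0, 1, 0,
    1, 0, 2, 0,
    0, 0, 1, 0),
  nrow = 5, byrow = TRUE
)

rare_maf_cutoff <- 0.25  # illustrative value, not the default of 0.01

AF  <- colMeans(G) / 2     # alternative allele frequency per variant
MAF <- pmin(AF, 1 - AF)    # minor allele frequency per variant

# Assumed definition: polymorphic and below the cutoff.
is_rare <- MAF > 0 & MAF < rare_maf_cutoff
```

In this toy data the first two variants are flagged as rare, the third is common, and the fourth is monomorphic and therefore excluded.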

method_cond

a character value indicating the method for conditional analysis. optimal refers to regressing residuals from the null model on known_loci as well as all covariates used in fitting the null model (fully adjusted) and taking the residuals; naive refers to regressing residuals from the null model on known_loci and taking the residuals (default = optimal).
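
The difference between the two options can be sketched with ordinary linear regression on simulated data. This is only an illustration of the idea described above: the variable names and the simulation are invented for the example, and the package works with its own fitted null model object rather than with lm.

```r
set.seed(1)
n <- 200
x       <- rnorm(n)                       # covariate used in the null model
g_known <- rbinom(n, 2, plogis(x - 1))    # known locus, correlated with x
y       <- 1 + 0.5 * x + 0.4 * g_known + rnorm(n)

# Residuals from the null model (phenotype regressed on covariates only).
r0 <- resid(lm(y ~ x))

# "naive": regress the null-model residuals on the known locus only.
res_naive <- resid(lm(r0 ~ g_known))

# "optimal": regress the null-model residuals on the known locus
# together with the null-model covariates (fully adjusted).
res_optimal <- resid(lm(r0 ~ g_known + x))
```

Under this sketch the fully adjusted residuals are orthogonal to both the known locus and the covariate, whereas the naive residuals are only guaranteed to be orthogonal to the known locus.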

QC_label

channel name of the QC label in the GDS/aGDS file (default = "annotation/filter").

variant_type

type of variant included in the conditional analysis. Choices include "SNV", "Indel", or "variant" (default = "SNV").

geno_missing_imputation

method of handling missing genotypes. Either "mean" or "minor" (default = "mean").
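
The following sketch shows one plausible reading of the two options on a toy genotype matrix. It assumes that "mean" replaces a missing genotype with the mean observed dosage of that variant, and that "minor" replaces it with the homozygous major-allele genotype (zero copies of the minor allele). Both the helper impute_genotypes and this interpretation of "minor" are assumptions made for illustration, not the package's code.

```r
# Toy genotype matrix with missing entries (NA).
G <- matrix(
  c( 0, NA,
     1,  0,
    NA,  0,
     0,  2,
     1,  0),
  nrow = 5, byrow = TRUE
)

# Hypothetical helper: impute missing genotypes column by column.
impute_genotypes <- function(G, method = c("mean", "minor")) {
  method <- match.arg(method)
  for (j in seq_len(ncol(G))) {
    miss <- is.na(G[, j])
    if (!any(miss) || all(miss)) next      # nothing to do, or nothing to learn from
    if (method == "mean") {
      G[miss, j] <- mean(G[!miss, j])      # mean observed dosage
    } else {
      af <- mean(G[!miss, j]) / 2          # alternative allele frequency
      G[miss, j] <- if (af <= 0.5) 0 else 2  # homozygous for the major allele
    }
  }
  G
}

G_mean  <- impute_genotypes(G, "mean")
G_minor <- impute_genotypes(G, "minor")
```

Mean imputation keeps the estimated allele frequency unchanged, while the second option treats missing calls as carrying no minor allele, which is the more conservative choice for rare variants.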

Annotation_dir

channel name of the annotations in the aGDS file (default = "annotation/info/FunctionalAnnotation").

Annotation_name_catalog

a data frame containing the annotation name and the corresponding channel name in the aGDS file.

Annotation_name

a vector of the names of the qualitative/quantitative annotations the user wants to extract.
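
The three annotation arguments work together: Annotation_name selects entries from Annotation_name_catalog, whose channel names are located under Annotation_dir. A minimal sketch of that lookup follows. The catalog column names (name, dir), the annotation names, and the channel paths are all assumptions made for this example and should be replaced by the contents of the catalog that matches your aGDS file.

```r
Annotation_dir <- "annotation/info/FunctionalAnnotation"

# Hypothetical catalog: column names and values are assumptions.
Annotation_name_catalog <- data.frame(
  name = c("CADD", "LINSIGHT", "aPC.Conservation"),
  dir  = c("/CADD", "/LINSIGHT", "/APC/conservation"),
  stringsAsFactors = FALSE
)

# Annotations the user wants to extract.
Annotation_name <- c("CADD", "aPC.Conservation")

# Look up each requested name and fail early if one is not in the catalog.
idx <- match(Annotation_name, Annotation_name_catalog$name)
stopifnot(!anyNA(idx))

# Full channel paths inside the aGDS file under the assumed layout.
channels <- paste0(Annotation_dir, Annotation_name_catalog$dir[idx])
```

Checking the requested names against the catalog before calling the function makes a misspelled annotation name fail with a clear error rather than deep inside the annotation step.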

Value

a data frame containing the basic information (chromosome, position, reference allele, and alternative allele), the unconditional and conditional score test p-values (not provided for the imbalanced case-control setting), and the annotation scores for the rare variants in the input noncoding mask.

References

Li, Z., Li, X., et al. (2022). A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nature Methods, 19(12), 1599-1611.


xihaoli/STAARpipelineSummary documentation built on Oct. 20, 2024, 9:35 p.m.