Single_Variants_List_Analysis: Calculate individual-variant p-values of a list of variants

View source: R/Single_Variants_List_Analysis.R

Single_Variants_List_Analysis    R Documentation

Calculate individual-variant p-values of a list of variants

Description

The Single_Variants_List_Analysis function takes a list of variants and calculates the p-value and effect size of each input variant (effect size estimates are not provided in the imbalanced case-control setting). Note: this function only supports null models fitted using a sparse GRM.

Usage

Single_Variants_List_Analysis(
  agds_dir,
  single_variants_list,
  obj_nullmodel,
  QC_label = "annotation/filter",
  geno_missing_imputation = c("mean", "minor"),
  p_filter_cutoff = 0.05,
  tol = .Machine$double.eps^0.25,
  max_iter = 1000
)

Arguments

agds_dir

file directory of annotated GDS (aGDS) files for all chromosomes (1-22).

single_variants_list

a data frame containing the information of the variants to be analyzed. The data frame must include 4 columns with the following names: "CHR" (chromosome number), "POS" (position), "REF" (reference allele), and "ALT" (alternative allele).
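As an illustration of the expected input, the following minimal sketch builds such a data frame in base R and checks that the four required columns are present. The chromosomes, positions, and alleles are made-up placeholders, and the column check is a convenience added here, not something the package is known to perform.

```r
# Hypothetical variants: positions and alleles are placeholders, not real findings.
single_variants_list <- data.frame(
  CHR = c(1, 1, 19),
  POS = c(100000, 250000, 500000),
  REF = c("G", "C", "T"),
  ALT = c("T", "A", "C"),
  stringsAsFactors = FALSE
)

# Confirm the four required column names before passing the data frame on.
required_cols <- c("CHR", "POS", "REF", "ALT")
missing_cols <- setdiff(required_cols, colnames(single_variants_list))
if (length(missing_cols) > 0) {
  stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
}
```

The resulting object would then be supplied as the single_variants_list argument.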

obj_nullmodel

an object from fitting the null model: either the output of the fit_nullmodel function in the STAARpipeline package, or the output of the fitNullModel function in the GENESIS package after transformation with the genesis2staar_nullmodel function in the STAARpipeline package.

QC_label

channel name of the QC label in the GDS/aGDS file (default = "annotation/filter").

geno_missing_imputation

method of handling missing genotypes. Either "mean" or "minor" (default = "mean").
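To make the "mean" option concrete, here is a minimal base R sketch of mean imputation on a toy genotype matrix. It assumes that "mean" replaces each missing genotype with the average of the observed genotypes for that variant; the helper impute_mean is hypothetical, and the package's internal routine may differ in detail. The "minor" option is not illustrated.

```r
# Toy genotype matrix: rows are samples, columns are variants, NA is a missing call.
G <- matrix(c(0, 1, NA, 2,
              0, NA, 0, 1),
            nrow = 4, ncol = 2)

# Hypothetical helper (not part of the package): replace each missing entry
# with the mean of the observed genotypes in the same column.
impute_mean <- function(G) {
  for (j in seq_len(ncol(G))) {
    miss <- is.na(G[, j])
    G[miss, j] <- mean(G[!miss, j])
  }
  G
}

G_imputed <- impute_mean(G)
```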

p_filter_cutoff

p-value threshold below which the p-value is recalculated using the saddlepoint approximation (SPA) method (default = 0.05).

tol

a positive number specifying the tolerance: iterations of the saddlepoint approximation algorithm stop once the difference between successive parameter estimates falls below this threshold (default = .Machine$double.eps^0.25).

max_iter

a positive integer specifying the maximum number of iterations for applying the saddlepoint approximation algorithm (default = 1000).
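The roles of p_filter_cutoff, tol, and max_iter can be sketched with a generic saddlepoint approximation for a binary-trait score statistic. This is an illustration under stated assumptions, not the package's implementation: spa_pvalue is a hypothetical helper, the null model is reduced to a vector mu of fitted case probabilities, genotypes are assumed non-negative, and the saddlepoint is located with base R's uniroot (whose tol and maxiter arguments happen to share the defaults above). The tail probability uses the Barndorff-Nielsen form of the approximation.

```r
# Hypothetical helper, not part of STAARpipelineSummary.
# g: genotypes (assumed >= 0); y: 0/1 phenotype; mu: fitted null probabilities.
spa_pvalue <- function(g, y, mu, p_filter_cutoff = 0.05,
                       tol = .Machine$double.eps^0.25, max_iter = 1000) {
  s  <- sum(g * (y - mu))                     # score statistic
  v0 <- sum(g^2 * mu * (1 - mu))              # its variance under the null
  p_normal <- 2 * pnorm(-abs(s) / sqrt(v0))   # normal-approximation p-value
  # Only recalculate when the normal-approximation p-value is below the cutoff.
  if (p_normal >= p_filter_cutoff) return(p_normal)

  # Cumulant generating function of the score and its first two derivatives.
  K  <- function(t) sum(log(1 - mu + mu * exp(g * t))) - t * sum(g * mu)
  K1 <- function(t) sum(g * mu * exp(g * t) / (1 - mu + mu * exp(g * t))) - sum(g * mu)
  K2 <- function(t) sum(g^2 * mu * (1 - mu) * exp(g * t) / (1 - mu + mu * exp(g * t))^2)

  # Range of values the score can take when g >= 0.
  s_min <- -sum(g * mu)
  s_max <- sum(g * (1 - mu))

  tail_prob <- function(q, upper) {
    # No probability mass lies beyond the support in the requested direction.
    outside <- if (upper) q >= s_max else q <= s_min
    if (outside) return(0)
    # Solve K'(t) = q; tol and max_iter control this root search.
    t_hat <- uniroot(function(t) K1(t) - q, interval = c(-10, 10),
                     extendInt = "yes", tol = tol, maxiter = max_iter)$root
    w <- sign(t_hat) * sqrt(2 * (t_hat * q - K(t_hat)))
    v <- t_hat * sqrt(K2(t_hat))
    pnorm(w + log(v / w) / w, lower.tail = !upper)
  }

  # Two-sided p-value: mass above |s| plus mass below -|s|.
  tail_prob(abs(s), upper = TRUE) + tail_prob(-abs(s), upper = FALSE)
}

# Toy imbalanced setting: 200 samples, 10 carriers, 4 of the carriers are cases.
n  <- 200
g  <- c(rep(1, 10), rep(0, n - 10))
mu <- rep(0.05, n)
y  <- c(rep(1, 4), rep(0, n - 4))
p_spa <- spa_pvalue(g, y, mu)
```

In this toy setting the normal approximation understates the p-value by several orders of magnitude, which is the situation the recalculation is meant to address. Variants whose normal-approximation p-value is at or above p_filter_cutoff skip the root search entirely.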

Value

a data frame containing the basic information (chromosome, position, reference allele, and alternative allele), the score test p-values, and the effect sizes for the input variants.

References

Li, Z., Li, X., et al. (2022). A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nature Methods, 19(12), 1599-1611.


xihaoli/STAARpipelineSummary documentation built on July 27, 2024, 4:30 p.m.