View source: R/Individual_Analysis_cond_spa.R

Individual_Analysis_cond_spa  R Documentation

Individual-variant conditional analysis using score test for imbalanced case-control setting

Description

The Individual_Analysis_cond_spa function takes in chromosome, the data frame of individual variants to be analyzed, the object of opened annotated GDS file, and the object from fitting the null model to perform conditional analysis of the association between an imbalanced case-control phenotype and each of those individual variants by using score test.

Usage

Individual_Analysis_cond_spa(
  chr,
  individual_results,
  genofile,
  obj_nullmodel,
  QC_label = "annotation/filter",
  variant_type = c("variant", "SNV", "Indel"),
  geno_missing_imputation = c("mean", "minor"),
  tol = .Machine$double.eps^0.25,
  max_iter = 1000,
  SPA_p_filter = FALSE,
  p_filter_cutoff = 0.05
)

Arguments

chr

chromosome.

individual_results

the data frame of (significant) individual variants for conditional analysis using score test. The first 4 columns should correspond to chromosome (CHR), position (POS), reference allele (REF), and alternative allele (ALT).
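As a minimal sketch of the expected layout, the data frame below has the four leading columns in the order described above. The variants themselves are hypothetical and only illustrate the column names and types; in practice this input would typically come from a previous individual-variant analysis and may carry additional columns.

```r
# Hypothetical variants; positions and alleles are made up for illustration
individual_results <- data.frame(
  CHR = c(1, 1),
  POS = c(1000123, 1002456),
  REF = c("A", "G"),
  ALT = c("G", "T"),
  stringsAsFactors = FALSE
)

# The first 4 columns must be CHR, POS, REF, ALT, in that order
names(individual_results)[1:4]
```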

genofile

an object of opened annotated GDS (aGDS) file.

obj_nullmodel

an object from fitting the null model, which is either the output from fit_nullmodel function, or the output from fitNullModel function in the GENESIS package and transformed using the genesis2staar_nullmodel function.

QC_label

channel name of the QC label in the GDS/aGDS file (default = "annotation/filter").

variant_type

type of variant included in the analysis. Choices include "variant", "SNV", or "Indel" (default = "variant").

geno_missing_imputation

method of handling missing genotypes. Either "mean" or "minor" (default = "mean").

tol

a positive number specifying the tolerance, i.e. the difference threshold for parameter estimates in the saddlepoint approximation algorithm below which iterations are stopped (default = .Machine$double.eps^0.25).
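For orientation, the default tolerance can be evaluated directly in R. On platforms using IEEE 754 double precision, .Machine$double.eps is 2^-52, so the default is 2^-13, roughly 1.22e-4.

```r
# Default tolerance as written in the Usage section
tol_default <- .Machine$double.eps^0.25

# On IEEE 754 doubles this equals 2^-13
print(tol_default)
```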

max_iter

a positive integer specifying the maximum number of iterations for applying the saddlepoint approximation algorithm (default = 1000).

SPA_p_filter

logical: should the SPA method be used to recalculate the p-value only for variants whose score-test-based p-value is smaller than a pre-specified threshold (default = FALSE).

p_filter_cutoff

score-test p-value threshold below which the p-value is recalculated using the SPA method (default = 0.05).
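The interaction between SPA_p_filter and p_filter_cutoff can be sketched as follows. This illustrates the filtering idea only and is not the package's internal code; the p-values and the variable names p_score and use_spa are hypothetical.

```r
# Hypothetical score-test p-values for five variants (illustrative only)
p_score <- c(0.80, 0.04, 0.30, 1e-6, 0.05)
SPA_p_filter <- TRUE
p_filter_cutoff <- 0.05

# With filtering on, only p-values strictly below the cutoff are flagged
# for SPA recalculation; with filtering off, every variant is flagged.
use_spa <- if (SPA_p_filter) {
  p_score < p_filter_cutoff
} else {
  rep(TRUE, length(p_score))
}

# Indices of the variants that would be recalculated with SPA
which(use_spa)
```

Under this reading, setting SPA_p_filter = TRUE limits the more expensive SPA recalculation to variants that already look promising under the ordinary score test.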

Value

A data frame containing the score test p-value and the estimated effect size of the minor allele for each individual variant in the given genetic region. The first 4 columns correspond to chromosome (CHR), position (POS), reference allele (REF), and alternative allele (ALT).

References

Chen, H., et al. (2016). Control for population structure and relatedness for binary traits in genetic association studies via logistic mixed models. The American Journal of Human Genetics, 98(4), 653-666.

Li, Z., Li, X., et al. (2022). A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nature Methods, 19(12), 1599-1611.


xihaoli/STAARpipeline documentation built on Feb. 9, 2025, 12:39 a.m.