Individual_Analysis_cond: Individual-variant conditional analysis using score test

Individual_Analysis_cond    R Documentation

Description

The Individual_Analysis_cond function analyzes the conditional association between a quantitative/dichotomous phenotype and each (significant) individual variant using a score test. It takes as input the data frame of individual variants, the object of an opened annotated GDS file, the object from fitting the null model, and the set of known variants to be adjusted for in the conditional analysis. For multiple-phenotype analysis (obj_nullmodel$n.pheno > 1), the results are multi-trait conditional score test p-values that leverage the correlation structure between the phenotypes.
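As a rough illustration of what a conditional score test computes, the sketch below uses simulated data and base R only. It assumes unrelated samples and a linear null model, and it places the known variant directly in the null design. The helper score_test and all variable names are hypothetical, and this is not the package's implementation.

```r
set.seed(1)
n <- 500
x       <- rnorm(n)             # one covariate (simulated)
g_known <- rbinom(n, 2, 0.3)    # known variant to adjust for (simulated)
g_test  <- rbinom(n, 2, 0.2)    # variant being tested (simulated)
y       <- 0.5 * x + 0.4 * g_known + rnorm(n)

# Hypothetical helper: score test for one variant g, given a null design
# matrix that already contains the covariates and the known variant.
score_test <- function(y, g, null_design) {
  r       <- lm.fit(null_design, y)$residuals   # phenotype residuals under the null
  g_resid <- lm.fit(null_design, g)$residuals   # genotype with the null design projected out
  sigma2  <- sum(r^2) / (length(y) - ncol(null_design))
  U    <- sum(g_resid * r)                      # score statistic
  V    <- sigma2 * sum(g_resid^2)               # variance of the score
  stat <- U^2 / V                               # 1-df chi-square statistic
  c(est    = U / sum(g_resid^2),                # effect size estimate
    stat   = stat,
    pvalue = pchisq(stat, df = 1, lower.tail = FALSE))
}

null_design <- cbind(intercept = 1, x = x, g_known = g_known)
res <- score_test(y, g_test, null_design)
print(res)
```

In this linear setting the effect estimate equals the ordinary least squares coefficient of g_test in the full model, while only the null model has to be fitted to the phenotype.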

Usage

Individual_Analysis_cond(
  chr,
  individual_results,
  genofile,
  obj_nullmodel,
  known_loci = NULL,
  method_cond = c("optimal", "naive"),
  QC_label = "annotation/filter",
  variant_type = c("variant", "SNV", "Indel"),
  geno_missing_imputation = c("mean", "minor"),
  geno_position_ascending = TRUE
)

Arguments

chr

chromosome.

individual_results

the data frame of (significant) individual variants to be tested in the conditional analysis using a score test. The first 4 columns should be, in order, chromosome (CHR), position (POS), reference allele (REF), and alternative allele (ALT).

genofile

an object of opened annotated GDS (aGDS) file.

obj_nullmodel

an object from fitting the null model: either the output of the fit_nullmodel function, or the output of the fitNullModel function in the GENESIS package after transformation with the genesis2staar_nullmodel function.

known_loci

the data frame of variants to be adjusted for in conditional analysis. It should contain 4 columns in the following order: chromosome (CHR), position (POS), reference allele (REF), and alternative allele (ALT) (default = NULL).
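A minimal sketch of how the two input data frames might be laid out, using made-up placeholder variants. The positions, alleles, and the extra pvalue column are hypothetical and serve only to show the required column order.

```r
# Placeholder variants to be tested; the first 4 columns follow the
# required order CHR, POS, REF, ALT. The pvalue column is a hypothetical extra.
individual_results <- data.frame(
  CHR = c(19, 19),
  POS = c(100000, 200000),
  REF = c("T", "C"),
  ALT = c("C", "T"),
  pvalue = c(1e-12, 3e-9),
  stringsAsFactors = FALSE
)

# Placeholder known variant to adjust for, with the same 4-column layout.
known_loci <- data.frame(
  CHR = 19,
  POS = 150000,
  REF = "G",
  ALT = "A",
  stringsAsFactors = FALSE
)

# Simple layout check that can be run before the analysis.
required <- c("CHR", "POS", "REF", "ALT")
layout_ok <- identical(names(individual_results)[1:4], required) &&
  identical(names(known_loci)[1:4], required)
print(layout_ok)
```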

method_cond

a character value indicating the method for conditional analysis. "optimal" regresses the residuals from the null model on known_loci together with all covariates used in fitting the null model (fully adjusted) and takes the resulting residuals; "naive" regresses the residuals from the null model on known_loci only and takes the resulting residuals (default = "optimal").
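The difference between the two choices can be sketched with simulated data and base R. This assumes unrelated samples and a linear null model with a single covariate. All variable names are hypothetical, and the code illustrates the two regressions described above rather than the package's internals.

```r
set.seed(2)
n <- 400
x       <- rnorm(n)                          # covariate in the null model (simulated)
g_known <- rbinom(n, 2, plogis(x - 1))       # known variant, correlated with x
y       <- 0.5 * x + 0.4 * g_known + rnorm(n)

# Residuals from the null model (covariates only).
null_resid <- resid(lm(y ~ x))

# "optimal": regress the null-model residuals on the known variant AND the covariates.
resid_optimal <- resid(lm(null_resid ~ g_known + x))

# "naive": regress the null-model residuals on the known variant only.
resid_naive <- resid(lm(null_resid ~ g_known))

# How far each set of adjusted residuals is from being orthogonal to x.
print(c(optimal = sum(resid_optimal * x), naive = sum(resid_naive * x)))
```

In this linear setting the fully adjusted residuals coincide with those from fitting y on the covariates and the known variant jointly, whereas the naive residuals can remain correlated with the covariates when the known variant is itself correlated with them.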

QC_label

channel name of the QC label in the GDS/aGDS file (default = "annotation/filter").

variant_type

type of variant included in the analysis. Choices include "variant", "SNV", or "Indel" (default = "variant").

geno_missing_imputation

method of handling missing genotypes. Either "mean" or "minor" (default = "mean").

geno_position_ascending

logical: whether the variant positions are in ascending order in the GDS/aGDS file (default = TRUE).

Value

A data frame containing the conditional score test p-value and the estimated effect size of the minor allele for each (significant) individual variant in individual_results.

References

Chen, H., et al. (2016). Control for population structure and relatedness for binary traits in genetic association studies via logistic mixed models. The American Journal of Human Genetics, 98(4), 653-666.

Sofer, T., et al. (2019). A fully adjusted two-stage procedure for rank-normalization in genetic association studies. Genetic Epidemiology, 43(3), 263-275.

Li, Z., Li, X., et al. (2022). A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nature Methods, 19(12), 1599-1611.


xihaoli/STAARpipeline documentation built on Feb. 9, 2025, 12:39 a.m.