STAAR_cond: STAAR procedure for conditional analysis using omnibus test


STAAR_cond R Documentation

STAAR procedure for conditional analysis using omnibus test

Description

The STAAR_cond function takes the genotype of a variant-set, the genotype of the variants to be adjusted for in conditional analysis, the object from fitting the null model, and functional annotation data, and analyzes the conditional association between a quantitative or dichotomous phenotype and the variant-set using the STAAR procedure, adjusting for a given list of variants. For each variant-set, the conditional STAAR-O p-value is the p-value from an omnibus test that aggregates the conditional SKAT(1,25), SKAT(1,1), Burden(1,25), Burden(1,1), ACAT-V(1,25), and ACAT-V(1,1) p-values, together with the conditional p-values of each test weighted by each annotation, using the Cauchy method.
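The Cauchy method mentioned above can be illustrated with a minimal base R sketch. The helper name cauchy_combine and the example p-values are hypothetical and chosen for illustration only; this is not the package's internal implementation, and it assumes all p-values lie strictly between 0 and 1.

```r
# Cauchy combination of p-values (hypothetical helper, equal weights by default).
# Each p-value is mapped to a standard Cauchy variate; the weighted average of
# these variates is mapped back to a single combined p-value.
cauchy_combine <- function(pvals, weights = rep(1, length(pvals))) {
  stopifnot(length(pvals) == length(weights),
            all(pvals > 0 & pvals < 1),
            all(weights >= 0), sum(weights) > 0)
  weights <- weights / sum(weights)                  # normalize weights to sum to 1
  stat <- sum(weights * tan((0.5 - pvals) * pi))     # weighted Cauchy statistic
  0.5 - atan(stat) / pi                              # upper-tail Cauchy probability
}

# Illustrative p-values standing in for the six conditional tests.
p_tests <- c(0.02, 0.30, 0.004, 0.50, 0.01, 0.20)
p_omni <- cauchy_combine(p_tests)
```

A single p-value is returned unchanged by this helper, which is a quick sanity check on the transformation.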

Usage

STAAR_cond(
  genotype,
  genotype_adj,
  obj_nullmodel,
  annotation_phred = NULL,
  rare_maf_cutoff = 0.01,
  rv_num_cutoff = 2,
  method_cond = c("optimal", "naive")
)

Arguments

genotype

an n*p genotype matrix (dosage matrix) of the target sequence, where n is the sample size and p is the number of genetic variants.

genotype_adj

an n*p_adj genotype matrix (dosage matrix) of the target sequence, where n is the sample size and p_adj is the number of genetic variants to be adjusted for in conditional analysis (or a vector of a single variant with length n if p_adj is 1).

obj_nullmodel

an object from fitting the null model, which is the output from either fit_null_glm function for unrelated samples or fit_null_glmmkin function for related samples. Note that fit_null_glmmkin is a wrapper of the glmmkin function from the GMMAT package.

annotation_phred

a data frame or matrix of functional annotation data of dimension p*q (or a vector of a single annotation score with length p). Continuous scores should be given on the PHRED score scale, where the PHRED score of the j-th variant is defined as -10*log10(rank(-score_j)/total) across the genome. Binary categorical scores should take values 0 or 1, where 1 is functional and 0 is non-functional. If not provided, STAAR will perform the SKAT(1,25), SKAT(1,1), Burden(1,25), Burden(1,1), ACAT-V(1,25), ACAT-V(1,1) and ACAT-O tests (default = NULL).
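As a small illustration of the PHRED definition above, the following base R sketch converts a toy vector of raw scores. The five raw scores are made up, and total is taken to be the length of this toy vector, whereas in practice the rank is computed across the genome.

```r
# Toy raw annotation scores (illustrative values only; higher = more functional).
raw_score <- c(0.1, 2.5, 0.7, 9.3, 4.0)

# PHRED scale: -10 * log10(rank(-score) / total).
# rank(-raw_score) gives rank 1 to the highest raw score.
total <- length(raw_score)
phred <- -10 * log10(rank(-raw_score) / total)

# The highest raw score receives the largest PHRED value,
# and the lowest raw score receives a PHRED value of 0.
```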

rare_maf_cutoff

the cutoff of maximum minor allele frequency in defining rare variants (default = 0.01).

rv_num_cutoff

the minimum number of variants required to analyze a given variant-set (default = 2).

method_cond

a character value indicating the method for conditional analysis. "optimal" regresses the residuals from the null model on genotype_adj together with all covariates used in fitting the null model (fully adjusted) and takes the resulting residuals; "naive" regresses the residuals from the null model on genotype_adj alone and takes the resulting residuals (default = "optimal").
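The difference between the two method_cond options can be sketched with ordinary linear regression in base R. This is a simplified illustration for unrelated samples and a quantitative phenotype, using simulated data and hypothetical variable names; it is not the package's implementation.

```r
set.seed(42)
n <- 500
x <- rnorm(n)                              # a covariate used in the null model
g_adj <- rbinom(n, 2, plogis(x - 2))       # adjusted variant, correlated with x
y <- 0.5 * x + 0.3 * g_adj + rnorm(n)      # simulated quantitative phenotype

# Null model: phenotype regressed on covariates only.
r <- resid(lm(y ~ x))

# "naive": regress null-model residuals on genotype_adj alone.
res_naive <- resid(lm(r ~ g_adj))

# "optimal" (fully adjusted): regress null-model residuals on genotype_adj
# together with the covariates from the null model.
res_optimal <- resid(lm(r ~ g_adj + x))
```

In this sketch the fully adjusted residuals are orthogonal to both g_adj and x, while the naive residuals are only guaranteed to be orthogonal to g_adj, which matters when the adjusted variant is correlated with the covariates.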

Value

A list with the following members:

num_variant: the number of variants with minor allele frequency > 0 and less than rare_maf_cutoff in the given variant-set that are used for the variant-set analysis using STAAR.

cMAC: the cumulative minor allele count of variants with minor allele frequency > 0 and less than rare_maf_cutoff in the given variant-set.

RV_label: the boolean vector indicating whether each variant in the given variant-set has minor allele frequency > 0 and less than rare_maf_cutoff.

results_STAAR_O_cond: the conditional STAAR-O p-value that aggregates the conditional SKAT(1,25), SKAT(1,1), Burden(1,25), Burden(1,1), ACAT-V(1,25), and ACAT-V(1,1) p-values, together with the conditional p-values of each test weighted by each annotation, using the Cauchy method.

results_ACAT_O_cond: the conditional ACAT-O p-value that aggregates the conditional SKAT(1,25), SKAT(1,1), Burden(1,25), Burden(1,1), ACAT-V(1,25), and ACAT-V(1,1) p-values using the Cauchy method.

results_STAAR_S_1_25_cond: a vector of conditional STAAR-S(1,25) p-values, including the conditional SKAT(1,25) p-value weighted by MAF, the conditional SKAT(1,25) p-values weighted by each annotation, and a conditional STAAR-S(1,25) p-value obtained by aggregating these p-values using the Cauchy method.

results_STAAR_S_1_1_cond: a vector of conditional STAAR-S(1,1) p-values, including the conditional SKAT(1,1) p-value weighted by MAF, the conditional SKAT(1,1) p-values weighted by each annotation, and a conditional STAAR-S(1,1) p-value obtained by aggregating these p-values using the Cauchy method.

results_STAAR_B_1_25_cond: a vector of conditional STAAR-B(1,25) p-values, including the conditional Burden(1,25) p-value weighted by MAF, the conditional Burden(1,25) p-values weighted by each annotation, and a conditional STAAR-B(1,25) p-value obtained by aggregating these p-values using the Cauchy method.

results_STAAR_B_1_1_cond: a vector of conditional STAAR-B(1,1) p-values, including the conditional Burden(1,1) p-value weighted by MAF, the conditional Burden(1,1) p-values weighted by each annotation, and a conditional STAAR-B(1,1) p-value obtained by aggregating these p-values using the Cauchy method.

results_STAAR_A_1_25_cond: a vector of conditional STAAR-A(1,25) p-values, including the conditional ACAT-V(1,25) p-value weighted by MAF, the conditional ACAT-V(1,25) p-values weighted by each annotation, and a conditional STAAR-A(1,25) p-value obtained by aggregating these p-values using the Cauchy method.

results_STAAR_A_1_1_cond: a vector of conditional STAAR-A(1,1) p-values, including the conditional ACAT-V(1,1) p-value weighted by MAF, the conditional ACAT-V(1,1) p-values weighted by each annotation, and a conditional STAAR-A(1,1) p-value obtained by aggregating these p-values using the Cauchy method.
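To make the num_variant, cMAC, and RV_label members described above concrete, here is a minimal base R sketch on a small, deterministic genotype matrix. The helper make_variant and the variant names are hypothetical; the sketch assumes no missing genotypes and that the coded allele is the minor allele for every variant, and it does not reproduce the package's own preprocessing.

```r
n <- 1000

# Hypothetical helper: a dosage vector with n_het heterozygotes and
# n_hom minor-allele homozygotes; everyone else carries 0 copies.
make_variant <- function(n_het, n_hom = 0) {
  c(rep(1, n_het), rep(2, n_hom), rep(0, n - n_het - n_hom))
}

genotype <- cbind(
  v1 = make_variant(4),        # MAF 0.002 (rare)
  v2 = make_variant(10, 1),    # MAF 0.006 (rare)
  v3 = make_variant(300, 50),  # MAF 0.2   (common)
  v4 = make_variant(0),        # MAF 0     (monomorphic)
  v5 = make_variant(16)        # MAF 0.008 (rare)
)

rare_maf_cutoff <- 0.01
af <- colMeans(genotype) / 2                 # allele frequency of the coded allele
maf <- pmin(af, 1 - af)                      # minor allele frequency

RV_label <- maf > 0 & maf < rare_maf_cutoff  # TRUE for rare, polymorphic variants
num_variant <- sum(RV_label)                 # 3 variants pass the filter
cMAC <- sum(genotype[, RV_label])            # 4 + 12 + 16 = 32 minor alleles
```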

References

Li, X., Li, Z., et al. (2020). Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nature Genetics, 52(9), 969-983.

Li, Z., Li, X., et al. (2022). A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nature Methods, 19(12), 1599-1611.

Liu, Y., et al. (2019). ACAT: A fast and powerful p value combination method for rare-variant analysis in sequencing studies. The American Journal of Human Genetics, 104(3), 410-421.

Li, Z., Li, X., et al. (2019). Dynamic scan procedure for detecting rare-variant association regions in whole-genome sequencing studies. The American Journal of Human Genetics, 104(5), 802-814.

Sofer, T., et al. (2019). A fully adjusted two-stage procedure for rank-normalization in genetic association studies. Genetic Epidemiology, 43(3), 263-275.


xihaoli/STAAR documentation built on Nov. 3, 2024, 9:34 p.m.