View source: R/Individual_Analysis.R
Individual_Analysis | R Documentation |
The Individual_Analysis function takes in the chromosome, starting location, ending location, the object of an opened annotated GDS file, and the object from fitting the null model, and analyzes the association between a quantitative/dichotomous phenotype (including imbalanced case-control design) and each individual variant in a genetic region using the score test.
For multiple phenotype analysis (obj_nullmodel$n.pheno > 1), the results correspond to multi-trait score test p-values, which leverage the correlation structure between the phenotypes.
Individual_Analysis(
  chr,
  start_loc,
  end_loc,
  genofile,
  obj_nullmodel,
  mac_cutoff = 20,
  subset_variants_num = 5000,
  QC_label = "annotation/filter",
  variant_type = c("variant", "SNV", "Indel"),
  geno_missing_imputation = c("mean", "minor"),
  tol = .Machine$double.eps^0.25,
  max_iter = 1000,
  SPA_p_filter = TRUE,
  p_filter_cutoff = 0.05
)
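A call following the signature above might look like the sketch below. It is only an illustration under stated assumptions: the package name STAARpipeline is inferred from the cited references and is not stated on this page, the file paths and the chromosome/position values are placeholders, and the null model is assumed to have been saved earlier as an RDS file. The guard makes the script a no-op when the packages or files are not available.

```r
# Placeholder inputs (assumptions): replace with your own aGDS file and
# a previously fitted null model object saved with saveRDS().
agds_path      <- "path/to/chr19.agds"
nullmodel_path <- "path/to/obj_nullmodel.rds"

# Only attempt the analysis when both packages and both files are present.
can_run <- requireNamespace("STAARpipeline", quietly = TRUE) &&
  requireNamespace("SeqArray", quietly = TRUE) &&
  file.exists(agds_path) && file.exists(nullmodel_path)

if (can_run) {
  genofile      <- SeqArray::seqOpen(agds_path)   # open the annotated GDS file
  obj_nullmodel <- readRDS(nullmodel_path)        # object from fitting the null model

  results <- tryCatch(
    STAARpipeline::Individual_Analysis(
      chr = 19,                      # hypothetical region
      start_loc = 44900000,
      end_loc = 45000000,
      genofile = genofile,
      obj_nullmodel = obj_nullmodel,
      mac_cutoff = 20,
      QC_label = "annotation/filter",
      variant_type = "variant",
      geno_missing_imputation = "mean"
    ),
    finally = SeqArray::seqClose(genofile)  # always close the file handle
  )

  print(head(results))
}
```

Arguments left out of the call (subset_variants_num, tol, max_iter, SPA_p_filter, p_filter_cutoff) fall back to the defaults shown in the usage.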
chr |
chromosome. |
start_loc |
starting location (position) of the genetic region for each individual variant to be analyzed using score test. |
end_loc |
ending location (position) of the genetic region for each individual variant to be analyzed using score test. |
genofile |
an object of opened annotated GDS (aGDS) file. |
obj_nullmodel |
an object from fitting the null model, which is either the output from |
mac_cutoff |
the cutoff of minimum minor allele count in defining individual variants (default = 20). |
subset_variants_num |
the number of variants to process in each subset at a time (default = 5000). |
QC_label |
channel name of the QC label in the GDS/aGDS file (default = "annotation/filter"). |
variant_type |
type of variant included in the analysis. Choices include "variant", "SNV", or "Indel" (default = "variant"). |
geno_missing_imputation |
method of handling missing genotypes. Either "mean" or "minor" (default = "mean"). |
tol |
a positive number specifying the tolerance: the threshold on the difference between successive parameter estimates in the saddlepoint approximation algorithm, below which iterations stop (default = .Machine$double.eps^0.25). |
max_iter |
a positive integer specifying the maximum number of iterations for the saddlepoint approximation algorithm (default = 1000). |
SPA_p_filter |
logical: whether the SPA method is used to recalculate the p-value only for variants whose score-test-based p-value is smaller than a pre-specified threshold; only used in the imbalanced case-control setting (default = TRUE). |
p_filter_cutoff |
threshold for the p-value recalculation using the SPA method; only used in the imbalanced case-control setting (default = 0.05). |
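The filtering and batching arguments above can be pictured with a small base R sketch. It is an illustration, not the package's internal code. The toy genotype matrix, the helper names (minor_allele_count, impute_mean, chunk_indices, select_for_spa), the inclusive comparison against mac_cutoff, and the reading of "mean" imputation as the per-variant mean of observed dosages are all assumptions made here. The "minor" option is not sketched because this page does not define it.

```r
# Toy genotype matrix (assumption): rows are samples, columns are variants,
# entries are alternate-allele dosages 0/1/2, NA marks a missing genotype.
geno <- cbind(
  v1 = c(0, 0, 0, 1, 0, 0),
  v2 = c(2, 2, 2, 1, 2, 2),
  v3 = c(0, 1, 1, 2, 0, NA),
  v4 = c(1, 1, 1, 1, 1, 1)
)

# Minor allele count per variant: the smaller of the alternate and
# reference allele counts among non-missing genotypes.
minor_allele_count <- function(g) {
  alt       <- colSums(g, na.rm = TRUE)
  n_alleles <- 2 * colSums(!is.na(g))
  pmin(alt, n_alleles - alt)
}

# Replace each missing genotype with the mean observed dosage of its variant.
impute_mean <- function(g) {
  col_means <- colMeans(g, na.rm = TRUE)
  idx <- which(is.na(g), arr.ind = TRUE)
  g[idx] <- col_means[idx[, "col"]]
  g
}

# Split n_items variant indices into consecutive subsets of at most chunk_size.
chunk_indices <- function(n_items, chunk_size) {
  idx <- seq_len(n_items)
  split(idx, ceiling(idx / chunk_size))
}

# Flag which variants would have their p-value recalculated by SPA.
select_for_spa <- function(p_score, SPA_p_filter = TRUE, p_filter_cutoff = 0.05) {
  if (SPA_p_filter) p_score < p_filter_cutoff else rep(TRUE, length(p_score))
}

mac  <- minor_allele_count(geno)       # v1 = 1, v2 = 1, v3 = 4, v4 = 6
keep <- mac >= 4                       # toy cutoff; the documented default is 20
geno_kept <- impute_mean(geno[, keep, drop = FALSE])

chunks  <- chunk_indices(12001, 5000)  # subset sizes 5000, 5000, 2001
use_spa <- select_for_spa(c(0.5, 0.04, 1e-6))  # FALSE TRUE TRUE
```

With SPA_p_filter = FALSE the helper flags every variant, which corresponds to recalculating all p-values rather than only those below p_filter_cutoff.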
A data frame containing the score test p-value and the estimated effect size of the minor allele for each individual variant in the given genetic region. The first 4 columns correspond to chromosome (CHR), position (POS), reference allele (REF), and alternative allele (ALT).
Chen, H., et al. (2016). Control for population structure and relatedness for binary traits in genetic association studies via logistic mixed models. The American Journal of Human Genetics, 98(4), 653-666.
Li, Z., Li, X., et al. (2022). A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nature Methods, 19(12), 1599-1611.