View source: R/Sliding_Window_Results_Summary.R
Sliding_Window_Results_Summary | R Documentation |
STAARpipeline package
The Sliding_Window_Results_Summary function takes in the results of the sliding window analysis, the object from fitting the null model, and the set of known variants to be adjusted for in conditional analysis. It summarizes the sliding window analysis results and analyzes the conditional association between a quantitative/dichotomous phenotype (including the imbalanced case-control setting) and the rare variants in the unconditionally significant genetic regions.
Sliding_Window_Results_Summary(
agds_dir,
jobs_num,
input_path,
output_path,
sliding_window_results_name,
obj_nullmodel,
known_loci = NULL,
cMAC_cutoff = 0,
method_cond = c("optimal", "naive"),
rare_maf_cutoff = 0.01,
QC_label = "annotation/filter",
variant_type = c("SNV", "Indel", "variant"),
geno_missing_imputation = c("mean", "minor"),
Annotation_dir = "annotation/info/FunctionalAnnotation",
Annotation_name_catalog,
Use_annotation_weights = FALSE,
Annotation_name = NULL,
alpha = 0.05,
manhattan_plot = FALSE,
QQ_plot = FALSE,
cond_null_model_name = NULL,
cond_null_model_dir = NULL,
SPA_p_filter = FALSE,
p_filter_cutoff = 0.05
)
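The vector-valued defaults in the signature above (method_cond, variant_type, geno_missing_imputation) follow the common R convention in which the first element of the vector is the default, which matches the documented defaults of "SNV" and "mean". A minimal sketch of that convention, assuming the choices are resolved with match.arg; resolve_choice is a hypothetical stand-in and not a STAARpipeline function:

```r
# Hypothetical stand-in, not a STAARpipeline function: shows how a choice
# vector in a signature resolves to its first element when left unset.
resolve_choice <- function(variant_type = c("SNV", "Indel", "variant")) {
  match.arg(variant_type)
}

default_choice  <- resolve_choice()          # argument omitted
explicit_choice <- resolve_choice("Indel")   # argument supplied
print(default_choice)
print(explicit_choice)
```

Passing a single value explicitly, as in the second call, avoids relying on the default.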
agds_dir |
file directory of annotated GDS (aGDS) files for all chromosomes (1-22). |
jobs_num |
a data frame containing the number of jobs for association analysis. The data frame must include a column named "sliding_window_num". |
input_path |
file directory of the sliding window analysis results. |
output_path |
file output directory of the summary results. |
sliding_window_results_name |
the file name of the input sliding window analysis results. |
obj_nullmodel |
an object from fitting the null model, which is either the output from |
known_loci |
a data frame of variants to be adjusted for in conditional analysis. It should contain 4 columns in the following order: chromosome (CHR), position (POS), reference allele (REF), and alternative allele (ALT) (default = NULL). |
cMAC_cutoff |
the cutoff for the minimum cumulative minor allele count of variants in the masks when summarizing the results (default = 0). |
method_cond |
a character value indicating the method for conditional analysis.
|
rare_maf_cutoff |
the cutoff of maximum minor allele frequency in defining rare variants (default = 0.01). |
QC_label |
channel name of the QC label in the GDS/aGDS file (default = "annotation/filter"). |
variant_type |
variants included in the conditional analysis. Choices include "variant", "SNV", or "Indel" (default = "SNV"). |
geno_missing_imputation |
method of handling missing genotypes. Either "mean" or "minor" (default = "mean"). |
Annotation_dir |
channel name of the annotations in the aGDS file (default = "annotation/info/FunctionalAnnotation"). |
Annotation_name_catalog |
a data frame containing the name and the corresponding channel name in the aGDS file. |
Use_annotation_weights |
use annotations as weights or not (default = FALSE). |
Annotation_name |
a vector of annotation names used in STAAR (default = NULL). |
alpha |
threshold to control the genome-wide (family-wise) error rate (default = 0.05). The p-value threshold is alpha/total number of sliding windows. |
manhattan_plot |
output manhattan plot or not (default = FALSE). |
QQ_plot |
output Q-Q plot or not (default = FALSE). |
cond_null_model_name |
the null model name for conditional analysis in the SPA setting, only used for imbalanced case-control setting (default = NULL). |
cond_null_model_dir |
the directory storing the null model for conditional analysis in the SPA setting, only used for imbalanced case-control setting (default = NULL). |
SPA_p_filter |
logical: should the SPA method be used to recalculate the p-value only for variants whose normal-approximation-based p-value is smaller than a pre-specified threshold? Only used for imbalanced case-control setting (default = FALSE). |
p_filter_cutoff |
threshold for the p-value recalculation using the SPA method, only used for imbalanced case-control setting (default = 0.05). |
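The two data-frame arguments described above, jobs_num and known_loci, can be sketched as follows. All values (job counts, positions, alleles) and the chr helper column are made up for illustration; only the column name sliding_window_num and the CHR, POS, REF, ALT column order come from the argument descriptions.

```r
# Illustrative values only; not taken from any real analysis.
# jobs_num: one row per chromosome, with the required sliding_window_num column.
jobs_num <- data.frame(
  chr = 1:22,                          # assumed helper column
  sliding_window_num = rep(10L, 22)    # hypothetical job counts
)

# known_loci: variants to adjust for, in the order CHR, POS, REF, ALT.
known_loci <- data.frame(
  CHR = c(1L, 19L),
  POS = c(1000000L, 2000000L),         # hypothetical positions
  REF = c("A", "G"),
  ALT = c("G", "T"),
  stringsAsFactors = FALSE
)

# Basic structural checks before passing these to the summary function.
stopifnot("sliding_window_num" %in% names(jobs_num))
stopifnot(identical(names(known_loci)[1:4], c("CHR", "POS", "REF", "ALT")))

total_jobs <- sum(jobs_num$sliding_window_num)
print(total_jobs)
```

Checking the column names up front is a cheap way to catch a misordered known_loci table before a long-running summary step.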
The function returns the following analysis results:
results_sliding_window_genome.Rdata: a matrix containing the STAAR p-values (including STAAR-O, or STAAR-B in the imbalanced case-control setting) of the sliding windows across the genome.
sliding_window_sig.Rdata and sliding_window_sig.csv: a matrix containing the unconditional STAAR p-values (including STAAR-O, or STAAR-B in the imbalanced case-control setting) of the significant sliding windows (unconditional p-value < alpha/total number of sliding windows).
sliding_window_sig_cond.Rdata and sliding_window_sig_cond.csv: a matrix containing the conditional STAAR p-values (including STAAR-O, or STAAR-B in the imbalanced case-control setting) of the significant sliding windows (available if known_loci is not NULL).
Manhattan plot (optional) and Q-Q plot (optional) of the sliding window analysis results.
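The significance rule used for the output files (unconditional p-value < alpha/total number of sliding windows) is a Bonferroni correction. A minimal sketch with made-up p-values follows; the data frame res and its column names are hypothetical and do not reflect the layout of the saved results.

```r
# Hypothetical results: one row per sliding window, with made-up p-values.
res <- data.frame(
  window  = 1:5,
  STAAR_O = c(1e-9, 0.2, 3e-3, 5e-12, 0.04)
)

alpha <- 0.05
threshold <- alpha / nrow(res)   # alpha / total number of sliding windows

# Keep windows whose unconditional p-value falls below the threshold.
sig <- res[res$STAAR_O < threshold, ]
print(threshold)
print(sig$window)
```

In a real analysis the denominator is the total number of sliding windows tested across the genome, so the threshold is far stricter than in this five-window toy example.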
Li, Z., Li, X., et al. (2022). A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nature Methods, 19(12), 1599-1611.