zimfrv | R Documentation |
Gene‐based association tests to model zero-inflated count data
This function performs gene‐based association tests between a set of SNPs/genes and zero-inflated count data using a ZIP regression, ZINB regression, or two-stage SKAT model framework.
zimfrv(
phenodata,
genedata,
genename = "NA",
weights = "Equal",
missing_cutoff = 0.15,
max_maf = 1,
model = "zip"
)
phenodata |
a data frame containing family and individual IDs for all individuals, a zero-inflated count phenotype, and a set of covariates. Each row represents a different individual. The first two columns are Family ID (FID) and Individual ID (IID), respectively. The third column must hold exactly one phenotype, which must be zero-inflated count data, i.e. non-negative integers such as neuritic plaque counts. Each remaining column represents a different covariate, e.g. age or sex. |
genedata |
a data frame containing family and individual IDs for all individuals as well as numeric genotype data. Each row represents a different individual. The first two columns are Family ID (FID) and Individual ID (IID), respectively. Each remaining column represents a separate gene/SNP marker. Genotypes should be coded as 0, 1, 2 and NA for AA, Aa, aa and missing, respectively. The Family ID (FID) and Individual ID (IID) of each row in 'genedata', derived from the PLINK-formatted files, should be in the same order as in 'phenodata'. The number of rows in 'genedata' should equal the number of rows in 'phenodata'. |
genename |
a character string of the name of a gene, e.g. "CETP". The name is case-sensitive. |
weights |
a character string naming a pre-specified variant weighting scheme (default="Equal"). "Equal" applies no weighting, "MadsenBrowning" applies the Madsen and Browning (2009) weight, and "Beta" applies the Beta weight. |
missing_cutoff |
a cutoff of the missing rates of SNPs (default=0.15). Any SNPs with missing rates higher than the cutoff will be excluded from the analysis. |
max_maf |
a cutoff of the maximum minor allele frequencies (MAF) (default=1, no cutoff). Any SNPs with MAF > cutoff will be excluded from the analysis. |
model |
character specification of zero-inflated count model family (default="zip"). "zip" represents Zero-Inflated Poisson model, "zinb" represents Zero-Inflated Negative Binomial model, "skat" represents the two-stage Sequence Kernel Association Test method. |
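As a rough illustration of the input layout and the SNP filters described in the arguments above, the following base-R sketch builds toy 'phenodata' and 'genedata' frames, checks that their IDs line up, and applies the missing-rate and MAF cutoffs by hand. All values, column names other than FID and IID, and the cutoff of 0.2 are made up for illustration. The filtering and weight formulas are assumptions based on common practice, not the package's internal code.

```r
# Toy inputs; every value here is invented for illustration.
phenodata <- data.frame(
  FID   = c("F1", "F2", "F3", "F4", "F5", "F6"),
  IID   = c("I1", "I2", "I3", "I4", "I5", "I6"),
  count = c(0L, 0L, 3L, 0L, 7L, 0L),  # zero-inflated count phenotype (column 3)
  age   = c(71, 80, 77, 69, 85, 74),  # covariates start at column 4
  sex   = c(1, 2, 2, 1, 1, 2)
)

genedata <- data.frame(
  FID  = phenodata$FID,
  IID  = phenodata$IID,
  snp1 = c(0, 0, 1, 0, 0, 0),         # rare variant, no missing calls
  snp2 = c(0, NA, NA, 0, 1, 0),       # 2 of 6 genotypes missing
  snp3 = c(2, 1, 1, 2, 1, 0)          # common variant
)

# Sanity checks mirroring the documented requirements:
# same number of rows, same ID order, non-negative integer phenotype.
stopifnot(
  nrow(phenodata) == nrow(genedata),
  identical(as.character(phenodata$FID), as.character(genedata$FID)),
  identical(as.character(phenodata$IID), as.character(genedata$IID)),
  all(phenodata$count >= 0),
  all(phenodata$count == round(phenodata$count))
)

# Genotype matrix without the two ID columns.
geno <- as.matrix(genedata[, -(1:2)])

# Per-SNP missing rate and minor allele frequency (MAF).
missing_rate <- colMeans(is.na(geno))
af  <- colMeans(geno, na.rm = TRUE) / 2  # frequency of the counted allele
maf <- pmin(af, 1 - af)                  # fold to the minor allele

# Assumed filter semantics: drop SNPs above either cutoff.
missing_cutoff <- 0.15
max_maf <- 0.2
keep <- missing_rate <= missing_cutoff & maf <= max_maf
kept_snps <- colnames(geno)[keep]
print(kept_snps)

# Commonly used forms of the weighting schemes (assumed; the package may differ).
w_equal <- rep(1, sum(keep))
w_mb    <- 1 / sqrt(maf[keep] * (1 - maf[keep]))  # Madsen-Browning-style weight
w_beta  <- dbeta(maf[keep], 1, 25)                # Beta(1, 25) density weight
```

In this toy data only snp1 passes both filters: snp2 exceeds the missing-rate cutoff and snp3 exceeds the MAF cutoff.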
a list of 10 items including the name of the gene, the number of rare variants in the genetic region, the method used for modeling, the individual p-values of the gene‐based association tests (burden test and kernel test for both parameters), and the combined p-values obtained with the Cauchy combination test.
GeneName |
the name of the gene. |
No.Var |
the number of rare variants in the gene. |
Method |
the method used to compute the p-values. |
p.value_pi_burden |
single p-value for parameter π using the burden test |
p.value_lambda_burden / p.value_mu_burden |
single p-value for parameter λ or μ using the burden test |
p.value_pi_kernel |
single p-value for parameter π using the kernel test |
p.value_lambda_kernel / p.value_mu_kernel |
single p-value for parameter λ or μ using the kernel test |
p.value_pi_combined |
combined p-value for testing parameter π |
p.value_lambda_combined / p.value_mu_combined |
combined p-value for testing parameter λ or μ |
p.value_overall |
Combined p-value of testing the overall association using Cauchy combination test. |
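The combined p-values listed above rely on the Cauchy combination test. As a minimal sketch of the equal-weight form of that test, the snippet below combines two p-values by hand. The function name `cauchy_combine` and the example p-values are hypothetical, and the package's own implementation may differ, for instance in its weighting or in how it handles very small p-values.

```r
# Equal-weight Cauchy combination of a vector of p-values (illustrative only).
cauchy_combine <- function(p) {
  stopifnot(all(p > 0 & p < 1))
  # Map each p-value to a standard Cauchy variate and average them.
  t_stat <- mean(tan((0.5 - p) * pi))
  # The combined p-value is the upper tail of the standard Cauchy distribution.
  pcauchy(t_stat, lower.tail = FALSE)
}

# Hypothetical burden and kernel p-values for one parameter.
p_burden <- 0.01
p_kernel <- 0.20
p_combined <- cauchy_combine(c(p_burden, p_kernel))
print(p_combined)
```

A single p-value passed through this function is returned unchanged, which is a quick way to check the transform.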
Fan, Q., Sun, S., & Li, Y.‐J. (2021). Precisely modeling zero‐inflated count phenotype for rare variants. Genetic Epidemiology, 1–14.
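# Load the example phenotype and genotype data sets
data(Ex1_phenodata)
data(Ex1_genedata)
# ZINB-based test with Beta weights, keeping only variants with MAF <= 0.02
zimfrv(Ex1_phenodata, Ex1_genedata, weights = "Beta", max_maf = 0.02, model = "zinb")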