zimfrv: zimfrv


zimfrv R Documentation

zimfrv

Description

Gene‐based association tests to model zero-inflated count data

This function performs gene‐based association tests between a set of SNPs/genes and zero-inflated count data using one of three frameworks: ZIP regression, ZINB regression, or the two-stage SKAT model.

Usage

zimfrv(
  phenodata,
  genedata,
  genename = "NA",
  weights = "Equal",
  missing_cutoff = 0.15,
  max_maf = 1,
  model = "zip"
)

Arguments

phenodata

a data frame containing family and individual IDs for all individuals, zero-inflated counts as the phenotype, and a set of covariates. Each row represents a different individual. The first two columns are Family ID (FID) and Individual ID (IID), respectively. The third column must contain exactly one phenotype, which must be zero-inflated count data, i.e. non-negative integers such as neuritic plaque counts. Each remaining column represents a different covariate, e.g. age, sex, etc.

genedata

a data frame containing family and individual IDs for all individuals as well as numeric genotype data. Each row represents a different individual. The first two columns are Family ID (FID) and Individual ID (IID), respectively. Each remaining column represents a separate gene/SNP marker. Genotypes should be coded as 0, 1, 2 and NA for AA, Aa, aa and missing, respectively. The Family ID (FID) and Individual ID (IID) of each row in 'genedata', derived from the PLINK formatted files, should be in the same order as in 'phenodata', and the number of rows in 'genedata' should equal the number of rows in 'phenodata'.
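The layout requirements for 'phenodata' and 'genedata' can be illustrated with a small base R sketch. The column names other than FID and IID, the toy values, and the checks themselves are hypothetical and are not part of the package; they only mirror the requirements stated above.

```r
# Hypothetical toy phenotype table: FID, IID, zero-inflated count, then covariates
n <- 6
phenodata <- data.frame(
  FID   = paste0("F", 1:n),
  IID   = paste0("I", 1:n),
  count = c(0L, 0L, 3L, 0L, 7L, 1L),   # phenotype must sit in the third column
  age   = c(71, 80, 77, 69, 85, 74),
  sex   = c(1, 2, 2, 1, 1, 2)
)

# Hypothetical toy genotype table: FID, IID, then one column per SNP (0/1/2/NA)
genedata <- data.frame(
  FID  = phenodata$FID,
  IID  = phenodata$IID,
  snp1 = c(0, 1, 0, NA, 0, 0),
  snp2 = c(0, 0, 2, 0, 1, 0)
)

# Sanity checks mirroring the documented requirements (not package code)
pheno <- phenodata[[3]]
stopifnot(
  nrow(genedata) == nrow(phenodata),        # same number of individuals
  identical(genedata$FID, phenodata$FID),   # same order of family IDs
  identical(genedata$IID, phenodata$IID),   # same order of individual IDs
  all(pheno >= 0), all(pheno == round(pheno)),
  all(unlist(genedata[-(1:2)]) %in% c(0, 1, 2, NA))
)
```

Running such checks before the analysis catches row-order mismatches early, which would otherwise pair genotypes with the wrong phenotypes.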

genename

a character string giving the name of a gene, e.g. "CETP" (default="NA"). The name is case-sensitive.

weights

a character string specifying the variant weighting scheme (default="Equal"). "Equal" applies no weighting, "MadsenBrowning" applies the Madsen and Browning (2009) weight, and "Beta" applies the Beta weight.
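As a rough illustration of how such weighting schemes differ, the sketch below computes three sets of weights from minor allele frequencies. The exact formulas used inside the package are not stated on this page, so the forms here are assumptions: a simplified Madsen-Browning-style weight 1/sqrt(MAF(1 - MAF)) and a Beta(1, 25) density weight, both common conventions in the rare-variant literature. The MAF values are hypothetical.

```r
# Hypothetical minor allele frequencies for three variants
maf <- c(0.005, 0.01, 0.04)

# Assumed conventions; not confirmed against the ZIM4rv source
w_equal <- rep(1, length(maf))          # "Equal": every variant weighted the same
w_mb    <- 1 / sqrt(maf * (1 - maf))    # simplified Madsen-Browning-style weight
w_beta  <- dbeta(maf, 1, 25)            # Beta(1, 25) density evaluated at the MAF

# Both non-equal schemes give rarer variants larger weights
data.frame(maf, w_equal, w_mb, w_beta)
```

Under these assumed forms, the rarest variant receives the largest weight in both non-equal schemes, which is the usual motivation for weighting in rare-variant tests.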

missing_cutoff

a cutoff for the SNP missing rate (default=0.15). SNPs with a missing rate higher than the cutoff are excluded from the analysis.
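A minimal sketch of this kind of filter, assuming the missing rate of a SNP is the fraction of individuals with an NA genotype. The toy genotype columns are hypothetical, and the package may compute the rate differently.

```r
# Hypothetical genotype columns (FID/IID columns omitted for brevity)
geno <- data.frame(
  snp1 = c(0, NA, 1, 0),
  snp2 = c(0, 0, 0, 2),
  snp3 = c(NA, NA, 0, 1)
)

missing_cutoff <- 0.15                      # the documented default

# Fraction of missing genotypes per SNP
miss_rate <- colMeans(is.na(geno))

# Keep SNPs whose missing rate does not exceed the cutoff
geno_kept <- geno[, miss_rate <= missing_cutoff, drop = FALSE]
names(geno_kept)
```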

max_maf

the maximum minor allele frequency (MAF) allowed (default=1, i.e. no cutoff). SNPs with MAF greater than the cutoff are excluded from the analysis.
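The MAF filter can be sketched as follows, assuming the 0/1/2 coding counts copies of one allele so that the allele frequency is half the mean genotype and the MAF is the smaller of that frequency and its complement. The toy data and the cutoff of 0.1 are hypothetical; the package's internal computation may differ.

```r
# Hypothetical genotype columns for ten individuals
geno <- data.frame(
  snp1 = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0),
  snp2 = c(2, 2, 1, 2, 2, 2, 2, 2, 2, 2),
  snp3 = c(0, 1, 1, 0, 2, 0, 1, NA, 0, 1)
)

# Frequency of the counted allele, ignoring missing genotypes
af <- colMeans(geno, na.rm = TRUE) / 2

# Minor allele frequency: the smaller of the two allele frequencies
maf <- pmin(af, 1 - af)

max_maf <- 0.1                              # hypothetical cutoff for rare variants
geno_rare <- geno[, maf <= max_maf, drop = FALSE]
names(geno_rare)
```

Taking the minimum of the two allele frequencies means a SNP such as snp2, where the counted allele is the common one, is still recognised as rare.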

model

a character string specifying the zero-inflated count model family (default="zip"). "zip" selects the Zero-Inflated Poisson model, "zinb" selects the Zero-Inflated Negative Binomial model, and "skat" selects the two-stage Sequence Kernel Association Test method.
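To make the idea of zero inflation concrete, the sketch below simulates counts from a zero-inflated Poisson distribution in its standard textbook form: with probability pi0 an observation is a structural zero, otherwise it is drawn from a Poisson distribution with mean lambda. The parameter values are hypothetical, and whether the package uses exactly this parameterization is an assumption.

```r
set.seed(42)

n      <- 1000
pi0    <- 0.4    # hypothetical probability of a structural zero
lambda <- 3      # hypothetical Poisson mean for the count component

# Draw the zero-inflation indicator, then the count for non-structural cases
structural_zero <- rbinom(n, 1, pi0)
y <- ifelse(structural_zero == 1, 0L, rpois(n, lambda))

# Theoretical probability of a zero under ZIP versus plain Poisson
p0_zip  <- pi0 + (1 - pi0) * exp(-lambda)
p0_pois <- exp(-lambda)

# Compare the observed zero fraction with both theoretical values
c(observed = mean(y == 0), zip = p0_zip, poisson = p0_pois)
```

The gap between the ZIP and plain Poisson zero probabilities is what a zero-inflated model is designed to capture.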

Value

a list of 10 items: the gene name, the number of rare variants in the genetic region, the method used for modeling, the individual p-values of the gene‐based association tests (burden test and kernel test for both parameters), and the combined p-values from the Cauchy combination test.

GeneName

the name of the gene.

No.Var

the number of rare variants in the gene.

Method

the method used to compute the p-values.

p.value_pi_burden

the p-value for parameter \pi from the burden test.

p.value_lambda_burden / p.value_mu_burden

the p-value for parameter \lambda or \mu from the burden test.

p.value_pi_kernel

the p-value for parameter \pi from the kernel test.

p.value_lambda_kernel / p.value_mu_kernel

the p-value for parameter \lambda or \mu from the kernel test.

p.value_pi_combined

the combined p-value for parameter \pi, obtained by combining the burden and kernel tests with the Cauchy combination test.

p.value_lambda_combined / p.value_mu_combined

the combined p-value for parameter \lambda or \mu, obtained by combining the burden and kernel tests with the Cauchy combination test.

p.value_overall

the combined p-value for the overall association, obtained with the Cauchy combination test.
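The Cauchy combination test referred to in the combined p-values can be sketched in a few lines of base R. The helper name `cauchy_combine`, the equal default weights, and the input p-values are hypothetical; which individual p-values the package combines, and with what weights, is not stated on this page.

```r
# Cauchy combination of p-values; weights must be non-negative and sum to 1
cauchy_combine <- function(p, w = rep(1 / length(p), length(p))) {
  stopifnot(
    length(p) == length(w),
    all(p > 0 & p < 1),
    all(w >= 0),
    abs(sum(w) - 1) < 1e-8
  )
  # Transform each p-value to a standard Cauchy variate and average
  t_stat <- sum(w * tan((0.5 - p) * pi))
  # Upper-tail probability of the standard Cauchy distribution
  0.5 - atan(t_stat) / pi
}

# Hypothetical burden and kernel p-values for one parameter
p_burden <- 0.03
p_kernel <- 0.20
p_combined <- cauchy_combine(c(p_burden, p_kernel))
p_combined
```

A convenient property of this combination is that it needs only the individual p-values, not the correlation between the underlying tests.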

References

Fan, Q., Sun, S., & Li, Y.‐J. (2021). Precisely modeling zero‐inflated count phenotype for rare variants. Genetic Epidemiology, 1–14.

Examples

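library(ZIM4rv)

# Load the example phenotype and genotype data
data(Ex1_phenodata)
data(Ex1_genedata)

# ZINB model with Beta weights, restricted to variants with MAF <= 0.02
zimfrv(Ex1_phenodata, Ex1_genedata, weights = "Beta", max_maf = 0.02, model = "zinb")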


ZIM4rv documentation built on April 3, 2025, 7:39 p.m.