Hierarchical Bayesian multiple regression model incorporating genotype uncertainty (HBMR) for binary traits

Description

The function implements HBMR for binary traits using a Gibbs sampler with a probit link function.
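To illustrate what a Gibbs sampler with a probit link looks like, the sketch below implements the data-augmentation scheme of Albert & Chib (1993) for a plain probit regression. It is not the HBMR sampler itself: the simulated data, the flat prior on the coefficients, and the names draw_latent, draws and post_mean are assumptions made for this example only.

```r
set.seed(1)
n <- 200
X <- cbind(1, rnorm(n))                      # intercept + one simulated predictor
beta_true <- c(-0.5, 1)                      # values used only to simulate data
y <- rbinom(n, 1, pnorm(drop(X %*% beta_true)))

# Draw latent z ~ N(mu, 1), truncated to z > 0 when y = 1 and z <= 0 when y = 0,
# using the inverse-CDF method on the standard normal error term.
draw_latent <- function(mu, y) {
  cut <- pnorm(-mu)                          # CDF of the error at the cut point z = 0
  lo <- ifelse(y == 1, cut, 0)
  hi <- ifelse(y == 1, 1, cut)
  mu + qnorm(runif(length(mu), lo, hi))
}

n_iter <- 2000
burnin <- 500
V <- solve(crossprod(X))                     # conditional covariance of beta (flat prior)
L <- t(chol(V))                              # lower factor with L %*% t(L) = V
beta <- rep(0, ncol(X))
draws <- matrix(NA_real_, n_iter, ncol(X))

for (t in seq_len(n_iter)) {
  z <- draw_latent(drop(X %*% beta), y)      # step 1: latent variables given beta
  m <- V %*% crossprod(X, z)                 # conditional mean of beta given z
  beta <- drop(m + L %*% rnorm(ncol(X)))     # step 2: beta given latent variables
  draws[t, ] <- beta
}

post_mean <- colMeans(draws[-seq_len(burnin), , drop = FALSE])
```

Conditional on the latent variables, the probit model reduces to a linear regression with unit error variance, which is why both steps are simple draws from known distributions.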

Usage

hbmr_bin(pheno, geno, qi = matrix(), fam = 0, kin = matrix(), iter = 10000, burnin = 500,
         gq = 20, imp = 0.1, cov = matrix(), maf = c(), pa = 1.3, pb = 0.04)

Arguments

pheno

A phenotypic vector (N x 1). Each trait value must be 0 or 1.

geno

An N x K genotypic data matrix, where N is the number of subjects and K is the number of rare variants. Genotypes must use dominant coding, i.e. 0 or 1. Enter 0 for imputed genotypes.

qi

An optional N x K genotype quality matrix, where N is the number of subjects and K is the number of rare variants. For a sequenced genotype, the entry must be an integer >=1 giving its GQ score from the VCF file. For an imputed genotype, the entry must be a value <1 giving its expected genotypic value under the dominant coding.
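As an illustration of how geno and qi fit together, the following sketch builds a small pair of matrices with two imputed entries. The data are simulated, and the choice of which entries are imputed, the GQ range and the names imputed and is_imputed are assumptions for this example.

```r
set.seed(42)
n <- 6   # subjects
k <- 3   # rare variants

# Dominant-coded genotypes (0/1) and GQ scores for sequenced calls.
geno <- matrix(rbinom(n * k, 1, 0.1), n, k)
qi <- matrix(as.numeric(sample(5:99, n * k, replace = TRUE)), n, k)

# Suppose (subject 2, variant 1) and (subject 5, variant 3) were imputed.
imputed <- cbind(c(2, 5), c(1, 3))            # (row, column) index pairs
is_imputed <- matrix(FALSE, n, k)
is_imputed[imputed] <- TRUE

qi[imputed] <- c(0.35, 0.02)                  # expected genotypic values, all < 1
geno[imputed] <- 0                            # imputed genotypes are entered as 0
```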

fam

Set fam=1 for family samples. In this case, a relatedness matrix must be supplied through kin. The default value is 0.

kin

When fam=1, kin is an N x N relatedness matrix. Its entries are twice the kinship coefficients, i.e. on the same scale as in coxme.
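A minimal sketch of a matrix on this scale, assuming a hypothetical sample of three unrelated families with two non-inbred full siblings each: the kinship coefficient is 0.25 between full siblings and 0.5 for an individual with themselves, so the doubled entries are 0.5 and 1.

```r
n_fam <- 3

# Relatedness block for one pair of full siblings (twice the kinship coefficients).
sib_block <- matrix(c(1.0, 0.5,
                      0.5, 1.0), nrow = 2, byrow = TRUE)

# Unrelated families give a block-diagonal N x N matrix, here with N = 6.
kin <- kronecker(diag(n_fam), sib_block)
```

For real pedigrees the kinship coefficients would normally come from pedigree software and then be doubled before being passed as kin.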

iter

The number of MCMC iterations. The default value is 10000.

burnin

The number of burn-in iterations. The default value is 500.

gq

A cutoff for the GQ score (λ_Q). It should be a positive integer. If not specified, the default value is 20. See the reference for more details.

imp

A cutoff for imputed genotypes (λ_I). It should be a real number in (0,1). If not specified, the default value is 0.1. See the reference for more details.

cov

An optional N x M covariate data matrix, where N is the number of subjects and M is the number of covariates.

maf

An optional vector (K x 1) of minor allele frequencies. If not specified, the MAF will be estimated from the genotype data.
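If you want to supply maf yourself from dominant-coded data, one possible approach is sketched below. It assumes Hardy-Weinberg equilibrium, under which the carrier frequency c and the minor allele frequency q satisfy c = 1 - (1 - q)^2. This is an assumption of the example and is not necessarily the estimator the function uses internally; the toy matrix is made up.

```r
# Toy dominant-coded genotypes: 4 subjects (rows) x 3 variants (columns).
geno <- matrix(c(0, 0, 1, 0,
                 0, 0, 0, 0,
                 1, 0, 0, 1), nrow = 4)

carrier_freq <- colMeans(geno)        # proportion of carriers per variant
maf <- 1 - sqrt(1 - carrier_freq)     # invert c = 1 - (1 - q)^2 (HWE assumption)
```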

pa

The positive hyper-parameter a of the gamma distribution in the Bayesian shrinkage prior. The default value is 1.3.

pb

The positive hyper-parameter b of the gamma distribution in the Bayesian shrinkage prior. The default value is 0.04.

Value

BF

The Bayes factor of δ=1 vs. δ=0

BF_RB

The BF estimated using Rao-Blackwellization

p_upper

For a BF larger than 2, p_upper is an upper bound on the p value corresponding to the BF, based on the relation BF < -1/(e*p*log(p)). The exact p value, which is smaller than p_upper, can be obtained through permutations.
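The bound can be inverted numerically. The sketch below finds the p in (0, 1/e) at which -1/(e*p*log(p)) equals a given BF; because the right-hand side decreases in p on that interval, any p value compatible with the BF must lie below this root. The function names p_upper_from_bf and bf_bound are hypothetical, and this is not necessarily how the package computes p_upper.

```r
# Right-hand side of the relation BF < -1 / (e * p * log(p)), valid for 0 < p < 1/e.
bf_bound <- function(p) -1 / (exp(1) * p * log(p))

# Solve bf_bound(p) = bf for p, working with x = log(p) for numerical stability:
# log(bf_bound(p)) = -1 - x - log(-x).
p_upper_from_bf <- function(bf) {
  stopifnot(is.numeric(bf), length(bf) == 1, bf > 1)
  g <- function(x) -1 - x - log(-x) - log(bf)
  x <- uniroot(g, lower = -700, upper = -1, tol = 1e-12)$root
  exp(x)
}

p_up <- p_upper_from_bf(10)   # upper bound on the p value for BF = 10
```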

mean

The posterior mean of β_0

var

The inverse of the posterior mean of the precision 1/σ

est_geno

The number of genotypes whose uncertainty is considered in the estimation

var_ran

The estimated variance of the random effect for family design

rv_mean_es

The means of the posterior of γ for the K RVs

rv_sd_es

The standard deviations of the posterior of γ for the K RVs

mean_cov

The posterior means of the coefficients for the M covariates

Author(s)

Liang He

References

He, L., Pitkäniemi, J., Sarin, A. P., Salomaa, V., Sillanpää, M. J., & Ripatti, S. (2015). Hierarchical Bayesian Model for Rare Variant Association Analysis Integrating Genotype Uncertainty in Human Sequence Data. Genetic Epidemiology, 39(2), 89-100.

Albert, J. H., & Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association, 88(422), 669-679.

Examples

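# Load the example data set shipped with the package.
data(hbmr_bin_data)
# Family design (fam = 1) on the first 500 subjects and first 3 variants,
# with a short chain to keep the run time low.
hbmr_bin(hbmr_bin_data$pheno[1:500], hbmr_bin_data$geno[1:500, 1:3], fam = 1,
         kin = hbmr_bin_data$kin[1:500, 1:500], iter = 800, burnin = 200)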