LBL: Logistic Bayesian Lasso for Detecting Rare (and Common)...



Description

LBL is a Bayesian LASSO method, based on an MCMC algorithm, developed to detect association between common or rare haplotypes and a dichotomous disease phenotype. This function handles independent case-control study designs; for other study designs, see famLBL and cLBL. It takes data in standard pedigree format as input, containing each individual's genotypes, phenotype, and familial relationships. Missing observations are not allowed, so subjects with missing data are removed. The function returns an object containing posterior samples after the burn-in period.

Usage

LBL(data.cac, baseline = "missing", a = 15, b = 15,
  start.beta = 0.01, lambda = 1, D = 0, seed = NULL,
  burn.in = 10000, num.it = 40000, summary = T, e = 0.1,
  ci.level = 0.95)

Arguments

data.cac

Input data. data.cac should be either a data frame or a matrix with n rows and 6+2*p columns, where n is the total number of cases and controls and p is the number of SNPs. The data should be in standard pedigree format, with the first 6 columns giving the family ID, individual ID, father ID, mother ID, sex, and affection status. The remaining 2*p columns are genotype data in allelic format, with each allele of a SNP taking up one column. An example can be found in this package under the name "cac". For more information about the format, type "?cac" into R, or see the "Linkage Format" section of https://www.broadinstitute.org/haploview/input-file-formats. Note that because these are independent case-control data, the father ID and mother ID are missing (coded as 0) and each individual has a unique family ID.
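As an illustration of the layout described above, the following sketch builds a toy data set with p = 2 SNPs and n = 4 unrelated individuals. The column names, the genotype values, and the affection-status coding are all assumptions made for illustration; see ?cac for the coding the package actually expects.

```r
# Toy data in pedigree (linkage) layout: 6 leading columns + 2 columns per SNP.
# All values below are made up for illustration.
p <- 2  # number of SNPs
toy <- data.frame(
  fid    = 1:4,            # unique family ID per individual
  iid    = rep(1, 4),      # individual ID
  father = rep(0, 4),      # 0 = missing (unrelated individuals)
  mother = rep(0, 4),      # 0 = missing
  sex    = c(1, 2, 1, 2),
  status = c(2, 2, 1, 1),  # affection status (coding assumed here)
  snp1.a1 = c(0, 1, 0, 0), snp1.a2 = c(1, 1, 0, 1),
  snp2.a1 = c(1, 0, 0, 1), snp2.a2 = c(1, 1, 0, 0)
)
stopifnot(ncol(toy) == 6 + 2 * p)   # expected column count
stopifnot(!anyDuplicated(toy$fid))  # each individual has its own family ID
```

A quick check of this kind before calling LBL can catch a wrong column count or repeated family IDs early.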

baseline

Haplotype to be used for baseline coding; default is the most frequent haplotype according to the initial haplotype frequency estimates. This argument should be a character string, starting with an h and followed by the alleles at each marker locus. For example, if the desired baseline haplotype is 0 1 1 0 0, then baseline should be coded as "h01100".
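The baseline label can be assembled from a vector of alleles rather than typed by hand. This is a small base R sketch; the variable names are illustrative only.

```r
# Build the baseline label from a vector of alleles (one per marker locus).
alleles  <- c(0, 1, 1, 0, 0)
baseline <- paste0("h", paste(alleles, collapse = ""))
baseline  # "h01100"
```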

a

First hyperparameter of the prior for the regression coefficients, β. The prior variance of β is 2/λ^2, and λ has a Gamma(a,b) prior. The Gamma prior is parameterized so that its mean and variance are a/b and a/b^2, respectively. The default value of a is 15.

b

Second hyperparameter of the Gamma(a,b) distribution described above; default is 15.
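To see what the default hyperparameters imply, the moments stated above can be computed directly and checked by simulation. This sketch uses base R's rgamma with shape = a and rate = b, which matches the stated mean a/b and variance a/b^2; the helper beta.var is a hypothetical name introduced here.

```r
a <- 15; b <- 15
prior.mean <- a / b     # mean of Gamma(a, b) in the rate parameterization
prior.var  <- a / b^2   # variance of Gamma(a, b)

# Monte Carlo check with base R's gamma sampler
set.seed(1)
lam <- rgamma(1e5, shape = a, rate = b)
c(mean(lam), var(lam))  # should be close to prior.mean and prior.var

# Implied prior variance of beta for a given lambda (2 / lambda^2)
beta.var <- function(lambda) 2 / lambda^2
beta.var(1)
```

With the defaults, the prior on λ is centered at 1 with a fairly small spread, so increasing a and b together tightens the prior around the same mean.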

start.beta

Starting value of all regression coefficients, β; default is 0.01.

lambda

Starting value of the λ parameter described above; default is 1.

D

Starting value of the D parameter, which is the within-population inbreeding coefficient; default is 0.

seed

Seed to be used for the MCMC in Bayesian Lasso; default is a random seed. To reproduce results exactly, fix seed to the same number in each run.

burn.in

Burn-in period of the MCMC sampling scheme; default is 10000.

num.it

Total number of MCMC iterations including burn-in; default is 40000.
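Because num.it includes the burn-in, the number of retained draws is the difference between the two settings, which is also the stated length of the lambda component in the Value section. A minimal arithmetic sketch with the default values (n.kept is an illustrative name):

```r
burn.in <- 10000
num.it  <- 40000
n.kept  <- num.it - burn.in  # draws retained after burn-in
n.kept                       # 30000
```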

summary

Logical. If TRUE, LBL returns a summary of the analysis in the form of a list. If FALSE, LBL returns the posterior samples for all parameters. Default is TRUE.

e

A (small) number ε in the null hypothesis of no association, H_0: |β| ≤ ε. The default is 0.1. Changing e from the default of 0.1 may necessitate choosing a different threshold for Bayes Factor (one of the outputs) to infer association. Only used if summary = TRUE.
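The role of ε can be illustrated with a generic Bayes factor computed as posterior odds over prior odds for the alternative |β| > ε. This is a sketch only and not necessarily how LBL_summary computes its Bayes factor. It assumes that β given λ is double exponential with variance 2/λ^2 and that λ follows Gamma(a, rate = b), and it uses simulated stand-in draws rather than LBL output.

```r
# Stand-in for posterior draws of one coefficient; NOT output from LBL.
set.seed(42)
beta.draws <- rnorm(30000, mean = 0.4, sd = 0.15)

e <- 0.1; a <- 15; b <- 15

# Posterior probability of the alternative, |beta| > e
post.alt <- mean(abs(beta.draws) > e)

# Prior probability of |beta| > e under the assumptions above:
# P(|beta| > e) = E[exp(-lambda * e)] = (b / (b + e))^a
prior.alt <- (b / (b + e))^a

# Bayes factor = posterior odds / prior odds (alternative vs. null)
bf <- (post.alt / (1 - post.alt)) / (prior.alt / (1 - prior.alt))
bf
```

Under these assumptions a larger e lowers the prior probability of the alternative, which changes the prior odds and therefore the scale of the Bayes factor. This is one way to see why a different threshold may be needed when e is changed.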

ci.level

Credible level: the posterior probability that the true value of β lies within the credible interval. Default is 0.95, which corresponds to a 95% posterior credible interval. Only used if summary = TRUE.
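For intuition, an equal-tailed credible interval at level ci.level can be read off posterior draws with quantile. This sketch uses simulated stand-in draws, and the equal-tailed construction is an assumption; the interval reported by the package may be constructed differently.

```r
# Stand-in for posterior draws of one coefficient; NOT output from LBL.
set.seed(7)
beta.draws <- rnorm(30000, mean = 0.4, sd = 0.15)

ci.level <- 0.95
alpha <- 1 - ci.level

# Equal-tailed interval: cut alpha/2 of the draws from each tail
ci <- quantile(beta.draws, probs = c(alpha / 2, 1 - alpha / 2))
ci
```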

Value

If summary = FALSE, return a list with the following components:

haplotypes

The list of haplotypes used in the analysis. The last column is the reference haplotype.

beta

Posterior samples of betas stored in a matrix.

lambda

A vector of (num.it-burn.in) posterior samples of lambda.

freq

Posterior samples of the frequencies of haplotypes stored in a matrix format, in the same order as haplotypes.

init.freq

The haplotype distribution used to initiate the MCMC.

If summary = TRUE, return the result of LBL_summary. For details, see the description of the LBL_summary function.
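For the summary = FALSE case, the returned components can be post-processed with ordinary matrix and vector tools. The sketch below uses a mock list that borrows two of the component names above; the values are simulated, and the layout of beta (one row per retained iteration, one column per coefficient) is an assumption that should be checked against the actual object.

```r
# Mock object mimicking two documented component names; NOT output from LBL.
set.seed(3)
n.kept <- 1000
mock <- list(
  # assumed layout: rows = retained iterations, columns = coefficients
  beta   = matrix(rnorm(n.kept * 3, mean = c(0, 0.5, -0.2)),
                  ncol = 3, byrow = TRUE),
  lambda = rgamma(n.kept, shape = 15, rate = 15)
)

colMeans(mock$beta)  # posterior mean of each coefficient
mean(mock$lambda)    # posterior mean of lambda
```

With a real fit, the same calls would be applied to the object returned by LBL(data.cac, summary = FALSE).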

See Also

famLBL, cLBL, LBL_summary, print_LBL_summary, LBL-package.

Examples

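 library(LBL)                # load the package
 data(cac)                   # example case-control data shipped with the package
 cac.obj <- LBL(cac)         # run with default settings (summary = TRUE)
 cac.obj                     # inspect the returned summary object
 print_LBL_summary(cac.obj)  # print a formatted summary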

mxw010/LBL documentation built on Sept. 26, 2021, 3:44 a.m.