runGenphen: Genetic association analysis using Bayesian inference and...


View source: R/genphen.R

Description

Given a set of genotypes (single nucleotide polymorphisms - SNPs; or single amino acid polymorphisms - SAAPs) for a set of individuals, and a corresponding set of phenotypes, genphen quantifies the association between each genotype and phenotype using Bayesian inference and statistical learning.

Usage

runGenphen(genotype, phenotype, phenotype.type, model.type,
           mcmc.chains, mcmc.steps, mcmc.warmup, cores, 
           hdi.level, stat.learn.method, cv.steps, ...)

Arguments

genotype

Character matrix, data frame, or vector containing SNPs/SAAPs as columns; alternatively, a Biostrings DNAMultipleAlignment or AAMultipleAlignment object.

phenotype

Numerical vector (for a single phenotype) or matrix with multiple phenotypes stored as columns.

phenotype.type

Character vector giving the type of each phenotype in the phenotype input: 'Q' for quantitative or 'D' for dichotomous.

model.type

Type of Bayesian model: 'univariate' or 'hierarchical'.

mcmc.chains

Number of MCMC chains (default = 2).

mcmc.steps

Length of MCMC chains (default = 1,000).

mcmc.warmup

Length of adaptive part of MCMC chains (default = 500).

cores

Number of cores to use (default = 1).

hdi.level

Probability mass of the highest density interval (HDI) (default = 0.95).

stat.learn.method

Statistical learning method used in the analysis: random forest ('rf') or support vector machine ('svm'). Select 'none' to skip statistical learning.

cv.steps

Number of cross-validation steps (default = 1,000).

...

Optional parameters: adapt_delta, a Stan configuration parameter (default = 0.9); max_treedepth, a Stan configuration parameter (default = 10); ntree, the number of random forest trees to grow, used only when stat.learn.method = 'rf' (default = 1000); cv.fold, the cross-validation fold (default = 0.66).
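The hdi.level argument sets how much posterior probability each reported HDI should contain. As a rough illustration of what such an interval is, the base R sketch below finds the narrowest interval covering a given fraction of a set of posterior draws. The helper hdi_from_samples() and the simulated draws are assumptions made for this example; they are not part of genphen and may differ from how the package computes its intervals.

```r
# hdi_from_samples() is a hypothetical helper, not a genphen function.
hdi_from_samples <- function(samples, level = 0.95) {
  sorted <- sort(samples)
  n <- length(sorted)
  # Number of consecutive sorted draws that the interval must contain.
  k <- ceiling(level * n)
  # Width of every candidate interval spanning k consecutive sorted draws.
  starts <- seq_len(n - k + 1)
  widths <- sorted[starts + k - 1] - sorted[starts]
  # The HDI is approximated by the narrowest candidate.
  best <- which.min(widths)
  c(low = sorted[best], high = sorted[best + k - 1])
}

# Simulated draws standing in for the posterior of a slope coefficient.
set.seed(1)
draws <- rnorm(4000, mean = 0.5, sd = 1)
hdi <- hdi_from_samples(draws, level = 0.95)
print(hdi)
```

For a symmetric, unimodal posterior this interval is close to the equal-tailed interval; for skewed posteriors the two can differ noticeably.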

Details

Input:

Metrics: To quantify the association between each genotype and phenotype genphen computes multiple measures of association:

Value

General parameters:

site

id of the site (e.g. position in the provided sequence alignment)

ref, alt

reference and alternative genotype

refN, altN

count of ref and alt genotypes

phenotype.id

Identifier of the studied phenotype

Association scores:

beta.mean, beta.se, beta.sd, beta.hdi.low/beta.hdi.high

Estimates of the mean, standard error, standard deviation and HDI of the slope coefficient

ca.mean, ca.hdi.low/ca.hdi.high

Classification accuracy (CA) estimate and HDI

kappa.mean, kappa.hdi.low/kappa.hdi.high

Cohen's kappa and HDI

rank

Pareto optimization-based front (rank) of each SNP/SAAP, estimated by maximizing the metrics beta.mean and kappa.mean
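To make the rank column more concrete, here is a minimal base R sketch of Pareto front ranking over two metrics that are both maximized. It assumes a toy table with invented values and a hypothetical helper, pareto_rank(); genphen's internal ranking procedure may differ in its details.

```r
# Toy scores with invented values, using the column names described above.
scores <- data.frame(
  site = 1:5,
  beta.mean = c(2.0, 1.5, 0.5, 1.0, 0.2),
  kappa.mean = c(0.3, 0.8, 0.9, 0.2, 0.1)
)

# pareto_rank() is a hypothetical helper, not a genphen function.
pareto_rank <- function(x, y) {
  rank <- rep(NA_integer_, length(x))
  remaining <- seq_along(x)
  front <- 1L
  while (length(remaining) > 0) {
    # A point is dominated if another remaining point is at least as good
    # on both metrics and strictly better on at least one of them.
    dominated <- vapply(remaining, function(i) {
      any(x[remaining] >= x[i] & y[remaining] >= y[i] &
            (x[remaining] > x[i] | y[remaining] > y[i]))
    }, logical(1))
    # Non-dominated points form the current front; peel them off and repeat.
    rank[remaining[!dominated]] <- front
    remaining <- remaining[dominated]
    front <- front + 1L
  }
  rank
}

scores$rank <- pareto_rank(scores$beta.mean, scores$kappa.mean)
print(scores)
```

Sites on front 1 are those for which no other site is better on both metrics at once, so they are natural first candidates for follow-up.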

MCMC convergence parameters:

Neff

Effective sample size

Rhat

Potential scale reduction factor
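One way to use these two columns is to flag results whose chains may not have converged before interpreting the association scores. The sketch below works on a toy table with invented values; the table layout and the thresholds rhat.max and neff.min are assumptions chosen for illustration, not values recommended by genphen.

```r
# Toy table with invented values, using the column names described above.
scores <- data.frame(
  site = c(80, 81, 82),
  Neff = c(1850, 120, 2400),
  Rhat = c(1.00, 1.12, 1.01)
)

# Assumed thresholds; adjust them to the needs of the analysis.
rhat.max <- 1.05
neff.min <- 400

# A row passes only if Rhat is close to 1 and enough effective draws exist.
scores$converged <- scores$Rhat <= rhat.max & scores$Neff >= neff.min

# Rows that fail the check are candidates for a rerun with longer chains.
flagged <- scores[!scores$converged, ]
print(flagged)
```

Rows flagged this way could be rerun with larger mcmc.steps or mcmc.warmup values.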

Posterior predictions:

ppc

Posterior predictive check and summary of the real data for each genotype.

Posterior summary:

complete.posterior

Complete Stan object containing the posterior of each parameter estimated during Bayesian inference. It can be used for model debugging, posterior predictive checks, etc.

Author(s)

Simo Kitanovski <simo.kitanovski@uni-due.de>

See Also

runDiagnostics, runPhyloBiasCheck

Examples

library(genphen)

# genotypes:
data(genotype.saap)
# quantitative phenotype:
data(phenotype.saap)
# dichotomous phenotype:
data(dichotomous.phenotype.saap)
# make phenotype matrix (column = phenotype)
phenotypes <- cbind(phenotype.saap, dichotomous.phenotype.saap)

# run genphen
out <- runGenphen(genotype = genotype.saap[, 80:82],
                  phenotype = phenotypes,
                  phenotype.type = c("Q", "D"),
                  model.type = "univariate",
                  mcmc.chains = 4,
                  mcmc.steps = 1500,
                  mcmc.warmup = 500,
                  cores = 2,
                  hdi.level = 0.95,
                  stat.learn.method = "rf",
                  cv.steps = 200)

genphen documentation built on Nov. 8, 2020, 5:03 p.m.