gtm: Grouped Tissue Model
In anthony-aylward/asepirinen: Assessing allele-specific expression across multiple tissues from RNA-seq read data


Gibbs sampler to classify tissues into three (or two) groups.

gtm(y, pr.beta = c(2000, 2000, 36, 12, 80, 1), pr.intv = rep(NA, 6),
  pr.p0 = 0.75, pr.dist = NULL, niter = 2000, burnin = 10,
  two.sided = TRUE, independent = FALSE, model.strong.ase = TRUE,
  group.distance = c(1, 1, 0.5))

`y`	m x 2 matrix of read counts for the two alleles in 'm' tissues (use rownames to identify tissues)
`pr.beta`	numeric, specifications for beta prior distributions
`pr.intv`	vector, the three or two intervals (depending on 'model.strong.ase') on which priors for thetas are truncated. If NA, then priors are not truncated.
`pr.p0`	numeric, total prior probability of the homogeneous combined configurations C1, C2 and C3 (see Details). If 'model.strong.ase == TRUE' then each of these three configurations has the same prior probability of 'pr.p0 / 3'. If 'model.strong.ase == FALSE' then configurations C1 and C2 have the same prior probability of 'pr.p0 / 2'.
`pr.dist`	numeric vector of prior probabilities assigned to each set of combined configurations at a fixed distance (>0) from homogeneity.
`niter`	integer, the number of Gibbs sampling iterations
`burnin`	integer, number of initial iterations discarded; these are not included in niter (so the sampler runs niter+burnin iterations in total)
`two.sided`	logical, if TRUE (default) then the prior for theta in each group is an equal mixture of a Beta distribution and the same distribution with its two parameters swapped; if FALSE then a single Beta distribution is used (see Details)
`independent`	logical, if TRUE then each tissue has its own theta, independent of the other tissues (given the priors above); if FALSE (default) then all tissues in the same group have the same theta
`model.strong.ase`	logical, if TRUE then the tissues can come from three groups, if FALSE from only two
`group.distance`	numeric, a vector of length 3 giving the distances between groups G0 & G1, G0 & G2 and G1 & G2, respectively. These are used to estimate the distances between tissues when 'model.strong.ase == TRUE'. If 'model.strong.ase == FALSE', then 'group.distance' is ignored and the distance between groups G0 and G1 is 1.
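A minimal sketch of a call, assuming the package is installed under the name asepirinen and exports gtm(). The tissue names and read counts are made up for illustration; only the argument names and the 'indiv.posteriors' element come from this page.

```r
# Toy read counts (hypothetical): rows are tissues, columns are the two alleles.
y <- matrix(c(50, 48,
              30, 10,
              99,  1),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("liver", "lung", "brain"),
                            c("allele1", "allele2")))

# Only call gtm() when the package is available (assumption: it is exported).
if (requireNamespace("asepirinen", quietly = TRUE)) {
  res <- asepirinen::gtm(y, niter = 2000, burnin = 10)
  # Most probable group per tissue: 0 = G0 (NOASE), 1 = G1 (MODASE), 2 = G2 (SNGASE)
  print(apply(res$indiv.posteriors, 2, which.max) - 1)
}
```

Row names on 'y' are carried as tissue identifiers, as suggested in the description of 'y'.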

Matti Pirinen 1-Apr-2014, 29-Dec-2014

The default is that we use three groups to specify groups for NOASE, MODASE and SNGASE:
G0, NOASE: No allele specific expression (ASE), theta is close to 0.5
G1, MODASE: Moderate ASE, theta is different from 0.5 and different from extreme values of 0 and 1
G2, SNGASE: Strong ASE, theta is extreme, near 0 or 1
Priors for allele frequencies in the groups are given by (truncated) Beta distributions explained below. It is possible to restrict the model to only two groups: NOASE and MODASE.

The combined configuration is classified into 6 states:
C1=NOASE (all tissues in G0)
C2=MODASE (all tissues in G1)
C3=SNGASE (all tissues in G2)
C4=HET0 (heterogeneous with at least one tissue in G0)
C5=HET1 (heterogeneous with no tissue in G0)
C6=TIS_SPE (tissue specific: one tissue is in a different group than all the rest, which are in the same group)
States C1,...,C5 are non-overlapping and exhaustive, while C6 overlaps with them: every C6 configuration is heterogeneous and therefore also belongs to C4 or C5. Prior probabilities for states are determined by 'pr.p0' and 'pr.dist' explained below. If only two groups are used, then only combined configurations C1, C2, C4 and C6 are possible.
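The state definitions can be sketched as a small classifier. This is an illustration only, not the package's code: the function name classify_config is hypothetical, and group labels are assumed to be coded 0, 1, 2 for G0, G1, G2.

```r
# Map a vector of group labels (one per tissue) to the state names it belongs to.
classify_config <- function(cfg) {
  groups <- unique(cfg)
  # Homogeneous configurations: C1, C2 or C3 depending on the shared group
  if (length(groups) == 1) {
    return(c("NOASE", "MODASE", "SNGASE")[groups + 1])
  }
  # Heterogeneous: C4 if any tissue is in G0, otherwise C5
  states <- if (0 %in% cfg) "HET0" else "HET1"
  # C6: exactly one tissue differs from all the others, which share a group
  if (length(cfg) - max(table(cfg)) == 1) {
    states <- c(states, "TIS_SPE")
  }
  states
}

classify_config(c(0, 1, 1))  # heterogeneous with a G0 tissue, one tissue differs
classify_config(c(0, 1, 2))  # heterogeneous with a G0 tissue, not tissue specific
```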

If model.strong.ase==TRUE then the tissues can come from three groups. theta is the frequency of the allele in column 1 of y.

If two.sided is FALSE then
G0: theta~Beta(pr.beta[1],pr.beta[2])*I(pr.intv[1],pr.intv[2]) or if pr.beta[1]==pr.beta[2]==NULL then theta=0.5 (point mass at 0.5)
G1: theta~Beta(pr.beta[3],pr.beta[4])*I(pr.intv[3],pr.intv[4])
G2: theta~Beta(pr.beta[5],pr.beta[6])*I(pr.intv[5],pr.intv[6])

If two.sided is TRUE then
G0: theta~0.5*Beta(pr.beta[1],pr.beta[2])*I(pr.intv[1],pr.intv[2])+0.5*Beta(pr.beta[2],pr.beta[1])*I(pr.intv[2],pr.intv[1]) or if pr.beta[1]==pr.beta[2]==NULL then theta=0.5
G1: theta~0.5*Beta(pr.beta[3],pr.beta[4])*I(pr.intv[3],pr.intv[4])+0.5*Beta(pr.beta[4],pr.beta[3])*I(pr.intv[4],pr.intv[3])
G2: theta~0.5*Beta(pr.beta[5],pr.beta[6])*I(pr.intv[5],pr.intv[6])+0.5*Beta(pr.beta[6],pr.beta[5])*I(pr.intv[6],pr.intv[5])

If model.strong.ase==FALSE then the tissues come from only two groups: G0 and G1. In this case 'pr.beta' and 'pr.intv' can be of length 4, or if they are of length 6 then they are truncated to subvector 1:4.
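To get a feel for the two-sided priors, the following sketch evaluates the mixture densities for the default 'pr.beta' without truncation (that is, with 'pr.intv' left as NA). The helper name two_sided_density is hypothetical and the truncation indicator I(.,.) is deliberately left out, so this is an approximation of the model's priors rather than the package's implementation.

```r
# Two-sided prior without truncation: an equal mixture of Beta(a, b)
# and the same distribution with its parameters swapped, Beta(b, a).
two_sided_density <- function(theta, a, b) {
  0.5 * dbeta(theta, a, b) + 0.5 * dbeta(theta, b, a)
}

pr.beta <- c(2000, 2000, 36, 12, 80, 1)  # defaults from the usage line
theta <- seq(0.01, 0.99, by = 0.01)

# One row of density values per group
dens <- rbind(
  G0 = two_sided_density(theta, pr.beta[1], pr.beta[2]),
  G1 = two_sided_density(theta, pr.beta[3], pr.beta[4]),
  G2 = two_sided_density(theta, pr.beta[5], pr.beta[6])
)

# Sanity check: the G1 mixture is a proper density on (0, 1)
g1_mass <- integrate(two_sided_density, 0, 1,
                     a = pr.beta[3], b = pr.beta[4])$value
```

With these defaults the G0 density is concentrated around 0.5, the G1 density has modes on either side of 0.5, and the G2 density piles up near 0 and 1, matching the NOASE, MODASE and SNGASE descriptions.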

The values of 'pr.dist' are interpreted as relative probabilities and normalised to sum up to (1-pr.p0). The distance of a configuration is the smallest number of tissues whose labels need to be changed to turn the configuration into a homogeneous one. (E.g. (0,1,1) and (2,1,2) have d=1 and (0,1,2) has d=2.) Each configuration (with distance > 0) will have a prior probability pr.dist[d]/(No. of configurations with distance=d). Thus two configs with the same distance are equally probable a priori.

If model.strong.ase==TRUE then 'pr.dist' should have a length of m-ceiling(m/3). If model.strong.ase==FALSE then 'pr.dist' can have a length of m-ceiling(m/2), or if it has length m-ceiling(m/3) then the last elements are ignored. If pr.dist==NULL, then each distance is given the same prior probability (1-pr.p0)/(m-ceiling(m/3)) or (1-pr.p0)/(m-ceiling(m/2)), depending on whether 'model.strong.ase' is TRUE or FALSE, respectively.

Note that 'pr.p0' is the sum of prior probabilities assigned to the configurations with distance 0 from homogeneity and it is given as a separate parameter, not as part of the 'pr.dist' vector.
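The distance and prior rules can be checked by brute force for a small number of tissues. The sketch below enumerates all three-group configurations for m = 3 with pr.dist left at its NULL default; the helper name config_distance and the variable names are hypothetical, and group labels are assumed to be coded 0, 1, 2.

```r
# Distance from homogeneity: tissues outside the most common group
config_distance <- function(cfg) length(cfg) - max(table(cfg))

m <- 3
pr.p0 <- 0.75

# All 3^m label vectors, one configuration per row
configs <- as.matrix(expand.grid(rep(list(0:2), m)))
d <- apply(configs, 1, config_distance)

# Largest possible distance and the default (pr.dist = NULL) mass per distance
max_d <- m - ceiling(m / 3)
pr.dist <- rep((1 - pr.p0) / max_d, max_d)

# Number of configurations at each distance 0, 1, ..., max_d
n_by_d <- as.vector(table(factor(d, levels = 0:max_d)))

# Prior per configuration: pr.p0 / 3 at d = 0, pr.dist[d] / count otherwise
per_config <- c(pr.p0 / 3, pr.dist / n_by_d[-1])
prior <- per_config[d + 1]

sum(prior)  # the priors over all configurations add up to 1
```

For m = 3 there are 3 homogeneous configurations, 18 at distance 1 and 6 at distance 2, so the maximum distance is m-ceiling(m/3) = 2 as stated above.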

log10bfs configurations:
'NOASE' for C1, NOASE, this is always 0
'TOPHET' for configuration 'top.het.model'
'MODASE' for C2, MODASE (all in G1)
'SNGASE' for C3, SNGASE (all in G2)
'HET0' for C4, HET0 (at least one tissue in G0 and one tissue not in G0)
'HET1' for C5, HET1 (none of the tissues in G0 and at least one tissue in G1 and one in G2)
'TIS_SPE' for C6, TIS_SPE (one tissue is different from the others which are in the same group, i.e. configs that have d=1)

parameters: a list of input parameters
indiv.posteriors: 3 x m matrix for posterior probabilities of each tissue (col) belonging to each group (rows). row 1 is for G0 (NOASE), row 2 is for G1 (MODASE) and row 3 is for G2 (SNGASE)
distances: an m x m matrix of posterior mean distances between any two tissues, where the distance between two tissues in different groups is as given by 'group.distance'
top.het.model: the vector of group labels for (a best guess of) the configuration that has maximum likelihood
log10bfs: log10 of Bayes factors against the null model (=C1, all tissues in NOASE group G0)
state.posteriors: posterior probabilities for the states C1,...,C6, named as in 'log10bfs'
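As an illustration of how a 'distances' matrix of this kind can be formed, the sketch below averages pairwise group distances over a handful of made-up label draws. The draws, the variable names and the averaging loop are assumptions for illustration; the page does not show how the sampler itself accumulates this quantity.

```r
# Symmetric 3 x 3 lookup of distances between groups G0, G1, G2
group.distance <- c(1, 1, 0.5)  # G0 & G1, G0 & G2, G1 & G2
D <- matrix(0, 3, 3)
D[1, 2] <- D[2, 1] <- group.distance[1]
D[1, 3] <- D[3, 1] <- group.distance[2]
D[2, 3] <- D[3, 2] <- group.distance[3]

# Hypothetical sampled group labels: one row per iteration, one column per tissue
labels <- rbind(c(0, 0, 1),
                c(0, 1, 1),
                c(0, 0, 2),
                c(0, 0, 1))

m <- ncol(labels)
dist_mean <- matrix(0, m, m)
for (i in seq_len(nrow(labels))) {
  g <- labels[i, ] + 1             # shift labels 0..2 to row/column indices 1..3
  dist_mean <- dist_mean + D[g, g] # m x m matrix of pairwise distances for this draw
}
dist_mean <- dist_mean / nrow(labels)
dist_mean
```

Tissues that fall in the same group in every draw end up at distance 0, and tissues that are always in different groups end up at the corresponding 'group.distance' value.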

anthony-aylward/asepirinen documentation built on May 13, 2019, 11:29 a.m.
