dash: Dirichlet adaptive shrinkage of compositional data using dash

View source: R/dash.R

Description

Given a matrix of compositional count data, with samples along the rows and the categories of the composition along the columns, performs Bayesian adaptive shrinkage of the compositions to produce refined composition probabilities.

Usage

dash(comp_data, concentration = NULL, mode = NULL, optmethod = c("mixEM",
  "w_mixEM"), sample_weights = NULL, verbose = FALSE, bf = TRUE,
  pi_init = NULL, squarem_control = list(), dash_control = list(),
  reportcov = FALSE)

Arguments

comp_data

An n by m matrix of counts, where n is the number of samples and m is the number of categories in the composition.

concentration

A vector of concentration scales for the different Dirichlet compositions. Defaults to NULL, in which case the concentration values Inf, 100, 50, 20, 10, 5, 2, 1, 0.5 and 0.1 are used.

mode

A user-defined mode/mean for the Dirichlet components. Defaults to equal means for all categories.

optmethod

The method used to optimize the mixture proportions (grades of membership) of the different Dirichlet compositions. Can be either EM ("mixEM") or weighted EM ("w_mixEM").

sample_weights

The weights of the samples used in the optimization. Defaults to NULL, in which case every sample gets the same weight.

verbose

If TRUE, outputs messages tracking the progress of the method.

bf

A boolean (TRUE/FALSE) indicating whether the optimization uses the log Bayes factor (with respect to the category with the smallest representation) or the log-likelihood. Defaults to TRUE.

pi_init

A starting value for the mixture proportions. Defaults to the same proportion for all categories.

squarem_control

A list of control parameters for the SQUAREM/IP algorithm. The default is control.default = list(K = 1, method = 3, square = TRUE, step.min0 = 1, step.max0 = 1, mstep = 4, kr = 1, objfn.inc = 1, tol = 1.e-07, maxiter = 5000, trace = FALSE).

dash_control

A list of control parameters that determine the concentrations, the prior weights and the fdr control parameters for the dash function.

reportcov

A boolean indicating whether the user wants to return the covariance and correlation structure of the posterior. Defaults to FALSE.
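
As a rough illustration of how a partial control list can be combined with the defaults listed under squarem_control, the sketch below merges a user-supplied list into the default one. It assumes that user entries override defaults of the same name; whether dash merges its control lists in exactly this way is an assumption, and the names squarem_default and squarem_user are hypothetical.

```r
# Default SQUAREM control values, copied from the squarem_control entry above.
squarem_default <- list(K = 1, method = 3, square = TRUE,
                        step.min0 = 1, step.max0 = 1, mstep = 4, kr = 1,
                        objfn.inc = 1, tol = 1e-07, maxiter = 5000,
                        trace = FALSE)

# Hypothetical user input: only the entries to be changed are supplied.
squarem_user <- list(maxiter = 1000, tol = 1e-05)

# Assumption: entries in the user list replace defaults with the same name,
# and every other default is kept unchanged.
squarem_control <- utils::modifyList(squarem_default, squarem_user)

# Inspect two overridden entries and one untouched default.
str(squarem_control[c("maxiter", "tol", "method")])
```

Under this assumption, a call such as dash(comp_data, squarem_control = list(maxiter = 1000, tol = 1e-05)) would only need to name the entries that differ from the defaults.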

Details

The dash function provides a number of ways to perform Empirical Bayes shrinkage estimation on compositional data (counts).

The input to dash is a matrix of compositional counts with samples along the rows and categories along the columns. The method assumes that the counts for each sample are generated from an underlying composition probability vector, which follows a mixture of Dirichlet distributions centered at the user-defined mode (by default, the mode with equal means for all categories).
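
The generative assumption in this paragraph can be sketched in base R by simulating counts from a small mixture of Dirichlet distributions. This is only an illustration of the model, not code from the package: the helper rdirichlet_one, the three concentration values, the mixture proportions pi_mix and the sequencing depths are all hypothetical, and the Dirichlet parameter of each component is assumed to be the concentration times a mode vector of ones.

```r
set.seed(1)

# Draw one Dirichlet(alpha) vector by normalizing independent Gamma variates.
rdirichlet_one <- function(alpha) {
  g <- rgamma(length(alpha), shape = alpha, rate = 1)
  g / sum(g)
}

n <- 6                              # number of samples (rows)
m <- 4                              # number of categories (columns)
mode <- rep(1, m)                   # equal means for all categories
concentration <- c(Inf, 20, 1)      # hypothetical component concentrations
pi_mix <- c(0.5, 0.3, 0.2)          # hypothetical mixture proportions
depth <- c(10, 50, 100, 200, 20, 5) # hypothetical total counts per sample

sim_counts <- t(sapply(seq_len(n), function(i) {
  # Pick a mixture component for this sample.
  k <- sample(length(concentration), 1, prob = pi_mix)
  # Inf concentration is a point mass at the mode; otherwise draw from
  # Dirichlet(concentration * mode) (assumed parameterization).
  p <- if (is.infinite(concentration[k])) {
    mode / sum(mode)
  } else {
    rdirichlet_one(concentration[k] * mode)
  }
  # Counts are multinomial given the composition probabilities.
  as.vector(rmultinom(1, size = depth[i], prob = p))
}))

sim_counts
```

The resulting n by m matrix has the same layout as comp_data: one row per sample and one column per category.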

We assume that the component Dirichlet distributions in the mixture have varying degrees of concentration. The concentrations range from Inf (a point mass at the mode), through high and then low finite values, down to values less than 1, which represent spikes at the corners of the simplex.
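
To make the role of the concentration concrete, the sketch below computes the prior standard deviation of a single category probability under each default concentration value. It assumes a symmetric Dirichlet with m = 4 categories in which every category has parameter equal to the concentration (that is, the concentration times a mode vector of ones); this parameterization is an assumption made for illustration. For a symmetric Dirichlet with per-category parameter a, each category has mean 1/m and variance (1/m)(1 - 1/m)/(m a + 1).

```r
m <- 4
# Default concentration values listed under the concentration argument.
concentration <- c(Inf, 100, 50, 20, 10, 5, 2, 1, 0.5, 0.1)

# Assumed parameterization: every category has Dirichlet parameter a = concentration.
a <- concentration

prior_mean <- 1 / m
# Standard deviation of one category probability under Dirichlet(a, ..., a).
# For a = Inf the denominator is Inf, so the spread is exactly 0 (point mass).
prior_sd <- sqrt(prior_mean * (1 - prior_mean) / (m * a + 1))

data.frame(concentration, prior_sd = round(prior_sd, 3))
```

The spread is zero at Inf and grows as the concentration decreases, which is why low-concentration components allow compositions far from the mode.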

The grades of membership (mixture proportions) of the different Dirichlet components are estimated, and post-hoc measures such as the posterior mean, the posterior weights, and the posterior center and corner probabilities are computed. The posterior mean is taken as the shrunk compositional probability.
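
The posterior quantities mentioned here can be sketched for a single sample using standard Dirichlet-multinomial conjugacy. This is a minimal base R illustration under stated assumptions, not the package's implementation: the mixture proportions pi_hat are taken as given (hypothetical equal values rather than fitted ones), the component parameter is assumed to be the concentration times a mode vector of ones, and the helper names log_marginal and component_mean are hypothetical.

```r
x <- c(5, 0, 2, 0)                  # counts for one sample
m <- length(x)
mode <- rep(1, m)                   # equal means for all categories
concentration <- c(Inf, 10, 1, 0.1) # hypothetical subset of concentrations
K <- length(concentration)
pi_hat <- rep(1 / K, K)             # hypothetical mixture proportions

# Log marginal likelihood of x under one component, dropping the multinomial
# coefficient, which is the same for every component.
log_marginal <- function(x, conc, mode) {
  if (is.infinite(conc)) {
    # Point mass at the mode: plain multinomial likelihood.
    return(sum(x * log(mode / sum(mode))))
  }
  alpha <- conc * mode
  lgamma(sum(alpha)) - lgamma(sum(x) + sum(alpha)) +
    sum(lgamma(x + alpha) - lgamma(alpha))
}

# Posterior mean of the composition under one component.
component_mean <- function(x, conc, mode) {
  if (is.infinite(conc)) return(mode / sum(mode))
  alpha <- conc * mode
  (x + alpha) / sum(x + alpha)
}

# Posterior weights: prior proportion times marginal likelihood, normalized
# on the log scale for numerical stability.
loglik <- sapply(concentration, function(k) log_marginal(x, k, mode))
log_post <- log(pi_hat) + loglik
post_w <- exp(log_post - max(log_post))
post_w <- post_w / sum(post_w)

# Posterior mean: weighted average of the per-component posterior means.
comp_means <- sapply(concentration, function(k) component_mean(x, k, mode))
posmean <- as.vector(comp_means %*% post_w)

rbind(datamean = x / sum(x), posmean = posmean)
```

In this sketch the categories with zero observed counts receive a strictly positive shrunk probability, while the dominant category is pulled toward the mode.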

Value

A list, including the following:

fitted_pi: The fitted values of mixture proportions for Dirichlet components

concentration: The concentration scales of the Dirichlet compositions

prior: Prior strengths of Dirichlet components

posterior_weights: Posterior weights of each sample on each category posterior component

posmean: Posterior means of compositional probability from dash fit of each sample

datamean: Original compositional probability of each sample

poscov: Posterior covariance structure for each sample (if reportcov TRUE)

poscor: Posterior correlation structure for each sample (if reportcov TRUE)

center_prob_local: Posterior probability on Inf concentration Dirichlet component

center_prob: Posterior probability on Dirichlet components with concentration less than fdr_bound

corner_prob: Posterior probability on Dirichlet components with concentration less than 1

Examples

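library(dashr)

# Counts matrix: 8 samples (rows) by 4 categories (columns)
mat <- rbind(c(5, 0, 2, 0),
             c(1, 1, 0, 1),
             c(100, 100, 50, 100),
             c(20, 50, 100, 10),
             c(10, 10, 200, 20),
             c(50, 54, 58, 53),
             c(1, 1, 1, 3),
             c(2, 4, 1, 1))
# Fit with the EM optimizer
out <- dash(mat, optmethod = "mixEM", verbose = TRUE)
# Refit with the weighted EM optimizer (overwrites the previous fit)
out <- dash(mat, optmethod = "w_mixEM", verbose = TRUE)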

kkdey/dashr documentation built on May 3, 2019, 9:38 p.m.