FitGoMpool: Run Grade of Membership (GoM) model with multiple starting...


View source: R/FitGoMpool.R

Description

Fits the grade of membership model, FitGoM(), to count data from multiple starting points and chooses the best fit using BIC (Bayesian Information Criterion). Using multiple starting points makes the output more reliable.
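The multi-start idea can be illustrated in base R. The sketch below is not the package's internal code: the `objective` function is a hypothetical stand-in for a model fit, and the selection rule (lowest objective value) stands in for the BIC-based choice described above.

```r
# Hypothetical toy objective with several local minima (stand-in for a model fit).
objective <- function(x) sin(3 * x) + 0.1 * (x - 2)^2

# Run a local optimizer from several random starting points.
set.seed(1)
starts <- runif(5, min = -5, max = 5)
fits <- lapply(starts, function(s) optim(s, objective, method = "BFGS"))

# Keep the run with the best (lowest) objective value.
values <- vapply(fits, function(f) f$value, numeric(1))
best <- fits[[which.min(values)]]
```

A single run can stop in a poor local optimum; comparing several runs and keeping the best one reduces that risk.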

Usage

FitGoMpool(data, K, tol = 0.1, burn_trials = 10, options = c("BF",
  "BIC"), path_rda = NULL, control = list())

Arguments

data

An N x G matrix of counts, with the N samples along the rows and the G genes along the columns.
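As a small illustration of the expected orientation, the sketch below builds a made-up genes-by-samples matrix and transposes it to samples-by-genes. The object names and simulated values are hypothetical and only show the layout.

```r
set.seed(1)
# Hypothetical genes-by-samples matrix: 4 genes (rows) x 3 samples (columns).
genes_by_samples <- matrix(
  rpois(12, lambda = 5), nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)

# Transpose so that samples are along the rows and genes along the columns (N x G).
counts <- t(genes_by_samples)
dim(counts)
```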

K

The number of clusters or topics to be fitted. Must be a single integer, unlike in FitGoM(), so this function has to be applied separately for each K.
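Because K takes one value per call, fitting several values of K means calling the function once for each. A minimal sketch, assuming the CountClust package and its ex.counts example data are installed; the variable names `k_values` and `fits` are hypothetical, and the other argument values are copied from the example on this page.

```r
# Hypothetical set of cluster numbers to try, one call per value.
k_values <- 2:3

# Only run the fits if CountClust is available (assumption: it provides
# FitGoMpool() and the ex.counts data used in the example on this page).
if (requireNamespace("CountClust", quietly = TRUE)) {
  library(CountClust)
  data("ex.counts")
  fits <- lapply(k_values, function(k) {
    FitGoMpool(ex.counts, K = k, tol = 100, burn_trials = 3,
               control = list(tmax = 100))
  })
  names(fits) <- paste0("K", k_values)
}
```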

tol

Tolerance on the absolute increase in the GoM model log posterior between successive iterations (default 0.1).

burn_trials

The number of trials to run, each from a different starting point (default 10).

options

The measure used to choose the best fit: either "BF" or "BIC". BF is more trustworthy, but BIC can be used for better model comparison.

path_rda

The directory path for saving the GoM model output. If NULL, the output is returned to the console.

control

Control parameters, the same as for the topics() function of the maptpx package.

Value

Returns the best GoM model fit for K clusters and, if path_rda is provided, saves it at that directory path.

References

Matt Taddy. On Estimation and Selection for Topic Models. AISTATS 2012, JMLR W&CP 22.

Pritchard, Jonathan K., Matthew Stephens, and Peter Donnelly. Inference of population structure using multilocus genotype data. Genetics 155.2 (2000): 945-959.

Examples

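library(CountClust)
# Load the example counts data shipped with the package.
data("ex.counts")
# Fit K = 2 clusters from 3 starting points and keep the best fit.
out <- FitGoMpool(ex.counts, K=2, tol=100, burn_trials=3,
                   control=list(tmax=100))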

kkdey/CountClust documentation built on Jan. 17, 2021, 5:32 p.m.