tagmEM: Function to perform generalised t-augmented Gaussian mixture...


View source: R/tagmEM.R

Description

Function to perform generalised t-augmented Gaussian mixture modelling using expectation-maximisation

Usage

tagmEM(
  data,
  obs_names = NULL,
  max_iter = 5000,
  tol = 1e-05,
  cluster_sizes = NULL,
  sigk_thresh = 1e-05,
  junk_mixture = TRUE,
  df = 4,
  junk_mean = NULL,
  junk_sd = NULL,
  stop_bic_iter = 5,
  min_clust_search = 8,
  results_list = list("all", "best"),
  rand_sample = seq(0.05, 0.4, by = 0.05)
)

Arguments

data

numeric object of univariate observations

obs_names

character vector with one name per observation (length equal to the number of observations)

max_iter

numeric integer denoting the maximum number of iterations before stopping the EM algorithm's search for a maximum of the log-likelihood.

tol

numeric scalar denoting the convergence tolerance: the maximum absolute difference between two computations of the log-likelihood at which we accept that a maximum of the log-likelihood has been found.
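To make the roles of max_iter and tol concrete, here is a minimal sketch of an EM loop with this kind of stopping rule. It is not the package's implementation: it fits a plain two-component Gaussian mixture with no junk component, and the function name em_two_gauss, the starting values and the strict "<" comparison are all assumptions made for illustration.

```r
# Illustrative EM for a two-component univariate Gaussian mixture.
# Hypothetical helper, not part of tagmEM.
em_two_gauss <- function(x, max_iter = 5000, tol = 1e-05) {
  mu <- as.numeric(quantile(x, c(0.25, 0.75)))  # crude starting means
  sigma <- rep(stats::sd(x), 2)                 # common starting spread
  w <- c(0.5, 0.5)                              # mixing weights
  loglik_old <- -Inf
  for (iter in seq_len(max_iter)) {
    # E-step: weighted component densities and responsibilities
    dens <- cbind(w[1] * dnorm(x, mu[1], sigma[1]),
                  w[2] * dnorm(x, mu[2], sigma[2]))
    resp <- dens / rowSums(dens)
    loglik <- sum(log(rowSums(dens)))
    # Stop once the log-likelihood changes by less than tol
    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik
    # M-step: update weights, means and standard deviations
    n_k <- colSums(resp)
    w <- n_k / length(x)
    mu <- colSums(resp * x) / n_k
    sigma <- sqrt(colSums(resp * outer(x, mu, "-")^2) / n_k)
  }
  list(mu = mu, sigma = sigma, w = w, loglik = loglik, iterations = iter)
}

# Simulated (hypothetical) data with two well-separated groups
set.seed(1)
x <- c(rnorm(200, mean = -2, sd = 0.5), rnorm(200, mean = 3, sd = 0.5))
fit <- em_two_gauss(x)
```

A smaller tol demands a flatter log-likelihood before stopping and so usually costs more iterations; max_iter caps the loop if that flatness is never reached.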

cluster_sizes

integer between 0 and the number of data points N

sigk_thresh

lower bound on the estimated cluster standard deviation. NOTE: this avoids singular estimates of the standard deviation ('sd').
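One simple way to enforce such a lower bound is to floor each estimate at the threshold. The sketch below assumes that approach; the package may handle small estimates differently, and sd_hat is a hypothetical vector of per-cluster estimates.

```r
# Hypothetical per-cluster standard deviation estimates; the second has collapsed
sd_hat <- c(0.42, 3e-09, 1.10)
sigk_thresh <- 1e-05

# Floor every estimate at the threshold so no component becomes degenerate
sd_bounded <- pmax(sd_hat, sigk_thresh)
```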

junk_mixture

logical flag for the junk (generalised t) mixture component. Default is TRUE.

df

degrees of freedom of the generalised t-distribution. Default is 'df = 4'.

junk_mean

numeric scalar denoting the mean of the generalised t-distribution. By default the mean is set to zero.

junk_sd

numeric scalar denoting the scale parameter in the generalised t-distribution
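As a rough illustration of how df, junk_mean and junk_sd might enter the junk component, the sketch below writes out the density of a location-scale Student t-distribution using base R's dt(). Reading "generalised t-distribution" as the location-scale t is an assumption, and dgen_t with its argument names is a hypothetical helper, not a function from the package.

```r
# Density of a location-scale Student t-distribution (hypothetical helper).
# location and scale stand in for junk_mean and junk_sd.
dgen_t <- function(x, df = 4, location = 0, scale = 1) {
  dt((x - location) / scale, df = df) / scale
}

# With df = 4 the tails are much heavier than a standard normal's,
# which is what lets such a component absorb outlying observations.
tail_t <- dgen_t(5, df = 4)
tail_norm <- dnorm(5)
```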

stop_bic_iter

numeric integer I. For computational efficiency, particularly when analysing large numbers of variants, the EM algorithm can be stopped if the BIC is monotonically increasing over the previous I increases in the number of clusters K. By default, evidence supporting at least 10 clusters in the data is computed; so, for example, if the BIC from the models which assume 6, 7, ..., 10 clusters is monotonically increasing in the number of clusters K, then the EM algorithm is stopped and the model whose K minimises the BIC is returned.
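The stopping rule can be sketched as a check on the tail of the BIC sequence. This is one reading of the description, following its example in which the last I models (6 to 10 clusters for I = 5) are compared; the helper name bic_keeps_increasing and the BIC values are hypothetical.

```r
# TRUE when the BIC of the last n_iter models is strictly increasing in K.
# Hypothetical helper, not part of tagmEM.
bic_keeps_increasing <- function(bic, n_iter = 5) {
  if (length(bic) < n_iter) return(FALSE)
  all(diff(tail(bic, n_iter)) > 0)
}

# Made-up BIC values for models with K = 1, ..., 10 clusters
bic <- c(120, 95, 90, 92, 96, 97, 99, 103, 108, 115)

stop_search <- bic_keeps_increasing(bic, n_iter = 5)  # checks K = 6, ..., 10
best_k <- which.min(bic)                              # K that minimises the BIC
```

Under this reading the search stops after K = 10 and the K = 3 model is reported, since it has the smallest BIC among those fitted.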

min_clust_search

numeric integer denoting the minimum number of clusters searched for in the data. The default computes evidence supporting up to K = 10 clusters, which might explain any clustered heterogeneity in the data.

results_list

character list allowing users to choose whether to return a table with the variants assigned to "all" of the clusters, to a single "best" cluster, or both. By default both are returned, i.e. results_list = list("all", "best").

rand_sample

random probability of being assigned to the junk cluster

Value

Returned are estimates of the putative number of clusters in the sample, allocation probabilities, and summaries of the association estimates for each observation.


cnfoley/tagmEM documentation built on Feb. 1, 2021, 12:23 a.m.