TWLsample: Main function to obtain posterior samples from a TWL model.

Description

Main function to obtain posterior samples from a TWL model.

Usage

TWLsample(full_dat_mat, full_dat, alpha_re = 7, beta_re = 0.4,
  num_its = 5000, num_all_clus = 30, output_every = 20, manip = TRUE,
  sav_inter = FALSE)

Arguments

full_dat_mat

list of matrices of the different data types.

full_dat

list of data.tables with a single column labelled 'nam', denoting sample annotation. A consistent naming convention of samples must be used across data types.

alpha_re

Hyperparameter for the Dirichlet prior model within each data type, influencing the sparsity of clusterings. A smaller value encourages fewer clusters. Defaults to 7 and should be chosen as a function of sample size.

beta_re

Hyperparameter for the Dirichlet prior model across data types within each sample, influencing the degree to which each data type's sample cluster labels affect those of the other data types. Defaults to 0.4 and should be chosen as a function of the total number of data types being integrated in the analysis.

num_its

Number of iterations. Defaults to 5000.

num_all_clus

Ceiling on the number of clusters. Defaults to 30. Should be chosen as a multiple (for example, 5 times) of the maximum number of hypothesized clusters in the data types.

output_every

Frequency of sampling log statistics, reporting mixing, cluster distribution, and proportion of cluster sharing across data types. Defaults to once every 20 iterations.

manip

TRUE/FALSE for whether likelihood manipulation should be used to increase mixing in situations where cluster means are far from one another in Euclidean distance. This should not influence identified clusters nor parameters associated with them. Defaults to TRUE.

sav_inter

A logical indicating whether a temporary file of the samples should be written out in the working directory every 50 iterations. Allows for restarts when sampling is interrupted, and defaults to FALSE.
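
The input structure described by the arguments above can be sketched as follows. This is only an illustration under stated assumptions, not code from the package: the toy matrices and sample names are made up, and the layout of each matrix (rows as samples, in the same order as the 'nam' column) is an assumption that should be checked against the package's example data. The last lines apply the "factor of 5" guidance for num_all_clus to a hypothetical expectation of at most 6 clusters.

```r
library(data.table)

set.seed(1)

# Hypothetical sample names: the two data types overlap on sample3..sample6,
# and shared samples use identical names in both data types.
samples_a <- paste0("sample", 1:6)
samples_b <- paste0("sample", 3:8)

# Assumption: one row per sample, one column per feature.
mat_a <- matrix(rnorm(6 * 4), nrow = 6, dimnames = list(samples_a, NULL))
mat_b <- matrix(rnorm(6 * 3), nrow = 6, dimnames = list(samples_b, NULL))

# full_dat_mat: list of matrices, one per data type.
full_dat_mat <- list(mat_a, mat_b)

# full_dat: list of data.tables, each with a single column 'nam'.
full_dat <- list(data.table(nam = samples_a), data.table(nam = samples_b))

# Ceiling on clusters: 5 times a hypothetical maximum of 6 expected clusters.
hypothesized_max <- 6
num_all_clus <- 5 * hypothesized_max
```

With inputs shaped like this, a call would take the form TWLsample(full_dat_mat, full_dat, num_all_clus = num_all_clus), leaving the remaining arguments at their defaults.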

Value

A list of lists of data.tables. The length of the outer list is the number of iterations, and the length of each element is the number of data types. Each data.table has 2 columns: the sample annotation, called 'nam', and the cluster assignment, called 'clus'.
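
To make the nesting concrete, the sketch below builds a small mock object with the same shape as the described return value and indexes into it. The mock data, the burn-in of 2 iterations, and the helper name labels_dt2 are hypothetical, and the sketch assumes that samples appear in the same order in every iteration; real output would come from TWLsample itself.

```r
library(data.table)

set.seed(42)

# Hypothetical stand-in for TWLsample() output: 4 iterations, 2 data types.
nams <- list(c("s1", "s2", "s3"), c("s2", "s3", "s4"))
clus_save <- lapply(1:4, function(it) {
  lapply(nams, function(nm) {
    data.table(nam = nm, clus = sample(1:2, length(nm), replace = TRUE))
  })
})

length(clus_save)       # number of iterations
length(clus_save[[1]])  # number of data types

# Discard the first 2 iterations as burn-in (an arbitrary choice here).
burnin <- 2
kept <- clus_save[(burnin + 1):length(clus_save)]

# Cluster labels of data type 2: one row per sample, one column per
# retained iteration.
labels_dt2 <- sapply(kept, function(iter) iter[[2]]$clus)
rownames(labels_dt2) <- kept[[1]][[2]]$nam
```

The outer index selects the iteration and the inner index selects the data type, so clus_save[[i]][[j]] is the data.table of assignments for data type j at iteration i.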

Examples

data(data_and_output)
## Not run: 
clus_save <- TWLsample(misaligned_mat, misaligned, output_every = 50, num_its = 5000, manip = FALSE)
outpu_new <- pairwise_clus(clus_save, BURNIN = 2000)

## End(Not run)
post_analy_cor(outpu_new, c("title1", "title2", "title3", "title4", "title5"),
  tempfile(), ords = 'none')
clus_labs <- post_analy_clus(outpu_new, clus_save, c(2:6), rep(0.6, 5),
  c("title1", "title2", "title3", "title4", "title5"), tempfile())
output_nest <- cross_dat_analy(clus_save, 4900)

twl documentation built on May 2, 2019, 4:01 p.m.