View source: R/rep_biclustermd.R
Repeat a biclustering to achieve a minimum SSE solution
rep_biclustermd(
  data,
  nrep = 10,
  parallel = FALSE,
  ncores = 2,
  col_clusters = floor(sqrt(ncol(data))),
  row_clusters = floor(sqrt(nrow(data))),
  miss_val = mean(data, na.rm = TRUE),
  miss_val_sd = 1,
  similarity = "Rand",
  row_min_num = 5,
  col_min_num = 5,
  row_num_to_move = 1,
  col_num_to_move = 1,
  row_shuffles = 1,
  col_shuffles = 1,
  max.iter = 100
)
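As a quick illustration of the default expressions in the signature, the sketch below evaluates them on a made-up 12 x 30 matrix. The matrix `toy` and the `default_*` names are assumptions for illustration only and are not part of the package.

```r
# Hypothetical toy data: 12 rows, 30 columns, with some cells set to missing
set.seed(42)
toy <- matrix(rnorm(12 * 30), nrow = 12, ncol = 30)
toy[sample(length(toy), 40)] <- NA

# The same expressions the signature uses as defaults
default_col_clusters <- floor(sqrt(ncol(toy)))  # floor(sqrt(30)) = 5
default_row_clusters <- floor(sqrt(nrow(toy)))  # floor(sqrt(12)) = 3
default_miss_val <- mean(toy, na.rm = TRUE)     # mean of the observed cells only
```

Because the defaults depend only on the dimensions and observed values of `data`, it can be worth computing them up front to check that the implied number of clusters is sensible for the data at hand.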
data |
Dataset to bicluster. Must be a data matrix containing only numbers and missing values. It should have row names and column names. |
nrep |
The number of times to repeat the biclustering. Default 10. |
parallel |
Logical indicating whether the user would like to run the repeats in parallel. Default is FALSE. |
ncores |
The number of cores to use if parallel computing. Default 2. |
col_clusters |
The number of clusters to partition the columns into. |
row_clusters |
The number of clusters to partition the rows into. |
miss_val |
Value or function to put in empty cells of the prototype matrix. If a value, a random normal variable with sd = miss_val_sd is used. |
miss_val_sd |
Standard deviation of the normal distribution used when miss_val is a value. Default is 1. |
similarity |
The metric used to compare two successive clusterings. Can be "Rand" (default), "HA" for the Hubert and Arabie adjusted Rand index, or "Jaccard". See RRand for details. |
row_min_num |
Minimum row prototype size in order to be eligible to be chosen when filling an empty row prototype. Default is 5. |
col_min_num |
Minimum column prototype size in order to be eligible to be chosen when filling an empty column prototype. Default is 5. |
row_num_to_move |
Number of rows to remove from the sampled prototype to put in the empty row prototype. Default is 1. |
col_num_to_move |
Number of columns to remove from the sampled prototype to put in the empty column prototype. Default is 1. |
row_shuffles |
Number of times to shuffle rows in each iteration. Default is 1. |
col_shuffles |
Number of times to shuffle columns in each iteration. Default is 1. |
max.iter |
Maximum number of iterations to let the algorithm run. Default is 100. |
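To make the similarity argument more concrete, here is a minimal base-R sketch of the plain Rand index between two cluster assignments, such as the assignments from two successive iterations. The function `rand_index` and the vectors `prev` and `curr` are hypothetical names written for this illustration; this is not the routine the package calls internally.

```r
# Rand index: the share of item pairs on which two partitions agree,
# i.e. pairs grouped together in both or kept apart in both.
rand_index <- function(a, b) {
  stopifnot(length(a) == length(b), length(a) >= 2)
  same_a <- outer(a, a, "==")    # TRUE where items i and j share a cluster in a
  same_b <- outer(b, b, "==")    # TRUE where items i and j share a cluster in b
  pairs <- upper.tri(same_a)     # count each unordered pair once
  mean(same_a[pairs] == same_b[pairs])
}

prev <- c(1, 1, 2, 2, 3, 3)      # hypothetical assignment at one iteration
curr <- c(1, 1, 2, 3, 3, 3)      # hypothetical assignment at the next
rand_index(prev, curr)           # 12 of 15 pairs agree: 0.8
```

A value of 1 means the two clusterings are identical up to relabeling, which is why an index of this kind is a natural way to judge whether successive iterations have stopped changing.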
A list containing the minimum-SSE biclustering (best_bc), a vector with the final SSE of each repeat (rep_sse), and the time the function took to run (runtime).
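The "repeat and keep the minimum SSE" idea can be sketched with any randomly initialized clustering method. The example below uses stats::kmeans purely as a stand-in, on made-up data; it is an assumption-laden illustration of the selection pattern and not the package's implementation. The names `x`, `fits`, `rep_sse`, and `best` are chosen here for illustration.

```r
# Stand-in illustration: kmeans replaces the biclustering algorithm
set.seed(1)
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),   # hypothetical group 1
           matrix(rnorm(40, mean = 4), ncol = 2))   # hypothetical group 2

# Run the randomly initialized algorithm several times
fits <- lapply(seq_len(10), function(i) kmeans(x, centers = 2, nstart = 1))

# Record the final SSE of each repeat, then keep the fit with the smallest one
rep_sse <- vapply(fits, function(f) f$tot.withinss, numeric(1))
best <- fits[[which.min(rep_sse)]]
```

Keeping the whole vector of per-repeat SSE values, rather than only the best fit, makes it possible to see how much the solutions vary across random starts.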
Li, J., Reisner, J., Pham, H., Olafsson, S., and Vardeman, S. (2019) Biclustering for Missing Data. Information Sciences, Submitted
library(biclustermd)  # package that provides rep_biclustermd()
data("synthetic")

# 20 repeats without parallelization
repeat_bc <- rep_biclustermd(synthetic, nrep = 20,
                             col_clusters = 3, row_clusters = 2,
                             miss_val = mean(synthetic, na.rm = TRUE),
                             miss_val_sd = sd(synthetic, na.rm = TRUE),
                             col_min_num = 2, row_min_num = 2,
                             col_num_to_move = 1, row_num_to_move = 1,
                             max.iter = 10)
repeat_bc

# Plot the best biclustering and the final SSE of each repeat
autoplot(repeat_bc$best_bc)
plot(repeat_bc$rep_sse, type = 'b', pch = 20)
repeat_bc$runtime

# 20 repeats with parallelization over 2 cores
repeat_bc <- rep_biclustermd(synthetic, nrep = 20, parallel = TRUE, ncores = 2,
                             col_clusters = 3, row_clusters = 2,
                             miss_val = mean(synthetic, na.rm = TRUE),
                             miss_val_sd = sd(synthetic, na.rm = TRUE),
                             col_min_num = 2, row_min_num = 2,
                             col_num_to_move = 1, row_num_to_move = 1,
                             max.iter = 10)
repeat_bc$runtime