View source: R/kbranches-global.R
Clusters data on K-Branches (halflines) with a common center and calculates the corresponding GAP statistic
| input_dat: | data frame of input data with rows=samples and cols=dimensions. | 
| Kappa: | number of clusters (halflines) | 
| Dmat: | matrix containing sample distances | 
| init_Kmeans: | if TRUE: initialize directions v1,...,vk using K-Means. FALSE: use directions of randomly selected samples | 
| c0: | initial value for the center of all half-lines | 
| Vmat: | matrix whose K rows are the direction vectors | 
| nstart_GAP: | number of initializations for clustering when calculating the GAP statistic | 
| nstart_kmeans: | number of initializations for Kmeans (when using Kmeans to initialize khalflines) | 
| B_GAP: | number of bootstrap datasets used to compute the GAP statistic. If NULL (default), the GAP statistic is not computed | 
| fixed_center: | if not NULL, then K-halflines will run with the given center fixed | 
| medoids: | if TRUE, the medoids version of khalflines will be used (slower) | 
| silent: | set to FALSE to display messages (for debugging) | 
| silent_internal: | set to FALSE to display messages and plots of internal clustering functions (for debugging) | 
| show_plots: | if TRUE, the clustering result will be plotted | 
| show_lines: | if TRUE, show the halflines in the plot | 
| show_plots_GAP: | if TRUE, show the plots when performing clustering under the null distribution to calculate the GAP statistic (for debugging) | 
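The roles of c0 and Vmat can be illustrated with a minimal base R sketch of the assignment step: each sample is assigned to the half-line (starting at the common center c0, pointing along a row of Vmat) that lies closest to it. The helpers `halfline_sq_dist` and `assign_halflines` are hypothetical names introduced here for illustration and are not part of the package; the package's actual cost function and update steps may differ.

```r
# Hypothetical helper (not part of the package): squared Euclidean distance
# from each row of X to the half-line {c0 + t * v : t >= 0}.
halfline_sq_dist <- function(X, c0, v) {
  X <- as.matrix(X)
  v <- v / sqrt(sum(v^2))                       # normalise the direction
  centered <- sweep(X, 2, c0, "-")              # x - c0 for every sample
  t_proj <- pmax(as.vector(centered %*% v), 0)  # projection, clamped at the center
  closest <- outer(t_proj, v)                   # nearest point on the half-line (relative to c0)
  rowSums((centered - closest)^2)
}

# Hypothetical helper: assign every sample to its nearest half-line and
# report the summed squared distance as a clustering cost.
assign_halflines <- function(X, c0, Vmat) {
  d <- apply(Vmat, 1, function(v) halfline_sq_dist(X, c0, v))
  d <- matrix(d, nrow = nrow(as.matrix(X)))     # samples in rows, half-lines in columns
  list(cluster = max.col(-d, ties.method = "first"),
       err = sum(apply(d, 1, min)))
}

# Toy data: three samples, two half-lines starting at the origin
X <- rbind(c(2, 0.1), c(-0.2, 3), c(-1, 0.5))
c0 <- c(0, 0)
Vmat <- rbind(c(1, 0), c(0, 1))
res <- assign_halflines(X, c0, Vmat)
res$cluster
```

Clamping the projection at zero is what distinguishes a half-line from a full line: a sample lying "behind" the center is measured against the center itself rather than against the line's extension.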
a list with elements:
- cluster: cluster assignment for each sample (numeric)
- Kappa: number of clusters (halflines)
- err: total clustering cost
- iters: total iterations of the algorithm
- c0: position (row index in input_dat) of the center sample
- Vmat: positions (row indices in input_dat) of the direction samples
- clust_counts: number (count) of samples in each of the clusters
- all_clustering_errors: vector of total clustering error for each of the nstart different initializations
- all_clusterings: total results for each of the nstart different initializations
- GAP: value of the modified GAP statistic for the given Kappa
- GAPl: value of the modified GAP statistic for the given Kappa using the logarithm of the expected dispersion
- GAP_orig: value of the original GAP statistic for the given Kappa (using the logarithm of the expected dispersion)
- GAP_orig_no_log: value of the original GAP statistic for the given Kappa (without using the logarithm of the expected dispersion)
- GAP.sd: standard deviation of GAP
- GAPl.sd: standard deviation of GAPl
- GAP_orig.sd: standard deviation of GAP_orig
- GAP_orig_no_log.sd: standard deviation of GAP_orig_no_log
- call: function call
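To give an idea of what the GAP elements measure, the following is a rough base R sketch of the original GAP statistic of Tibshirani et al. (the log-dispersion variant). It is not the package's implementation: `stats::kmeans` stands in for the half-line clustering, the reference datasets are drawn uniformly over the bounding box of the data, and the function name `gap_sketch` and its arguments are assumptions made for this example. The modified GAP statistic returned by the package is computed differently.

```r
# Hypothetical sketch: original GAP statistic with kmeans as a stand-in clusterer.
gap_sketch <- function(X, K, B = 20, nstart = 5) {
  X <- as.matrix(X)
  # dispersion = total within-cluster sum of squares of the best of `nstart` runs
  dispersion <- function(Z) kmeans(Z, centers = K, nstart = nstart)$tot.withinss
  log_W <- log(dispersion(X))
  # B reference datasets sampled uniformly over the bounding box of X
  rng <- apply(X, 2, range)
  log_W_ref <- replicate(B, {
    Z <- apply(rng, 2, function(r) runif(nrow(X), r[1], r[2]))
    log(dispersion(Z))
  })
  list(gap = mean(log_W_ref) - log_W,           # expected log-dispersion minus observed
       s = sd(log_W_ref) * sqrt(1 + 1 / B))     # simulation error (uses the sample sd)
}

set.seed(1)
# Toy data: two well-separated groups of 30 samples in 2D
X <- rbind(matrix(rnorm(60, 0, 0.2), ncol = 2),
           matrix(rnorm(60, 3, 0.2), ncol = 2))
res_gap <- gap_sketch(X, K = 2)
res_gap
```

A larger gap means the data cluster more tightly for the given number of clusters than structureless reference data would; comparing the value across several Kappa is the usual way to choose the number of clusters.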
#cluster the 2D data on three halflines
set.seed(1)
#load the data
data(scdata.3lines.simulated6genes_subsampled)
raw_dat <- scdata.3lines.simulated6genes_subsampled
#perform diffusion map dimensionality reduction
dmap <- destiny::DiffusionMap(raw_dat, sigma = 1000)
#keep the first 2 diffusion components
input_dat <- destiny::as.data.frame(dmap)[, 1:2]
#cluster with K=3
clust <- kbranches.global(input_dat, Kappa = 3)
#plot the clustering results
plot(input_dat, pch=21, col=clust$cluster, bg=clust$cluster, main = 'K-Branch clustering')