View source: R/pipeline_functions.R
draw.clustComp | R Documentation |
draw.clustComp draws a table that shows each sample's observed label versus its predicted label.
Each row represents an observed label (e.g. one disease subgroup), and each column represents a predicted label created by a classification algorithm (e.g. K-means).
draw.clustComp(
  pred_label,
  obs_label,
  strategy = "ARI",
  use_col = TRUE,
  low_K = 5,
  highlight_clust = NULL,
  main = NULL,
  clust_cex = 1,
  outlier_cex = 0.3
)
pred_label | a character vector, the predicted labels created by classification (e.g. K-means). |
obs_label | a character vector, the observed labels annotated from the phenotype data. |
strategy | character, the method used to quantify the similarity between predicted and observed labels. Users can choose from "ARI" (adjusted Rand index), "NMI" (normalized mutual information) and "Jaccard". Default is "ARI". |
use_col | logical. If TRUE, the table will be colored: the more samples gathered in a table cell, the darker its shade. Default is TRUE. |
low_K | integer, the maximum number of samples whose names are listed in a single table cell. Listing many sample names in one cell makes the table hard to read, so if the number of samples in a cell exceeds this threshold, only the count is shown. Otherwise, all sample names are listed. Default is 5. |
highlight_clust | a character vector, the predicted labels to be highlighted in the figure. |
main | character, an overall title for the plot. |
clust_cex | numeric, text size for the predicted labels (column names). Default is 1. |
outlier_cex | numeric, text size for the observed labels (row names). Default is 0.3. |
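To make the default `strategy = "ARI"` more concrete, the following base-R sketch computes the adjusted Rand index from the contingency table of two label vectors using the standard textbook formula. The function name `adjusted_rand_index` is hypothetical and is not part of NetBID2; the package's internal computation may differ in detail.

```r
# Hypothetical helper (not a NetBID2 function): adjusted Rand index
# between two label vectors of equal length.
adjusted_rand_index <- function(pred_label, obs_label) {
  stopifnot(length(pred_label) == length(obs_label))

  # Contingency table: rows are predicted labels, columns are observed labels
  tab <- table(pred_label, obs_label)

  # Number of sample pairs that share a cell, a row, or a column
  sum_cells <- sum(choose(tab, 2))
  sum_rows  <- sum(choose(rowSums(tab), 2))
  sum_cols  <- sum(choose(colSums(tab), 2))

  # Expected value under random labelling, and the maximum attainable value
  expected  <- sum_rows * sum_cols / choose(sum(tab), 2)
  max_index <- (sum_rows + sum_cols) / 2

  # Note: the result is NaN when both vectors put all samples in one group
  (sum_cells - expected) / (max_index - expected)
}

# Two partitions that agree perfectly (label names need not match)
perfect <- adjusted_rand_index(c("1", "1", "2", "2"), c("A", "A", "B", "B"))

# Two partitions that split every group evenly
crossed <- adjusted_rand_index(c("1", "1", "2", "2"), c("A", "B", "A", "B"))
```

A value of 1 indicates identical partitions, values near 0 indicate agreement no better than chance, and negative values indicate less agreement than expected by chance.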
The table provides more details about the side-by-side PCA biplot created by draw.emb.kmeans.
The purpose is to find out whether any abnormal samples (outliers) exist. The darker a table cell is,
the more samples are gathered in the corresponding pair of labels.
Returns a matrix of integers and draws a table for visualization. In the returned matrix, rows are predicted labels and columns are observed labels. Each integer is the number of samples gathered in the corresponding pair of labels.
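The count matrix and the effect of `low_K` can be approximated in base R as sketched below. This is an illustration only, assuming both label vectors are named by sample; the helper `cell_contents` and the toy sample names are hypothetical and do not come from NetBID2.

```r
# Hypothetical helper (not a NetBID2 function): builds the count matrix
# and the text that would be shown in each table cell.
cell_contents <- function(pred_label, obs_label, low_K = 5) {
  stopifnot(length(pred_label) == length(obs_label),
            !is.null(names(pred_label)))

  # Rows are predicted labels, columns are observed labels
  counts <- table(pred_label = pred_label, obs_label = obs_label)

  text <- matrix("", nrow = nrow(counts), ncol = ncol(counts),
                 dimnames = dimnames(counts))
  for (p in rownames(counts)) {
    for (o in colnames(counts)) {
      samples <- names(pred_label)[pred_label == p & obs_label == o]
      # More samples than low_K: show only the count; otherwise list names
      text[p, o] <- if (length(samples) > low_K) {
        as.character(length(samples))
      } else {
        paste(samples, collapse = ", ")
      }
    }
  }
  list(counts = counts, text = text)
}

# Toy data: five samples, two predicted clusters, two observed subgroups
pred <- c(S1 = "1", S2 = "1", S3 = "1", S4 = "2", S5 = "2")
obs  <- c(S1 = "A", S2 = "A", S3 = "A", S4 = "A", S5 = "B")
res  <- cell_contents(pred, obs, low_K = 2)
```

With `low_K = 2`, the cell for predicted label "1" and observed label "A" holds three samples and is reduced to its count, while the single-sample cells keep their sample names. Sample S4 then stands out as a candidate outlier: it is observed as "A" but clustered with "2".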
library(NetBID2)

# Load the demo dataset saved at the 'exp-QC' step
network.par <- list()
network.par$out.dir.DATA <- system.file('demo1', 'network/DATA/', package = "NetBID2")
NetBID.loadRData(network.par = network.par, step = 'exp-QC')

# Expression matrix and phenotype table from the loaded eset
mat <- Biobase::exprs(network.par$net.eset)
phe <- Biobase::pData(network.par$net.eset)
intgroup <- 'subgroup'

# Predicted labels from draw.emb.kmeans
pred_label <- draw.emb.kmeans(mat = mat, all_k = NULL,
                              obs_label = get_obs_label(phe, intgroup),
                              kmeans_strategy = 'consensus')

# Compare predicted vs. observed labels, with and without cell coloring
draw.clustComp(pred_label, get_obs_label(phe, intgroup), outlier_cex = 1, low_K = 2, use_col = TRUE)
draw.clustComp(pred_label, get_obs_label(phe, intgroup), outlier_cex = 1, low_K = 2, use_col = FALSE)