R/1004-clusterStat.R

#' generate statistics on sizes of clusters
#' 
#' 'clusterStat' computes simple statistics on the sizes of clusters
#' generated by 'clusterCMP'. It returns a data frame that maps each cluster
#' size to the number of clusters of that size. It is often used along
#' with 'cluster.visualize'.
#' 
#' 'clusterStat' depends on the format returned by 'clusterCMP': it
#' treats the first column as the indices and column 'cluster.result * 2'
#' as the cluster sizes. With the default 'cluster.result = 1', only the
#' clustering result of the first cutoff is considered, even when
#' 'clusterCMP' was called with multiple cutoffs. To work on another cutoff,
#' set 'cluster.result' accordingly or manually reorder/remove columns.
#' 
#' @param cls  The clustering result returned by 'clusterCMP'.
#'
#' @param cluster.result  Index of the cutoff to consider when multiple cutoff
#'                        values were used in clustering (1 = first cutoff, the default).
#' 
#' @return A data frame with two columns, "cluster size" and "count".
#'         
#' @keywords clusterStat 
#'
#' @aliases clusterStat
#' 
#' @author Min-feng Zhu <\email{wind2zhu@@163.com}>
#' 
#' @export clusterStat
#' 
#' @references 
#'...
#' 
#' @examples
#' data(sdfbcl)
#' apbcl <- convSDFtoAP(sdfbcl)
#' 
#' cluster <- clusterCMP(db = apbcl, cutoff = c(0.65, 0.5))
#' clusterStat(cluster[, c(1, 2, 3)])
#' clusterStat(cluster[, c(1, 4, 5)])
#' 
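clusterStat <- function(cls, cluster.result = 1) {
    # column 2 * cluster.result holds the cluster size of each compound
    st <- data.frame(table(factor(cls[, cluster.result * 2])))
    # a cluster of size n contributes n rows, so divide the row count
    # by n to get the number of clusters of that size
    st[, 2] <- st[, 2] / as.numeric(as.vector(st[, 1]))
    names(st) <- c("cluster size", "count")
    return(st)
}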
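
The counting step can be checked on a small hand-made table. The sketch below is illustrative only: the toy data frame `cls` and its values are invented, its column layout (ids, then one size/ID column pair per cutoff) is an assumption about what 'clusterCMP' returns, and `size_stat` is a hypothetical local copy of the logic above so the example runs without BioMedR installed.

```r
# Toy stand-in for a clusterCMP result. Assumed layout: ids, then a
# (cluster size, cluster ID) column pair per cutoff. All values are made up.
cls <- data.frame(
  ids       = paste0("cmp", 1:6),
  CLSZ_0.65 = c(2, 2, 2, 2, 1, 1),
  CLID_0.65 = c(1, 1, 2, 2, 3, 4),
  CLSZ_0.5  = c(4, 4, 4, 4, 2, 2),
  CLID_0.5  = c(1, 1, 1, 1, 2, 2)
)

# Hypothetical local copy of the counting logic shown above.
size_stat <- function(cls, cluster.result = 1) {
  # tabulate how many compounds sit in a cluster of each size
  st <- data.frame(table(factor(cls[, cluster.result * 2])))
  # a cluster of size n is counted n times, so divide by n
  st[, 2] <- st[, 2] / as.numeric(as.vector(st[, 1]))
  names(st) <- c("cluster size", "count")
  st
}

first  <- size_stat(cls)                      # first cutoff (column 2)
second <- size_stat(cls, cluster.result = 2)  # second cutoff (column 4)
print(first)
print(second)
```

With these made-up values, the first cutoff has two singletons and two clusters of size 2, and the second cutoff has one cluster of size 2 and one of size 4.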
wind22zhu/BioMedR documentation built on Oct. 21, 2019, 12:51 p.m.