ClusteredSample-class: ClusteredSample: A class representing a clustered FC Sample
In flowMatch: Matching and meta-clustering in flow cytometry

Description Creating Object Slots Accessors Methods Author(s) See Also Examples

An object of class "ClusteredSample" represents a partitioning of a sample into clusters. We model a flow cytometry sample with a mixture of cell populations where a cell population is a normally distributed cluster. An object of class "ClusteredSample" therefore stores a list of clusters and other necessary parameters.

An object of class "ClusteredSample" can be created using the following constructor

ClusteredSample(labels, centers=list(), covs=list(), sample=NULL, sample.id=NA_integer_)

labels A vector of integers (from 1:num.clusters) indicating the cluster to which each point is allocated. This is usually obtained from a clustering algorithm.
centers A list of length num.clusters storing the centers of the clusters. The ith entry of the list centers[[i]] stores the center of the ith cluster. If not specified, the constructor estimates centers from sample.
covs A list of length num.clusters storing the covariance matrices of the clusters. The ith entry of the list cov[[i]] stores the covariance matrix of the ith cluster. If not specified, the constructor estimates cov from sample.
sample A matrix, data frame of observations, or object of class flowFrame. Rows correspond to observations and columns correspond to variables. It must be passed to the constructor if either centers or cov is unspecified; then centers or cov is estimated from sample.
sample.id The index of the sample (relative to other samples of a cohort).

An object of class "ClusteredSample" contains the following slots:

num.clusters: The number of clusters in the sample.
labels: A vector of integers (from range 1:num.clusters) indicating the cluster to which each point is assigned to. For example, labels[i]=j means that the ith element (cell) is assigned to the jth cluster.
dimension: Dimensionality of the sample (number of columns in data matrix).
clusters: A list of length num.clusters storing the cell populations. Each cluster is stored as an object of class Cluster.
size: Number of cells in the sample (summation of all cluster sizes).
sample.id: integer, denoting the index of the sample (relative to other samples of a cohort). Default is NA_integer_

All the slot accessor functions take an object of class ClusteredSample. I show usage of the first accessor function. Other functions can be called similarly.

get.size:

Returns the number of cells in the sample (summation of all cluster sizes).

Usage: get.size(object)

here object is a ClusteredSample object.

get.num.clusters

Returns the number of clusters in the sample.

get.labels

Returns the cluster labels for each cell. For example, labels[i]=j means that the ith element (cell) is assigned to the jth cluster.

get.dimension

Returns the dimensionality of the sample (number of columns in data matrix).

get.clusters

Returns the list of clusters in this sample. Each cluster is stored as an object of class Cluster.

get.sample.id

Returns the index of the sample (relative to other samples of a cohort).

show

Display details about the ClusteredSample object.

summary

Return descriptive summary for the ClusteredSample object.

Usage: summary(ClusteredSample)

plot

We plot a sample by bivariate scatter plots where different clusters are shown in different colors.

Usage:

plot(sample, ClusteredSample, ...)

the arguments of the plot function are:

sample: A matrix, data.frame or an object of class flowFrame representing an FC sample.
ClusteredSample: An object of class ClusteredSample storing the clustering of the sample.
... Other usual plotting related parameters.

Ariful Azad

Cluster

## ------------------------------------------------
## load data and retrieve a sample
## ------------------------------------------------

library(healthyFlowData)
data(hd)
sample = exprs(hd.flowSet[[1]])

## ------------------------------------------------
## cluster sample using kmeans algorithm
## ------------------------------------------------
km = kmeans(sample, centers=4, nstart=20)
cluster.labels = km$cluster

## ------------------------------------------------
## Create ClusteredSample object  (Option 1 )
## without specifying centers and covs
## we need to pass FC sample for paramter estimation
## ------------------------------------------------

clustSample = ClusteredSample(labels=cluster.labels, sample=sample)

## ------------------------------------------------
## Create ClusteredSample object  (Option 2)
## specifying centers and covs 
## no need to pass the sample
## ------------------------------------------------

centers = list()
covs = list()
num.clusters = nrow(km$centers)
for(i in 1:num.clusters)
{
  centers[[i]] = km$centers[i,]
  covs[[i]] = cov(sample[cluster.labels==i,])
}
# Now we do not need to pass sample
ClusteredSample(labels=cluster.labels, centers=centers, covs=covs)

## ------------------------------------------------
## Show summary and plot a clustered sample
## ------------------------------------------------

summary(clustSample)
plot(sample, clustSample)