ClusteredSample: A class representing a clustered flow cytometry (FC) sample

Description

An object of class "ClusteredSample" represents a partitioning of a sample into clusters. A flow cytometry sample is modeled as a mixture of cell populations, where each cell population is a normally distributed cluster. An object of class "ClusteredSample" therefore stores a list of clusters together with the parameters that describe them.

Creating Object

An object of class "ClusteredSample" can be created using the following constructor:

ClusteredSample(labels, centers=list(), covs=list(), sample=NULL, sample.id=NA_integer_)

  • labels A vector of integers (from 1:num.clusters) indicating the cluster to which each point is allocated. This is usually obtained from a clustering algorithm.

  • centers A list of length num.clusters storing the centers of the clusters. The ith entry, centers[[i]], stores the center of the ith cluster. If not specified, the constructor estimates centers from sample.

  • covs A list of length num.clusters storing the covariance matrices of the clusters. The ith entry, covs[[i]], stores the covariance matrix of the ith cluster. If not specified, the constructor estimates covs from sample.

  • sample A matrix, a data frame of observations, or an object of class flowFrame. Rows correspond to observations and columns correspond to variables. It must be passed to the constructor if either centers or covs is unspecified; the missing parameter is then estimated from sample.

  • sample.id The index of the sample (relative to other samples of a cohort).
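To make the shape of the centers and covs arguments concrete, here is a minimal base-R sketch that builds both lists from a data matrix and a label vector. It assumes the estimates are the per-cluster sample mean and sample covariance; the constructor's internal estimation may differ. The simulated data and the marker names are hypothetical.

```r
# Simulated stand-in for an FC sample: 300 cells, 2 markers (hypothetical data).
set.seed(1)
sample <- rbind(
  matrix(rnorm(300, mean = 0), ncol = 2),   # 150 cells around (0, 0)
  matrix(rnorm(300, mean = 5), ncol = 2)    # 150 cells around (5, 5)
)
colnames(sample) <- c("marker1", "marker2") # hypothetical marker names

# Labels from any clustering algorithm; here k-means with 2 clusters.
labels <- kmeans(sample, centers = 2, nstart = 10)$cluster
num.clusters <- max(labels)

# One list entry per cluster: mean vector and covariance matrix.
# Assumption: plain sample mean and sample covariance of the cluster's rows.
centers <- lapply(seq_len(num.clusters), function(i)
  colMeans(sample[labels == i, , drop = FALSE]))
covs <- lapply(seq_len(num.clusters), function(i)
  cov(sample[labels == i, , drop = FALSE]))
```

Lists built this way have the layout described above: centers[[i]] is a vector with one entry per column of sample, and covs[[i]] is a square matrix of the same dimension.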

Slots

An object of class "ClusteredSample" contains the following slots:

num.clusters

The number of clusters in the sample.

labels

A vector of integers (from the range 1:num.clusters) indicating the cluster to which each point is assigned. For example, labels[i]=j means that the ith element (cell) is assigned to the jth cluster.

dimension

Dimensionality of the sample (number of columns in data matrix).

clusters

A list of length num.clusters storing the cell populations. Each cluster is stored as an object of class Cluster.

size

Number of cells in the sample (summation of all cluster sizes).

sample.id

An integer denoting the index of the sample (relative to other samples of a cohort). The default is NA_integer_.
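The relationship between labels, num.clusters and size can be illustrated with a toy label vector in base R. This is only a sketch of the slot definitions given above, not the package's internal code, and the label values are made up for illustration.

```r
# Toy cluster labels for 7 cells (hypothetical values).
labels <- c(1L, 2L, 2L, 3L, 1L, 3L, 3L)

# Number of distinct clusters referenced by the labels.
num.clusters <- length(unique(labels))

# Cells per cluster: the count of labels equal to 1, 2, ..., num.clusters.
cluster.sizes <- tabulate(labels, nbins = num.clusters)

# Sample size is the sum of all cluster sizes, i.e. the number of labels.
size <- sum(cluster.sizes)
```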

Accessors

All slot accessor functions take an object of class ClusteredSample as their only argument. Usage is shown for the first accessor; the others are called in the same way.

get.size:

Returns the number of cells in the sample (summation of all cluster sizes).

Usage: get.size(object)

Here, object is a ClusteredSample object.

get.num.clusters

Returns the number of clusters in the sample.

get.labels

Returns the cluster labels for each cell. For example, labels[i]=j means that the ith element (cell) is assigned to the jth cluster.

get.dimension

Returns the dimensionality of the sample (number of columns in data matrix).

get.clusters

Returns the list of clusters in this sample. Each cluster is stored as an object of class Cluster.

get.sample.id

Returns the index of the sample (relative to other samples of a cohort).

Methods

show

Displays details about the ClusteredSample object.

summary

Returns a descriptive summary of the ClusteredSample object.

Usage: summary(ClusteredSample)

plot

Plots a sample as a set of bivariate scatter plots in which different clusters are shown in different colors.

Usage:

plot(sample, ClusteredSample, ...)

The arguments of the plot function are:

  • sample: A matrix, data.frame or an object of class flowFrame representing an FC sample.

  • ClusteredSample: An object of class ClusteredSample storing the clustering of the sample.

  • ... Other usual plotting related parameters.
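For readers unfamiliar with this kind of display, the following base-R sketch draws bivariate scatter plots colored by cluster using pairs(). It is not the plot method of ClusteredSample, only an approximation of the idea with standard graphics; the simulated data and the column names are hypothetical.

```r
# Simulated stand-in for an FC sample: 200 cells, 3 markers (hypothetical data).
set.seed(1)
sample <- rbind(
  matrix(rnorm(300, mean = 0), ncol = 3),   # 100 cells in one population
  matrix(rnorm(300, mean = 4), ncol = 3)    # 100 cells in another population
)
colnames(sample) <- c("A", "B", "C")        # hypothetical marker names

# Cluster labels from k-means; any clustering algorithm could be used instead.
labels <- kmeans(sample, centers = 2, nstart = 10)$cluster

# One scatter plot per pair of markers; point color encodes the cluster label.
pairs(sample, col = labels, pch = 20, cex = 0.5)
```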

Author(s)

Ariful Azad

See Also

Cluster

Examples

## ------------------------------------------------
## load data and retrieve a sample
## ------------------------------------------------

library(flowMatch)        # provides ClusteredSample
library(healthyFlowData)  # provides the hd.flowSet example data
data(hd)
sample = exprs(hd.flowSet[[1]])

## ------------------------------------------------
## cluster sample using kmeans algorithm
## ------------------------------------------------
km = kmeans(sample, centers=4, nstart=20)
cluster.labels = km$cluster

## ------------------------------------------------
## Create ClusteredSample object (Option 1)
## without specifying centers and covs
## we need to pass the FC sample for parameter estimation
## ------------------------------------------------

clustSample = ClusteredSample(labels=cluster.labels, sample=sample)

## ------------------------------------------------
## Create ClusteredSample object (Option 2)
## specifying centers and covs
## no need to pass the sample
## ------------------------------------------------

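centers = list()
covs = list()
num.clusters = nrow(km$centers)
for(i in seq_len(num.clusters))
{
  # center of cluster i as computed by kmeans
  centers[[i]] = km$centers[i,]
  # covariance of the cells assigned to cluster i (drop=FALSE keeps a matrix)
  covs[[i]] = cov(sample[cluster.labels==i, , drop=FALSE])
}
# Now we do not need to pass sample
ClusteredSample(labels=cluster.labels, centers=centers, covs=covs)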

## ------------------------------------------------
## Show summary and plot a clustered sample
## ------------------------------------------------

summary(clustSample)
plot(sample, clustSample)