split-methods: Methods to split flowFrames and flowSets according to filters



Divide a flow cytometry data set into several subsets according to the results of a filtering operation. There are also methods available to split according to a factor variable.


The splitting operation in the context of flowFrames and flowSets is the logical extension of subsetting. While the latter only returns the events contained within a gate, the former splits the data into the groups of events contained within and those not contained within a particular gate. This concept is extremely useful in applications where gates describe the distinction between positivity and negativity for a particular marker.

The flow data structures in flowCore can be split into subsets on various levels:

flowFrame: row-wise splitting of the raw data. In most cases, this will be done according to the outcome of a filtering operation, either using a filter that identifies more than one sub-population or by a logical filter, in which case the data is split into two populations: "in the filter" and "not in the filter". In addition, the data can be split according to a factor (or a numeric or character vector that can be coerced into a factor).

flowSet: can be split either into subsets of flowFrames according to a factor (or a vector that can be coerced into a factor), or each individual flowFrame can be split into subpopulations based on filters or filterResults provided as a list whose length equals that of the flowSet.

Splitting has a special meaning for filters that result in multipleFilterResults or manyFilterResults, in which case simple subsetting doesn't make much sense (there are multiple populations that are defined by the gate and it is not clear which of those should be used for the subsetting operation). Accordingly, splitting of multipleFilterResults creates multiple subsets. The argument population can be used to limit the output to only one or some of the resulting subsets. It takes as values a character vector of names of the populations of interest. See the documentation of the different filter classes on how population names can be defined and the respective default values. For splitting of logicalFilterResults, the population argument can be used to set the population names since there is no reasonable default other than the name of the gate. The content of the argument prefix will be prepended to the population names and '+' or '-' are finally appended allowing for more flexible naming schemes.

The default return value for any of the split methods is a list, but the optional logical argument flowSet can be used to return a flowSet instead. This only applies when splitting flowFrames; splitting of flowSets always results in lists of flowSet objects.
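As a minimal sketch of the naming behaviour described above, the following splits one frame of the GvHD example data with a logical gate. The gate boundaries, the filterId "rg" and the population name "lymph" are illustrative assumptions, not values taken from this page.

```r
library(flowCore)
data(GvHD)

samp <- GvHD[[1]]

# Hypothetical one-dimensional gate on forward scatter; the range is arbitrary.
rGate <- rectangleGate(filterId = "rg", "FSC-H" = c(200, 600))
fres <- filter(samp, rGate)

# Default naming: derived from the name of the gate.
parts <- split(samp, fres)
names(parts)

# Custom naming through the population argument.
named <- split(samp, fres, population = "lymph")
names(named)

# Every event should land in exactly one of the two subsets.
n_in_out <- sapply(parts, nrow)
sum(n_in_out) == nrow(samp)
```

Inspecting names(parts) and names(named) side by side shows how the population argument changes the labels of the "in" and "out" subsets.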


flowFrame methods:

split(x = "flowFrame", f = "ANY", drop = "ANY")

Catch all input and throw an error if there is no method for f to dispatch to.

split(x = "flowFrame", f = "factor", drop = "ANY")

Split a flowFrame by a factor variable. Length of f should be the same as nrow(x), otherwise it will be recycled, possibly leading to undesired outcomes. The optional argument drop works in the usual way, in that it removes empty levels from the factor before splitting.

split(x = "flowFrame", f = "character", drop = "ANY")

Coerce f to a factor and split on that.

split(x = "flowFrame", f = "numeric", drop = "ANY")

Coerce f to a factor and split on that.

split(x = "flowFrame", f = "filter", drop = "ANY")

First applies the filter to the flowFrame and then splits on the resulting filterResult object.

split(x = "flowFrame", f = "logicalFilterResult", drop = "ANY")

Split into the two subpopulations (in and out of the gate). The optional argument population can be used to control the names of the results.

split(x = "flowFrame", f = "manyFilterResult", drop = "ANY")

Split into the several subpopulations identified by the filtering operation. Instead of returning a list, the additional logical argument flowSet makes the method return an object of class flowSet. The optional population argument takes a character vector indicating the subpopulations to use for splitting (as identified by the population name in the filterDetails slot).

split(x = "flowFrame", f = "multipleFilterResult", drop = "ANY")

Split into the several subpopulations identified by the filtering operation. Instead of returning a list, the additional logical argument flowSet makes the method return an object of class flowSet. The optional population argument takes a character vector indicating the subpopulations to use for splitting (as identified by the population name in the filterDetails slot). Alternatively, this can be a list of characters, in which case the populations for each list item are collapsed into one flowFrame.
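The factor-based flowFrame method listed above can be sketched as follows. This is only an illustration: the threshold of 200 on FSC-H and the level names "small" and "large" are assumptions chosen for the example.

```r
library(flowCore)
data(GvHD)

samp <- GvHD[[1]]

# Build one factor entry per event (row) from the raw FSC-H values.
# The cut-off of 200 is hypothetical.
fsc <- exprs(samp)[, "FSC-H"]
size <- factor(ifelse(fsc > 200, "large", "small"),
               levels = c("small", "large"))

# The factor has the same length as nrow(samp), so nothing is recycled.
length(size) == nrow(samp)

# Row-wise split of the flowFrame by the factor.
by_size <- split(samp, size)
n_by_size <- sapply(by_size, nrow)
n_by_size
```

Building the factor directly from exprs(samp) guarantees that its length matches the number of events, which avoids the recycling problem mentioned in the method description.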

flowSet methods:

split(x = "flowSet", f = "ANY", drop = "ANY")

Catch all input and throw an error if there is no method for f to dispatch to.

split(x = "flowSet", f = "factor", drop = "ANY")

Split a flowSet by a factor variable. Length of f needs to be the same as length(x). The optional argument drop works in the usual way, in that it removes empty levels from the factor before splitting.

split(x = "flowSet", f = "character", drop = "ANY")

Coerce f to a factor and split on that.

split(x = "flowSet", f = "numeric", drop = "ANY")

Coerce f to a factor and split on that.

split(x = "flowSet", f = "list", drop = "ANY")

Split a flowSet by a list of filterResults (as typically returned by filtering operations on a flowSet). The length of the list has to be equal to the length of the flowSet and every list item needs to be a filterResult of equal class with the same parameters. Instead of returning a list, the additional logical argument flowSet makes the method return an object of class flowSet. The optional population argument takes a character vector indicating the subpopulations to use for splitting (as identified by the population name in the filterDetails slot). Alternatively, this can be a list of characters, in which case the populations for each list item are collapsed into one flowFrame. Note that using the population argument implies common population names for all filterResults in the list and there will be an error if this is not the case.


F. Hahne, B. Ellis, N. Le Meur


## load the package and the GvHD example data used below
library(flowCore)
data(GvHD)

qGate <- quadGate(filterId="qg", "FSC-H"=200, "SSC-H"=400)

## split a flowFrame by a filter that creates
## a multipleFilterResult
samp <- GvHD[[1]]
fres <- filter(samp, qGate)
split(samp, qGate)

## return a flowSet rather than a list
split(samp, fres, flowSet=TRUE)

## only keep one population
##split(samp, fres, population="FSC-Height+SSC-Height+")

## split the whole set, only keep two populations
##split(GvHD, qGate, population=c("FSC-Height+SSC-Height+",

## now split the flowSet according to a factor
split(GvHD, pData(GvHD)$Patient)

RGLab/flowCore documentation built on Aug. 26, 2024, 8:52 a.m.