compare-methods: Compare two cytometry profiles.


Description

Cytometry profiles contained in CELL, CLUSTER, or GATE objects can be compared using the 'compare()' function. Comparison results are stored in a RES object.

Comparisons can be performed between profiles of the same type or between profiles of different types. In the default statistical approach:
* if the compared profiles are of the same type, they are compared to identify similar profiles
* if the compared profiles are of different types, they are compared to identify included profiles

In the context of a similarity comparison, a distance ($D_i$) is computed for each marker between the two profiles. A marker distance below a user-specified distance threshold counts as a marker similarity success. The Euclidean distance is used when comparing cell profiles, while the Kolmogorov-Smirnov distance is used when comparing cluster or gate profiles. A weight can be associated with each marker to modulate its importance. Marker distances are aggregated using an exact binomial test in which marker successes are treated as successful Bernoulli trials: the proportion of marker successes is compared to a user-specified probability of success ($P$). An aggregated distance ($D$), corresponding to the weighted mean of the marker distances, is additionally returned.
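The similarity test described above can be sketched in base R for two cell profiles. This is only an illustration and not the package's implementation: the marker names and values are made up, the one-sided alternative of the binomial test is an assumption, and the weights are used only for the aggregated distance because the text does not say how they enter the test.

```r
# Hypothetical expression values of two cell profiles (illustrative only)
cell1   <- c(CD3 = 4.0, CD4 = 3.1, CD8 = 0.4, CD19 = 0.2)
cell2   <- c(CD3 = 3.6, CD4 = 2.0, CD8 = 2.7, CD19 = 1.1)
weights <- c(CD3 = 1, CD4 = 1, CD8 = 1, CD19 = 1)

D.th <- 1.50  # distance threshold (documented default for cell profiles)
P    <- 0.75  # expected proportion of marker successes

# Per-marker Euclidean distance D_i (absolute difference in one dimension)
marker_dist <- abs(cell1 - cell2)

# A marker is a success when its distance is below the threshold
successes <- marker_dist < D.th

# Exact binomial test on the number of marker successes
# (alternative = "greater" is an assumption of this sketch)
test <- binom.test(sum(successes), length(successes), p = P,
                   alternative = "greater")

# Aggregated distance D: weighted mean of the marker distances
D <- weighted.mean(marker_dist, weights)
```

Here three of the four markers fall below the threshold, and `test$p.value` gives the outcome of the binomial test.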

In the context of an inclusion comparison, an inclusion assessment is performed for each marker of the two profiles. A cell profile marker is considered as included in a gate profile when its expression value is within the range of the marker boundaries. Similarly, a cell profile marker is considered as included in a cell cluster profile when its expression value is within the range of the marker cluster, defined based on quantiles of marker expression densities. Finally, a cell cluster profile marker is considered as included in a gate profile when its expression boundaries are within the range of the marker gate. As for similarity comparisons, a weight can be associated with each marker. The aggregation of marker inclusions is also performed using an exact binomial test.
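As a rough illustration of the inclusion assessment, the following base R sketch checks a cell profile against a gate profile. The marker names, values and gate boundaries are hypothetical, and treating the boundaries as inclusive and the test as one-sided are assumptions of this sketch.

```r
# Hypothetical cell profile and gate boundaries (illustrative only)
cell <- c(CD3 = 4.0, CD4 = 3.1, CD8 = 0.4, CD19 = 0.2)
gate <- rbind(lower = c(CD3 = 3.0, CD4 = 2.5, CD8 = 0.0, CD19 = 1.0),
              upper = c(CD3 = 5.0, CD4 = 4.0, CD8 = 1.0, CD19 = 3.0))

P <- 0.75  # expected proportion of marker successes

# Restrict the assessment to the markers shared by both profiles
markers <- intersect(names(cell), colnames(gate))

# A marker is included when its value lies within the gate boundaries
# (inclusive boundaries are an assumption of this sketch)
included <- cell[markers] >= gate["lower", markers] &
            cell[markers] <= gate["upper", markers]

# Exact binomial test on the number of included markers
# (alternative = "greater" is an assumption of this sketch)
test <- binom.test(sum(included), length(included), p = P,
                   alternative = "greater")
```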

Comparisons can be performed based on the whole set of common markers between the two profiles, or based on a subset of markers specified by the user. Moreover, markers can be weighted in the comparison procedure, via a MWEIGHTS object.

If only one object is provided to the 'compare()' function then the comparisons will be performed between all profiles of this object. If two objects are provided to the 'compare()' function then the comparisons will be performed between all possible pairs of profiles between these two objects.

Importantly, users can define their own function to perform the statistical comparisons of the profiles, using the 'method' parameter. Please refer to the user tutorial for more details about this feature.

Usage

compare(object1, object2, ...)

## S4 method for signature 'CLUSTER,missing'
compare(object1, mweights = NULL,
  method = "compare_default", method.params = NULL)

## S4 method for signature 'CLUSTER,CLUSTER'
compare(object1, object2, mweights = NULL,
  method = "compare_default", method.params = NULL)

Arguments

object1

a CELL, CLUSTER or GATE object

object2

a CELL, CLUSTER or GATE object

...

other parameters

mweights

a MWEIGHTS object specifying the markers to use in the comparison procedure with their associated weights

method

a function, or a character string giving the name of a function, to use when performing the statistical comparisons between the cytometry profiles

method.params

a named character list used to parametrize the comparison function (please see the details section)

Details

Different parameters can be defined, via the method.params named list, to specify the behaviour of the comparisons:
* the D.th parameter indicates the distance threshold
* the P parameter indicates the expected proportion of marker successes
* the nbcells.th parameter indicates the number of cells per cluster below which the marker expression density of a cell cluster profile will be approximated by a normal distribution
* the cluster.quantiles parameter indicates the quantiles that will define the marker expression ranges for the cell cluster profiles
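A parameter list of this kind might be assembled as follows. The default values are those stated in this page for comparisons between cell cluster profiles; the `clusters` object in the commented call is hypothetical, and the call itself is shown only as an assumed usage.

```r
# Defaults stated in this page for cluster-to-cluster comparisons
defaults <- list(D.th = 0.30, P = 0.75, nbcells.th = 50)

# User overrides; modifyList() keeps the defaults that are not overridden
params <- utils::modifyList(defaults, list(D.th = 0.25))

# Assumed usage, with 'clusters' being an existing CLUSTER object:
# res <- compare(clusters, method.params = params)
```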

In the case of comparisons between two cell profiles, the marker distances are calculated based on the Euclidean distance. The parameter 'D.th' is set to 1.50 by default and the parameter 'P' is set to 0.75 by default.

In the case of comparisons between two cell cluster profiles, the marker distances are calculated based on the Kolmogorov-Smirnov distance. The parameter 'D.th' is set to 0.30 by default and the parameter 'P' is set to 0.75 by default. The 'nbcells.th' parameter indicates the number of cells per cluster below which the density will be approximated by a normal distribution (set to 50 by default).
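One way to picture this marker distance is the largest gap between the cumulative distributions of the two clusters. The sketch below is an assumption about how such a distance could be computed, not the package's code: the helper `ks_distance()` and the simulated values are hypothetical, and small clusters are replaced by a fitted normal distribution as the text describes.

```r
# Hypothetical helper: Kolmogorov-Smirnov distance for one marker
ks_distance <- function(x, y, nbcells.th = 50) {
  # Below nbcells.th cells, use a normal CDF fitted to the values
  # (needs at least two values); otherwise use the empirical CDF
  cdf_of <- function(v) {
    if (length(v) < nbcells.th) {
      m <- mean(v)
      s <- sd(v)
      function(q) pnorm(q, mean = m, sd = s)
    } else {
      ecdf(v)
    }
  }
  # Evaluate both CDFs on a common grid and take the largest gap
  grid <- seq(min(x, y), max(x, y), length.out = 512)
  max(abs(cdf_of(x)(grid) - cdf_of(y)(grid)))
}

set.seed(1)
a <- rnorm(200, mean = 2.0, sd = 0.5)  # simulated marker expression, cluster A
b <- rnorm(200, mean = 2.2, sd = 0.5)  # simulated marker expression, cluster B

d <- ks_distance(a, b)
success <- d < 0.30  # compared with the default D.th for cluster profiles
```

Because the CDFs are evaluated on a finite grid, the result approximates the exact supremum from below.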

In the case of comparisons between two gate profiles, gates are modeled by uniform distributions, and the marker distances are calculated based on the Kolmogorov-Smirnov distance. The parameter 'D.th' is set to 0.30 by default and the parameter 'P' is set to 0.75 by default.
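For two gates modeled as uniform distributions, the Kolmogorov-Smirnov distance of a marker can be obtained from the gate boundaries alone. The helper below is a hypothetical sketch under that model, with made-up boundaries, and is not taken from the package.

```r
# Hypothetical helper: KS distance between two gates for one marker,
# each gate given as c(lower, upper) and modeled as a uniform distribution
gate_ks_distance <- function(g1, g2) {
  # Uniform CDFs are piecewise linear, so the largest gap between them
  # is reached at one of the four gate boundaries
  q <- c(g1, g2)
  max(abs(punif(q, min = g1[1], max = g1[2]) -
          punif(q, min = g2[1], max = g2[2])))
}

d <- gate_ks_distance(c(0, 2), c(1, 3))
success <- d < 0.30  # compared with the default D.th for gate profiles
```

In this example the two gates overlap on half of their range, which gives a distance of 0.5 and therefore no marker success at the default threshold.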

In the case of comparisons between a cell profile and a gate profile, a cell profile marker is considered as included in the gate profile when its expression value is within the range of the marker boundaries. The parameter 'P' is set to 0.75 by default.

In the case of comparisons between a cell profile and a cell cluster profile, a cell profile marker is considered as included in the cell cluster profile when its expression value is within the range of the marker cluster, defined based on quantiles of marker expression densities. The parameter 'P' is set to 0.75 by default. The 'cluster.quantiles' parameter indicates the quantiles that will define the marker expression ranges for the cell cluster profile (set to 0.10 and 0.90 by default).
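The quantile-based range can be illustrated for a single marker as follows. The expression values are hypothetical, and the use of sample quantiles (rather than quantiles derived from an estimated density) and of inclusive bounds are assumptions of this sketch.

```r
# Hypothetical expression values of one marker in a cell cluster (0 to 10)
cluster_values <- 0:100 / 10

cluster.quantiles <- c(0.10, 0.90)  # documented defaults

# Expression range of the cluster for this marker
marker_range <- quantile(cluster_values, probs = cluster.quantiles,
                         names = FALSE)

# A cell marker is included when its value lies within that range
is_included <- function(value) {
  value >= marker_range[1] & value <= marker_range[2]
}

is_included(5.0)  # value inside the 10%-90% range
is_included(9.5)  # value above the 90% quantile
```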

In the case of comparisons between a cell cluster profile and a gate profile, a cell cluster profile marker is considered as included in the gate profile when its expression boundaries are within the range of the marker gate. The parameter 'P' is set to 0.75 by default. The 'cluster.quantiles' parameter indicates the quantiles that will define the marker expression ranges for the cell cluster profile (set to 0.10 and 0.90 by default).
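For several markers at once, the cluster-in-gate assessment might look like the sketch below. The cluster values and gate boundaries are hypothetical, and sample quantiles with inclusive bounds are assumptions of this sketch rather than the package's documented behaviour.

```r
# Hypothetical cluster: one column of expression values per marker
cluster <- cbind(CD3 = seq(3.2, 4.8, length.out = 101),
                 CD8 = seq(0.0, 2.0, length.out = 101))

# Hypothetical gate boundaries for the same markers
gate <- rbind(lower = c(CD3 = 3.0, CD8 = 0.0),
              upper = c(CD3 = 5.0, CD8 = 1.0))

cluster.quantiles <- c(0.10, 0.90)  # documented defaults

# Row 1: lower expression boundary, row 2: upper boundary, per marker
bounds <- apply(cluster, 2, quantile, probs = cluster.quantiles,
                names = FALSE)

# A marker is included when both cluster boundaries lie within the gate
markers  <- colnames(bounds)
included <- bounds[1, ] >= gate["lower", markers] &
            bounds[2, ] <= gate["upper", markers]
```

With these values the CD3 range fits inside the gate, while the upper CD8 boundary exceeds it.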

Importantly, in the case of comparisons involving CLUSTER profiles, Hartigan's dip tests and interquartile ranges (IQR) can be computed in order to estimate whether the marker expression densities are unimodal with low spreads. The Hartigan's dip test p-value threshold and the IQR threshold can both be set using the 'dip.pvalue' and 'IQR.th' parameters. If a marker density does not respect these constraints, the distance is set to 1.
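The two checks could be sketched as below. The helper `check_marker_density()` and the threshold values passed to it are hypothetical (this page does not state the defaults), and the dip test relies on the separate diptest package, so the sketch skips it when that package is not installed.

```r
# Hypothetical helper: spread and unimodality checks for one marker
check_marker_density <- function(values, IQR.th, dip.pvalue) {
  # Low spread: interquartile range below the IQR threshold
  low_spread <- IQR(values) < IQR.th

  # Unimodality: Hartigan's dip test does not reject at dip.pvalue
  # (left as NA when the diptest package is not available)
  unimodal <- NA
  if (requireNamespace("diptest", quietly = TRUE)) {
    unimodal <- diptest::dip.test(values)$p.value >= dip.pvalue
  }

  c(low_spread = low_spread, unimodal = unimodal)
}

set.seed(42)
values <- rnorm(200, mean = 2, sd = 0.3)  # simulated marker expression

# Threshold values chosen for illustration only
checks <- check_marker_density(values, IQR.th = 2, dip.pvalue = 0.05)
```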

Value

an S4 object of class RES


tchitchek-lab/CytoCompare documentation built on May 31, 2019, 7:29 a.m.