
The function computes pairwise distances between individuals (e.g. samples or genes) according to a user-specified metric. Several metrics are available. The precise definition of each metric depends on the class of the first argument (see details section).


| Argument | Description |
|---|---|
| `x` | Object for which we want to compute distances |
| `metric` | Desired distance metric. Valid options for chroGPS-factors map are 'tanimoto', 'avgdist', 'chisquare' and 'chi' (see details). For chroGPS-genes maps, metrics 'wtanimoto', 'euclidean' and 'manhattan' are also available. |
| `weights` | For signature(x='matrix'), an unnamed numeric vector with weights applied to every sample (column) in the original data. The typical example is when we have a sample (epigenetic factor) with several replicates available (biological or technical replicate, different antibody, etc.), and we want to treat them together (for instance giving a 1/nreplicates weight to each one). If not supplied, each replicate is considered as an individual sample (using 1 as weight for every sample). |
| `uniqueRows` | If set to |
| `genomelength` | For 'chi' and 'chisquare' metrics, numeric value indicating the length of the genome. If not given the function uses the minimum length necessary to fit the total length of the result. |
| `mc.cores` | If |

For `GRangesList` objects, distances are defined as follows. Let `a1` and `a2` be two `GRanges` objects. Define as `n1` the number of `a1` intervals overlapping with some interval in `a2`. Define `n2` analogously.

The Tanimoto distance between `a1` and `a2` is defined as `(n1+n2)/(nrow(z1)+nrow(z2))`, where `z1` and `z2` presumably refer to `a1` and `a2`. The average distance between `a1` and `a2` is defined as `.5*(n1/nrow(z1) + n2/nrow(z2))`. The wtanimoto distance in chroGPS-genes weights each epigenetic factor (table columns) according to its frequency (table rows).

The chi-square distance is defined as the usual chi-square distance on a binary matrix `B` which is automatically computed by `distGPS`. The binary matrix `B` is the matrix with `length(x)` rows and number of columns equal to the genome length, where `B[i,j]==1` indicates that element `i` has a binding site at base pair `j`. The chi distance is simply defined as the square root of the chi-square distance.

Finally, euclidean and manhattan metrics have the same definition as in the base R function `dist`.
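The interval-based Tanimoto and average-distance formulas can be illustrated with a minimal base R sketch. It is not the package's implementation: `count_overlapping` is a hypothetical helper, the intervals are toy data on a single chromosome, and the denominators assume that `z1` and `z2` in the formulas stand for `a1` and `a2`.

```r
# Hypothetical helper (not part of chroGPS): number of intervals in
# (s1, e1) that overlap at least one interval in (s2, e2).
# Intervals are treated as closed, with coordinates in bp.
count_overlapping <- function(s1, e1, s2, e2) {
  sum(vapply(seq_along(s1),
             function(i) any(s1[i] <= e2 & e1[i] >= s2),
             logical(1)))
}

# Toy interval sets, stored as data frames so that nrow() gives the
# number of intervals, as in the formulas
a1 <- data.frame(start = c(1, 100, 300), end = c(50, 150, 350))
a2 <- data.frame(start = c(40, 500), end = c(120, 600))

n1 <- count_overlapping(a1$start, a1$end, a2$start, a2$end)
n2 <- count_overlapping(a2$start, a2$end, a1$start, a1$end)

# Quantities exactly as written in the text
tanimoto <- (n1 + n2) / (nrow(a1) + nrow(a2))
avgdist  <- 0.5 * (n1 / nrow(a1) + n2 / nrow(a2))
```

With these toy intervals, two of the three `a1` intervals and one of the two `a2` intervals have an overlap, so both quantities fall strictly between 0 and 1.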

When choosing a metric one should consider the effect of outliers, i.e. samples with large distance to all other samples. Tanimoto and Average Distance take values between 0 and 1, and therefore outlying distances have a limited effect. Chi-square and Chi distances are not limited between 0 and 1, i.e. some distances may be much larger than others. The Chi metric is slightly more robust to outliers than the Chi-square metric.
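To see why chi-square values are not bounded by 1, the following base R sketch computes a chi-square distance between rows of a small binary matrix. It assumes the correspondence-analysis definition (squared differences between row profiles, each weighted by the inverse column mass); whether `distGPS` uses exactly this scaling is an assumption, and `chisq_dist` is a hypothetical name.

```r
# Hypothetical helper: chi-square distance between the rows of a
# binary matrix, under the correspondence-analysis definition.
chisq_dist <- function(x) {
  p    <- x / sum(x)          # relative frequencies
  r    <- rowSums(p)          # row masses
  cm   <- colSums(p)          # column masses
  keep <- cm > 0              # all-zero columns carry no information
  prof <- (p / r)[, keep, drop = FALSE]          # row profiles
  scaled <- sweep(prof, 2, sqrt(cm[keep]), "/")  # weight by 1/sqrt(column mass)
  as.matrix(dist(scaled))^2   # squared Euclidean distance on scaled profiles
}

x <- rbind(a = c(rep(0, 15), rep(1, 5)),
           b = c(rep(0, 15), rep(1, 5)),
           c = c(rep(0, 19), 1),
           d = c(rep(1, 5), rep(0, 15)))

d2  <- chisq_dist(x)  # chi-square distances
chi <- sqrt(d2)       # chi distances: square root of chi-square
```

Here `d2["a", "d"]` is well above 1, while the square root in `chi` compresses the largest values, which is consistent with the chi metric being somewhat less sensitive to outlying samples.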

For `matrix` or `data.frame` objects, `x` must be a matrix with 0's and 1's (or `FALSE` and `TRUE`). The usual definitions are used for Tanimoto (which is equivalent to Jaccard's index), Chi-square and Chi. Average overlap between rows `i` and `j` is simply the average between the proportion of elements in `i` also in `j` and the proportion of elements in `j` also in `i`.
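For binary matrices, the Jaccard index and the average overlap can be computed directly in base R, as in the sketch below. It only illustrates the definitions in the text; how `distGPS` converts these overlap measures into the dissimilarities it returns is not shown here, and the variable names are illustrative.

```r
# Toy binary matrix: rows are individuals, columns are variables
x <- rbind(a = c(rep(0, 15), rep(1, 5)),
           b = c(rep(0, 15), rep(1, 5)),
           c = c(rep(0, 19), 1),
           d = c(rep(1, 5), rep(0, 15)))

inter <- x %*% t(x)                # number of shared 1's per pair of rows
n     <- rowSums(x)                # number of 1's in each row
uni   <- outer(n, n, "+") - inter  # size of the union per pair of rows

# Jaccard index: shared elements over elements present in either row
jaccard <- inter / uni

# Average overlap: mean of (proportion of i also in j) and
# (proportion of j also in i); inter / n divides row i by n[i]
avg_overlap <- 0.5 * (inter / n + t(inter / n))
```

Rows `a` and `c` share a single element, so their Jaccard index is 1/5, whereas their average overlap is higher because the only element of `c` is contained in `a`.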

Object of class `distGPS`, with a matrix of pairwise dissimilarities (distances) between objects.

distGPS:

- signature(x='GRangesList')
  Each element in `x` is assumed to indicate the binding sites for a different sample, e.g. epigenetic factor. Typically `space(x)` indicates the chromosome, `start(x)` the start position and `end(x)` the end position (in bp). Strand information is ignored.
- signature(x='matrix')
  Rows in `x` contain individuals for which we want to compute distances. Columns in `x` contain the variables, and should only contain either 0's and 1's or `FALSE` and `TRUE`.

splitDistGPS:

This is a set of internal classes and functions to be used in the parallel computation of Multidimensional Scaling.

uniqueCount:

This function collapses a chroGPS-genes matrix or data frame so that elements with the same combination of variables are aggregated into a single entry. Elements are then identified by their unique pattern, and a frequency count is also returned.
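The idea of collapsing identical rows into unique patterns with a frequency count can be sketched in base R as follows. This is an illustration of the concept only, not the `uniqueCount` implementation, and the object names (`key`, `patterns`, `counts`) are hypothetical.

```r
# Toy binary matrix in which rows a and b share the same pattern
x <- rbind(a = c(rep(0, 15), rep(1, 5)),
           b = c(rep(0, 15), rep(1, 5)),
           c = c(rep(0, 19), 1),
           d = c(rep(1, 5), rep(0, 15)))

# Encode each row as a string so identical patterns get identical keys
key   <- apply(x, 1, paste, collapse = "")
first <- !duplicated(key)

# One row per unique pattern, plus how often each pattern occurs
patterns <- x[first, , drop = FALSE]
counts   <- as.integer(table(key)[key[first]])
```

In this toy example the four rows collapse to three patterns, with the first pattern observed twice.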

`mds` to create MDS-oriented objects, `procrustesAdj` for Procrustes adjustment.

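```
library(chroGPS)

# Toy binary matrix: 4 individuals (rows) by 20 variables (columns)
x <- rbind(c(rep(0,15),rep(1,5)),c(rep(0,15),rep(1,5)),c(rep(0,19),1),c(rep(1,5),rep(0,15)))
rownames(x) <- letters[1:4]

# Tanimoto distances; rows a and b are identical, so also try uniqueRows=TRUE
d <- distGPS(x,metric='tanimoto')
du <- distGPS(x,metric='tanimoto',uniqueRows=TRUE)
mds1 <- mds(d)
mds1
plot(mds1)

# Chi-square distances on the same data
d <- distGPS(x,metric='chisquare')
mds1 <- mds(d)
mds1
plot(mds1)
```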

Bioconductor-mirror/chroGPS documentation built on June 1, 2017, 5:32 a.m.
