# distee: Calculate distance between two gene expression data sets

In lineup: Lining Up Two Sets of Measurements

 distee R Documentation

## Calculate distance between two gene expression data sets

### Description

Calculate a distance between all pairs of individuals for two gene expression data sets.

### Usage

```
distee(
  e1,
  e2 = NULL,
  d.method = c("rmsd", "cor"),
  labels = c("e1", "e2"),
  verbose = TRUE
)
```

### Arguments

| Argument | Description |
| --- | --- |
| `e1` | Numeric matrix of gene expression data, as individuals x genes. The row and column names must contain individual and gene identifiers. |
| `e2` | (Optional) Like `e1`. An appreciable number of individuals and genes must be in common. |
| `d.method` | Calculate inter-individual distance as RMS difference or as correlation. |
| `labels` | Two character strings, to use as labels for the two data matrices in subsequent output. |
| `verbose` | If `TRUE`, give verbose output. |
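The input requirements above can be checked before calling `distee()`. The following is a minimal sketch in base R using made-up toy matrices; the object names (`e1_toy`, `e2_toy`, `common_ids`, `common_genes`) are hypothetical and not part of the lineup package.

```
# toy matrices (hypothetical data): individuals in rows, genes in columns
set.seed(42)
e1_toy <- matrix(rnorm(20), nrow = 4,
                 dimnames = list(c("id1", "id2", "id3", "id4"),
                                 paste0("gene", 1:5)))
e2_toy <- matrix(rnorm(20), nrow = 4,
                 dimnames = list(c("id2", "id3", "id4", "id5"),
                                 paste0("gene", 2:6)))

# both inputs should be numeric matrices with row and column names
stopifnot(is.matrix(e1_toy), is.numeric(e1_toy), is.numeric(e2_toy))
stopifnot(!is.null(rownames(e1_toy)), !is.null(colnames(e1_toy)))
stopifnot(!is.null(rownames(e2_toy)), !is.null(colnames(e2_toy)))

# how many individuals and genes do the two matrices share?
common_ids <- intersect(rownames(e1_toy), rownames(e2_toy))
common_genes <- intersect(colnames(e1_toy), colnames(e2_toy))
length(common_ids)
length(common_genes)
```

If either intersection is small, the two matrices likely use different identifier conventions and the names should be harmonized first.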

### Details

We calculate the pairwise distance between all individuals (rows) in `e1` and all individuals in `e2`. This distance is either the RMS difference (`d.method="rmsd"`) or the correlation (`d.method="cor"`).
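To illustrate the two measures, here is a rough base R sketch. It is not the package's implementation: the helper `rowdist_sketch()` is hypothetical, and its restriction to shared gene columns and its handling of missing values are assumptions made for the example.

```
# Hypothetical helper (not part of lineup): pairwise distances between the
# rows of two matrices, restricted to the columns (genes) they share.
rowdist_sketch <- function(e1, e2, d.method = c("rmsd", "cor")) {
  d.method <- match.arg(d.method)

  # assumption: compare only genes present in both matrices
  genes <- intersect(colnames(e1), colnames(e2))
  e1 <- e1[, genes, drop = FALSE]
  e2 <- e2[, genes, drop = FALSE]

  if (d.method == "cor") {
    # cor() correlates columns, so transpose to correlate individuals
    d <- cor(t(e1), t(e2), use = "pairwise.complete.obs")
  } else {
    # root mean square difference for each pair of rows
    d <- matrix(NA_real_, nrow(e1), nrow(e2))
    for (i in seq_len(nrow(e1))) {
      for (j in seq_len(nrow(e2))) {
        d[i, j] <- sqrt(mean((e1[i, ] - e2[j, ])^2, na.rm = TRUE))
      }
    }
  }
  dimnames(d) <- list(rownames(e1), rownames(e2))
  d
}

# toy data: 3 individuals x 4 genes; y is x shifted by a constant 0.1
set.seed(1)
x <- matrix(rnorm(12), nrow = 3,
            dimnames = list(paste0("ind", 1:3), paste0("g", 1:4)))
y <- x + 0.1
d_rmsd <- rowdist_sketch(x, y, "rmsd")
d_cor <- rowdist_sketch(x, y, "cor")
```

In this toy case each individual's RMS difference from its shifted copy is 0.1 and its correlation with that copy is 1, so matching pairs show small values under `"rmsd"` and large values under `"cor"`.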

### Value

A matrix with `nrow(e1)` rows and `nrow(e2)` columns, containing the distances. The individual IDs are in the row and column names. The matrix is assigned class `"lineupdist"`.

### Author(s)

Karl W Broman, broman@wisc.edu

### See Also

`pulldiag()`, `omitdiag()`, `summary.lineupdist()`, `plot2dist()`, `disteg()`, `corbetw2mat()`

### Examples

```
data(expr1, expr2)

# find samples in common
id <- findCommonID(expr1, expr2)

# calculate correlations between columns of expr1 and columns of expr2
thecor <- corbetw2mat(expr1[id$first,], expr2[id$second,])

# subset at genes with corr > 0.8 and scale values
expr1s <- expr1[,thecor > 0.8]/1000
expr2s <- expr2[,thecor > 0.8]/1000

# calculate distance (using "RMS difference" as a measure)
d1 <- distee(expr1s, expr2s, d.method="rmsd", labels=c("1","2"))

# calculate distance (using "correlation" as a measure...really similarity)
d2 <- distee(expr1s, expr2s, d.method="cor", labels=c("1", "2"))

# pull out the smallest 8 self-self correlations
sort(pulldiag(d2))[1:8]

# summary of results
summary(d1)
summary(d2)

# plot histograms of RMS distances
plot(d1)

# plot histograms of correlations
plot(d2)

# plot distances against one another
plot2dist(d1, d2)

```

lineup documentation built on July 10, 2022, 5:05 p.m.