distee: Calculate distance between two gene expression data sets
In kbroman/lineup: Lining Up Two Sets of Measurements

distee

R Documentation

Description

Calculate a distance between all pairs of individuals for two gene expression data sets.

Usage

distee(
  e1,
  e2 = NULL,
  d.method = c("rmsd", "cor"),
  labels = c("e1", "e2"),
  verbose = TRUE
)

Arguments

`e1`	Numeric matrix of gene expression data, as individuals x genes. The row and column names must contain individual and gene identifiers.
`e2`	(Optional) Like `e1`. An appreciable number of individuals and genes must be in common.
`d.method`	Calculate inter-individual distance as RMS difference (`"rmsd"`) or as correlation (`"cor"`).
`labels`	Two character strings, to use as labels for the two data matrices in subsequent output.
`verbose`	If TRUE, give verbose output.

Details

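We calculate the pairwise distance between all individuals (rows) in `e1` and all individuals in `e2`. This distance is either the RMS difference (`d.method="rmsd"`) or the correlation (`d.method="cor"`). Note that the correlation is really a measure of similarity rather than distance: larger values indicate individuals that are more alike.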
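
To make the two measures concrete, here is a minimal base-R sketch of a row-by-row RMS difference and correlation on toy data. It is an illustration only, not the package's implementation: the helper `pairwise_rows` and the toy matrices are hypothetical, and restricting the comparison to genes (columns) present in both matrices is an assumption of the sketch.

```r
# Toy data: individuals x genes, with row and column names (made up for illustration)
set.seed(1)
e1 <- matrix(rnorm(12), nrow = 3,
             dimnames = list(c("a", "b", "c"), paste0("g", 1:4)))
# Same individuals in a different order, with a small constant shift
e2 <- e1[c("b", "c", "a"), ] + 0.01

# Hypothetical helper (not part of lineup): distance between every row of x
# and every row of y
pairwise_rows <- function(x, y, method = c("rmsd", "cor")) {
  method <- match.arg(method)

  # Assumption of this sketch: compare only the genes found in both matrices
  genes <- intersect(colnames(x), colnames(y))
  x <- x[, genes, drop = FALSE]
  y <- y[, genes, drop = FALSE]

  if (method == "cor") {
    # cor() correlates columns, so transpose to correlate individuals
    return(cor(t(x), t(y)))
  }

  # RMS difference: square root of the mean squared difference across genes
  d <- matrix(NA_real_, nrow(x), nrow(y),
              dimnames = list(rownames(x), rownames(y)))
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(nrow(y))) {
      d[i, j] <- sqrt(mean((x[i, ] - y[j, ])^2))
    }
  }
  d
}

d_rmsd <- pairwise_rows(e1, e2, "rmsd")
d_cor  <- pairwise_rows(e1, e2, "cor")
```

In this toy setup, each individual's best match in the other matrix is itself: the RMS difference is smallest, and the correlation largest, for the pair of rows with the same name, even though the rows appear in a different order.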

Value

A matrix with `nrow(e1)` rows and `nrow(e2)` columns, containing the distances. The individual IDs are in the row and column names. The matrix is assigned class `"lineupdist"`.

Author(s)

Karl W Broman, broman@wisc.edu

Examples


# load the example data
data(expr1, expr2)


# find samples in common
id <- findCommonID(expr1, expr2)

# calculate correlations between paired columns (genes) of expr1 and expr2
thecor <- corbetw2mat(expr1[id$first,], expr2[id$second,])

# subset to genes with correlation > 0.8 and scale values
expr1s <- expr1[,thecor > 0.8]/1000
expr2s <- expr2[,thecor > 0.8]/1000

# calculate distance (using "RMS difference" as a measure)
d1 <- distee(expr1s, expr2s, d.method="rmsd", labels=c("1","2"))

# calculate distance (using "correlation" as a measure...really similarity)
d2 <- distee(expr1s, expr2s, d.method="cor", labels=c("1", "2"))

# pull out the smallest 8 self-self correlations
sort(pulldiag(d2))[1:8]

# summary of results
summary(d1)
summary(d2)

# plot histograms of RMS distances
plot(d1)

# plot histograms of correlations
plot(d2)

# plot distances against one another
plot2dist(d1, d2)

kbroman/lineup documentation built on July 19, 2024, 8:22 p.m.