consensusDistance: Calculate a distribution of distances from a consensus...


Description

Calculates the distance from a consensus for a series of pathway fingerprints, accounting only for the pathways that are significantly low or high (-1 or 1) in the consensus.

Usage

consensusDistance(consensus, fingerprintframe)

Arguments

consensus

consensus fingerprint

fingerprintframe

dataframe of sample fingerprints from which the distance will be calculated

Details

The consensus fingerprint can be calculated using consensusFingerprint or, alternatively, can be a single fingerprint vector.

Value

A dataframe with one row for each sample contained in the fingerprintframe and the following columns:

distance

Manhattan distance of sample from the consensus fingerprint, scaled by the maximum possible distance

pvalue

p-value representing the probability that the samples are not phenotypically matched. N.B. this is only valid when the fingerprint frame represents a sufficiently broad coverage of phenotypes, e.g. the GEO corpus. This p-value is based on the assumption that the distances are normally distributed.
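The two columns described above can be illustrated with a minimal sketch on toy data. This is not the package's implementation: the helper name sketchConsensusDistance, the toy pathway and sample names, and the choice of a lower-tail normal p-value fitted to the observed distances are all assumptions made for illustration. It assumes fingerprint values are restricted to -1, 0 and 1, so each retained pathway can contribute at most 2 to the Manhattan distance.

```r
# Hypothetical helper for illustration only; not the pathprint implementation.
sketchConsensusDistance <- function(consensus, fingerprints) {
  # Keep only pathways that are significantly low (-1) or high (1) in the consensus
  keep <- consensus != 0
  consensus <- consensus[keep]
  fingerprints <- fingerprints[keep, , drop = FALSE]

  # Manhattan distance of each sample (column) from the consensus;
  # the consensus vector is recycled down each column
  raw <- colSums(abs(fingerprints - consensus))

  # Scale by the maximum possible distance: 2 per retained pathway
  distance <- raw / (2 * length(consensus))

  # Assumption: lower-tail probability under a normal distribution
  # fitted to the observed distances
  pvalue <- pnorm(distance, mean = mean(distance), sd = sd(distance))

  data.frame(distance = distance, pvalue = pvalue)
}

# Toy consensus over four hypothetical pathways; pathC is 0 and is ignored
consensus <- c(pathA = 1, pathB = -1, pathC = 0, pathD = 1)

# Toy fingerprints for three hypothetical samples (columns)
fingerprints <- cbind(
  s1 = c(1, -1, 1, 1),
  s2 = c(0, -1, 0, 1),
  s3 = c(-1, 1, -1, -1)
)
rownames(fingerprints) <- names(consensus)

result <- sketchConsensusDistance(consensus, fingerprints)
print(result)
```

In this toy case s1 agrees with the consensus on every retained pathway (distance 0), s2 differs by one step on a single pathway (distance 1/6), and s3 is opposite on all three (distance 1). With so few samples the p-values are not meaningful; as noted above, they are only valid for a broad corpus of phenotypes.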

Author(s)

Gabriel Altschuler

References

Altschuler, G. M., O. Hofmann, I. Kalatskaya, R. Payne, S. J. Ho Sui, U. Saxena, A. V. Krivtsov, S. A. Armstrong, T. Cai, L. Stein and W. A. Hide (2013). "Pathprinting: An integrative approach to understand the functional basis of disease." Genome Med 5(7): 68.

See Also

consensusFingerprint

Examples

library(pathprint)
library(pathprintGEOData)
library(SummarizedExperiment)

# load the data
data(SummarizedExperimentGEO)

ds = c("chipframe", "genesets",
    "pathprint.Hs.gs","platform.thresholds", "pluripotents.frame")
data(list = ds)

# extract part of the GEO.fingerprint.matrix and GEO.metadata.matrix
GEO.fingerprint.matrix = assays(geo_sum_data[,300000:350000])$fingerprint
GEO.metadata.matrix = colData(geo_sum_data[,300000:350000])

# free up space by removing the geo_sum_data object
remove(geo_sum_data)

# Extract common GSMs since we only loaded part of the geo_sum_data object
common_GSMs <- intersect(pluripotents.frame$GSM,colnames(GEO.fingerprint.matrix))

# search for pluripotent arrays
# create consensus fingerprint for pluripotent samples
pluripotent.consensus<-consensusFingerprint(
    GEO.fingerprint.matrix[,common_GSMs], threshold=0.9)

# calculate distance from the pluripotent consensus
geo.pluripotentDistance<-consensusDistance(
    pluripotent.consensus, GEO.fingerprint.matrix)

# plot histograms
par(mfcol = c(2,1), mar = c(0, 4, 4, 2))
geo.pluripotentDistance.hist<-hist(geo.pluripotentDistance[,"distance"],
    nclass = 50, xlim = c(0,1), main = "Distance from pluripotent consensus")
par(mar = c(7, 4, 4, 2))
# use only the curated pluripotent samples present in the loaded subset
hist(geo.pluripotentDistance[common_GSMs, "distance"],
    breaks = geo.pluripotentDistance.hist$breaks, xlim = c(0,1), 
    main = "", xlab = "above: all GEO, below: curated pluripotent samples")

hidelab/pathprint documentation built on May 17, 2019, 3:57 p.m.