fastmap: FastMap Projection


fastmap    R Documentation

FastMap Projection

Description

The FastMap algorithm performs approximate multidimensional scaling (MDS) based on any distance function. It is faster and more efficient than traditional MDS algorithms, scaling as O(n) in the number of objects rather than O(n^2). For each projected dimension, FastMap finds two distant pivot objects on the hyperplane orthogonal to the previously chosen pivot lines, and then projects all other objects onto the line between these pivots.
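The projection onto a single pivot line can be sketched in base R. This is an illustration only, not the package's implementation: fastmap_axis() and dist_from() are hypothetical names, Euclidean distance between rows is assumed, and only the first dimension is computed (later dimensions would use distances with the earlier coordinates removed).

```r
# Hypothetical helper (not part of the package): one FastMap coordinate
# for the rows of a numeric matrix X under Euclidean distance.
fastmap_axis <- function(X, niter = 3L) {
	# Distances from row i to every row of X; each call costs O(n)
	dist_from <- function(i) sqrt(colSums((t(X) - X[i, ])^2))
	# Pivot heuristic: start at a random row, then repeatedly jump
	# to the farthest row so the two pivots end up far apart
	b <- sample.int(nrow(X), 1L)
	for (iter in seq_len(niter)) {
		a <- which.max(dist_from(b))
		b <- which.max(dist_from(a))
	}
	da <- dist_from(a)
	db <- dist_from(b)
	dab <- da[b]
	# Law of cosines: position of each row along the line from a to b
	coord <- (da^2 + dab^2 - db^2) / (2 * dab)
	list(pivots = c(a, b), coord = coord)
}

set.seed(1)
X <- matrix(runif(40), nrow = 20, ncol = 2)
ax <- fastmap_axis(X)
ax$pivots         # indices of the two pivot rows
range(ax$coord)   # pivot a maps to 0, pivot b to the pivot distance
```

Because only distances to the two pivots are needed, the number of distance evaluations grows linearly with the number of rows.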

Usage

# FastMap projection
fastmap(x, k = 3L, distfun = NULL,
	transpose = FALSE, niter = 3L, verbose = NA, ...)

## S3 method for class 'fastmap'
predict(object, newdata, ...)

# Distance functionals
rowDistFun(x, y, metric = "euclidean", p = 2, weights = NULL,
	verbose = NA, nchunks = NA, BPPARAM = bpparam(), ...)
colDistFun(x, y, metric = "euclidean", p = 2, weights = NULL,
	verbose = NA, nchunks = NA, BPPARAM = bpparam(), ...)

Arguments

x, y

A numeric matrix-like object.

k

The number of FastMap components to project.

distfun

A function of the form function(x, y, ...) that returns a distance function of the form function(i), which gives the distances between the ith object(s) in x and all objects in y.

transpose

A logical value indicating whether x should be considered transposed or not. This is only used internally to indicate whether the input matrix is (P x N) or (N x P), and therefore to extract the number of objects and their names.

niter

The maximum number of iterations for finding the pivots.

verbose

Should progress be printed for each iteration?

nchunks

The number of chunks to use.

...

Additional options passed to distfun.

object

An object inheriting from fastmap.

newdata

An optional data matrix to use for the prediction.

BPPARAM

An optional instance of BiocParallelParam. See documentation for bplapply.

metric

Distance metric to use when computing the distances. Supported metrics include "euclidean", "maximum", "manhattan", and "minkowski".

p

The power for the Minkowski distance.

weights

A numeric vector of weights for the distance components if calculating weighted distances. For example, the weighted Euclidean distance is sqrt(sum(w * (x - y)^2)).
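The weighted Euclidean formula above can be checked by hand in base R. The helpers weighted_euclidean() and minkowski() below are hypothetical and not part of the package; the Minkowski form is the standard textbook definition and is assumed, not confirmed, to match what metric = "minkowski" computes.

```r
# Hypothetical helpers for illustration; not part of the package.
weighted_euclidean <- function(x, y, w = rep(1, length(x))) {
	sqrt(sum(w * (x - y)^2))
}
# Standard Minkowski distance of order p (assumed definition)
minkowski <- function(x, y, p = 2) {
	sum(abs(x - y)^p)^(1 / p)
}

x <- c(1, 2, 3)
y <- c(2, 4, 7)
w <- c(1, 0.5, 0.25)

weighted_euclidean(x, y, w)   # sqrt(1*1 + 0.5*4 + 0.25*16) = sqrt(7)
weighted_euclidean(x, y)      # unit weights: ordinary Euclidean, sqrt(21)
minkowski(x, y, p = 1)        # Manhattan distance: 1 + 2 + 4 = 7
```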

Details

The pivots are initialized randomly for each new dimension, so the selection of pivots (and therefore the resulting projection) can be sensitive to the random seed for some datasets.

A custom distance function can be passed via distfun. If not provided, then this defaults to rowDistFun() if transpose=FALSE or colDistFun() if transpose=TRUE.

If a custom function is passed, it should take the form function(x, y, ...), and it must return a function of the form function(i). The returned function should return the distances between the ith object(s) in x and all objects in y. rowDistFun() and colDistFun() are examples of functions that satisfy these properties.
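A minimal sketch of a function with this shape, written in base R, is shown here. rowManhattanFun() is a hypothetical name, and the sketch has not been checked against fastmap() itself. How results for several indices should be arranged is not stated here, so returning one row per requested index is an assumption; compare with the output of rowDistFun() before relying on it.

```r
# Hypothetical distance functional: Manhattan distance between
# rows of x and rows of y. Not part of the package.
rowManhattanFun <- function(x, y, ...) {
	# The returned closure is the distance function of the form function(i)
	function(i) {
		# Assumed layout: one row per index in i, one column per row of y
		do.call(rbind, lapply(i, function(ii) {
			rowSums(abs(sweep(y, 2L, x[ii, ])))
		}))
	}
}

x <- matrix(c(0, 0,
              1, 1,
              3, 4), ncol = 2, byrow = TRUE)
d <- rowManhattanFun(x, x)
d(1L)          # distances from row 1 to rows 1, 2, 3: 0, 2, 7
d(c(1L, 3L))   # 2 x 3 matrix, one row per requested index
```

Under these assumptions, such a function would be supplied through the distfun argument in place of the default rowDistFun() or colDistFun().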

Value

An object of class fastmap, with the following components:

  • x: The projected variable matrix.

  • sdev: The standard deviations of each column of the projected matrix x.

  • pivots: A matrix giving the indices of the pivots and the distances between them.

  • pivot.array: A subset of the original data matrix containing only the pivots.

  • distfun: The function used to generate the distance function.

Author(s)

Kylie A. Bemis

References

C. Faloutsos and D. Lin. “FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets.” Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, pp. 163-174, June 1995.

See Also

cmdscale, prcomp

Examples

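library(matter)
library(BiocParallel)  # provides register() and SerialParam()

# Run serially and fix the seed, since the pivots are initialized randomly
register(SerialParam())
set.seed(1)

# Two 50 x 10 blocks: one sorted ascending, one sorted descending
a <- matrix(sort(runif(500)), nrow=50, ncol=10)
b <- matrix(rev(sort(runif(500))), nrow=50, ncol=10)
x <- cbind(a, b)

# Project the 50 rows of x onto 2 FastMap components
fm <- fastmap(x, k=2)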

kuwisdelu/matter documentation built on May 1, 2024, 5:17 a.m.