geometric_median: Geometric median


View source: R/outlier-dist.R

Description

Compute the geometric median, i.e. the point that minimizes the sum of all Euclidean distances to the observations (rows of U).
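A common way to approximate this point is a Weiszfeld-style iteration, which repeatedly replaces the current estimate with a mean of the rows weighted by the inverse of their distance to it. The following is a minimal sketch in base R under that assumption. This page does not confirm that geometric_median() is implemented this way, and the helper name weiszfeld_median is hypothetical. The tol and maxiter arguments mirror the documented ones, but the exact stopping rule used here (largest coordinate change below tol) is also an assumption.

```r
# Hypothetical helper, NOT the package's code: Weiszfeld-style iteration.
weiszfeld_median <- function(U, tol = 1e-10, maxiter = 1000) {
  U <- as.matrix(U)
  center <- colMeans(U)                      # start from the centroid
  for (iter in seq_len(maxiter)) {
    # Euclidean distance from every row of U to the current estimate
    d <- sqrt(rowSums(sweep(U, 2, center, "-")^2))
    d <- pmax(d, .Machine$double.eps)        # guard against division by zero
    w <- 1 / d                               # nearby rows get larger weights
    new_center <- colSums(U * w) / sum(w)    # distance-weighted mean of rows
    converged <- max(abs(new_center - center)) < tol  # assumed stopping rule
    center <- new_center
    if (converged) break
  }
  center
}

# Four corners of the unit square plus one distant outlier.
U <- rbind(c(0, 0), c(1, 0), c(0, 1), c(1, 1), c(100, 100))
colMeans(U)          # the mean is dragged far towards the outlier
weiszfeld_median(U)  # the geometric median stays inside the unit square
```

The toy data illustrate why the geometric median is useful as a robust center: a single extreme row shifts the mean substantially but barely moves the median.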

Usage

geometric_median(U, tol = 1e-10, maxiter = 1000, by_grp = NULL)

Arguments

U

A matrix (e.g. PC scores).

tol

Convergence criterion. Default is 1e-10.

maxiter

Maximum number of iterations. Default is 1000.

by_grp

An optional vector used to split the rows of U into groups; the geometric median is then computed separately for each group. Default is NULL (no grouping).

Value

The geometric median of all rows of U, as a vector of length ncol(U). If by_grp is provided, a matrix with one row per group, each row being the geometric median of that group.
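The two return shapes can be checked on a small toy matrix. This is only an illustrative sketch: the data and the names U, grp, med_all and med_grp are made up, the calls use the signature shown in the Usage section, and the block does nothing if bigutilsr is not installed.

```r
# Toy data (made up): 6 points in 2 dimensions, in two well-separated groups.
U <- rbind(c(0, 0), c(1, 0), c(0, 1),
           c(10, 10), c(11, 10), c(10, 11))
grp <- rep(1:2, each = 3)

# Only run when the package is available.
if (requireNamespace("bigutilsr", quietly = TRUE)) {
  med_all <- bigutilsr::geometric_median(U)
  med_grp <- bigutilsr::geometric_median(U, by_grp = grp)

  print(length(med_all))  # per the Value section, this should equal ncol(U)
  print(dim(med_grp))     # per the Value section, one row per group
}
```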

Examples

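library(bigutilsr)

# Test data shipped with the package; rows belong to three populations.
X <- readRDS(system.file("testdata", "three-pops.rds", package = "bigutilsr"))
pop <- rep(1:3, c(143, 167, 207))

# PC scores from a truncated SVD (first 5 components); svds() is from RSpectra.
svd <- RSpectra::svds(scale(X), k = 5)
U <- sweep(svd$u, 2, svd$d, '*')
plot(U, col = pop, pch = 20)

# Geometric median of all rows (large blue point).
med_all <- geometric_median(U)
points(t(med_all), pch = 20, col = "blue", cex = 4)

# Geometric median within each population (smaller blue points).
med_pop <- geometric_median(U, by_grp = pop)
points(med_pop, pch = 20, col = "blue", cex = 2)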

bigutilsr documentation built on April 14, 2021, 1:06 a.m.