bag: Calculates the bag

bag R Documentation

Calculates the bag

Description

Calculates the bag of a gemplot (i.e. the inner gemstone).

Usage

bag(D, G)

Arguments

D

Data set with rows representing the individuals and columns representing the features. In the case of three dimensions, the colnames of D must be c("x", "y", "z").

G

List containing the grid information produced by gridfun and the halfspace location depths calculated by hldepth.

Details

Determines the grid points that belong to the bag, i.e. a convex hull that contains 50 percent of the data. For a 3-dimensional data set, the bag can be visualized as an inner gemstone, which can be accompanied by an outer gemstone (the loop).
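The idea of a bag can be illustrated in two dimensions with base R alone. The following is a simplified sketch, not the grid-based procedure used by bag: it approximates the halfspace location depth of each observation over a finite set of directions (the helper approx_hldepth and all variable names are hypothetical), keeps the deepest half of the observations, and takes their convex hull.

```r
# Illustrative 2-D sketch only; this is NOT the package's grid-based algorithm.
set.seed(42)
n <- 100
pts <- cbind(x = rnorm(n), y = rnorm(n))

# Hypothetical helper: approximate halfspace (Tukey) depth of a point p,
# i.e. the smallest number of observations in a closed halfspace whose
# boundary passes through p, minimised over n_dir directions.
approx_hldepth <- function(p, pts, n_dir = 180) {
  theta <- seq(0, 2 * pi, length.out = n_dir + 1)[-1]
  U <- cbind(cos(theta), sin(theta))   # one unit direction per row
  proj <- sweep(pts, 2, p) %*% t(U)    # projections of (pts - p), n x n_dir
  min(colSums(proj >= 0))              # smallest halfspace count
}

depth <- vapply(seq_len(n),
                function(i) approx_hldepth(pts[i, ], pts),
                numeric(1))

# Keep the deepest 50 percent of the observations ...
k <- ceiling(n / 2)
deepest <- order(depth, decreasing = TRUE)[seq_len(k)]

# ... and take their convex hull as a rough stand-in for the bag.
hull_idx <- deepest[grDevices::chull(pts[deepest, ])]
bag_vertices <- pts[hull_idx, ]
```

Because of ties in depth and the finite number of directions, the resulting hull may enclose slightly more than half of the observations; the bagplot of Rousseeuw et al. (1999) interpolates between depth regions to address this.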

Value

A list containing the following elements:

coords

Coordinates of the grid points that belong to the bag. Each row represents a grid point and each column represents one dimension.

hull

A matrix containing the indices of the boundary grid points of the bag, arranged as triangles that together cover the convex hull. Each row represents one triangle. The indices refer to the rows of coords.
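How the two elements relate can be sketched with made-up data. The matrices below are hypothetical stand-ins for coords and hull (a tetrahedron), not output of bag; they only show how triangle indices select rows of the coordinate matrix.

```r
# Hypothetical stand-ins for the 'coords' and 'hull' elements: a tetrahedron.
coords <- rbind(c(0, 0, 0), c(1, 0, 0), c(0, 1, 0), c(0, 0, 1))
colnames(coords) <- c("x", "y", "z")
hull <- rbind(c(1, 2, 3), c(1, 2, 4), c(1, 3, 4), c(2, 3, 4))

# Vertices of the first triangle: the rows of coords selected by hull[1, ].
tri1 <- coords[hull[1, ], ]

# All triangle vertices stacked in order, three consecutive rows per triangle.
verts <- coords[as.vector(t(hull)), ]
```

Here tri1 is a 3 x 3 matrix with one vertex per row, and verts has three rows for each row of hull.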

Author(s)

Jochen Kruppa, Klaus Jung

References

Rousseeuw, P. J., Ruts, I., & Tukey, J. W. (1999). The bagplot: a bivariate boxplot. The American Statistician, 53(4), 382-387. doi:10.1080/00031305.1999.10474494

Kruppa, J., & Jung, K. (2017). Automated multigroup outlier identification in molecular high-throughput data using bagplots and gemplots. BMC Bioinformatics, 18(1), 1-10. https://link.springer.com/article/10.1186/s12859-017-1645-5

Examples

## Attention: the calculation is currently time-consuming.
## Remove the leading # symbols to run the examples.

## Two 3-dimensional example data sets D1 and D2
# n <- 200
# x1 <- rnorm(n, 0, 1)
# y1 <- rnorm(n, 0, 1)
# z1 <- rnorm(n, 0, 1)
# D1 <- data.frame(cbind(x1, y1, z1))
# x2 <- rnorm(n, 1, 1)
# y2 <- rnorm(n, 1, 1)
# z2 <- rnorm(n, 1, 1)
# D2 <- data.frame(cbind(x2, y2, z2))
# colnames(D1) <- c("x", "y", "z")
# colnames(D2) <- c("x", "y", "z")

## Placing outliers in D1 and D2
# D1[17, ] <- c(4, 5, 6)
# D2[99, ] <- -c(3, 4, 5)

## Grid size and graphic parameters
# grid.size <- 20
# red <- rgb(200, 100, 100, alpha = 100, maxColorValue = 255)
# blue <- rgb(100, 100, 200, alpha = 100, maxColorValue = 255)
# yel <- rgb(255, 255, 102, alpha = 100, maxColorValue = 255)
# white <- rgb(255, 255, 255, alpha = 100, maxColorValue = 255)
# require(rgl)
# material3d(color=c(red, blue, yel, white),
# alpha=c(0.5, 0.5, 0.5, 0.5), smooth=FALSE, specular="black")

## Calculation and visualization of gemplot for D1
# G <- gridfun(D1, grid.size=20)
# G$H <- hldepth(D1, G, verbose=TRUE)
# dm <- depmed(G)
# B <- bag(D1, G)
# L <- loop(D1, B, dm=dm)
# bg3d(color = "gray39" )
# points3d(D1[L$outliers==0,1], D1[L$outliers==0,2], D1[L$outliers==0,3], col="green")
# text3d(D1[L$outliers==1,1], D1[L$outliers==1,2],D1[L$outliers==1,3],
# as.character(which(L$outliers==1)), col=yel)
# spheres3d(dm[1], dm[2], dm[3], col=yel, radius=0.1)
# material3d(1,alpha=0.4)
# gem(B$coords, B$hull, red)
# gem(L$coords.loop, L$hull.loop, red)
# axes3d(col="white")

## Calculation and visualization of gemplot for D2
# G <- gridfun(D2, grid.size=20)
# G$H <- hldepth(D2, G, verbose=TRUE)
# dm <- depmed(G)
# B <- bag(D2, G)
# L <- loop(D2, B, dm=dm)
# points3d(D2[L$outliers==0,1], D2[L$outliers==0,2], D2[L$outliers==0,3], col="green")
# text3d(D2[L$outliers==1,1], D2[L$outliers==1,2],D2[L$outliers==1,3],
# as.character(which(L$outliers==1)), col=yel)
# spheres3d(dm[1], dm[2], dm[3], col=yel, radius=0.1)
# gem(B$coords, B$hull, blue)
# gem(L$coords.loop, L$hull.loop, blue)

RepeatedHighDim documentation built on July 9, 2023, 6:33 p.m.