DensityScatter.DDCAL: Scatter density plot [Brinkmann et al., 2023]
In ScatterDensity: Density Estimation and Visualization of 2D Scatter Plots

Description

Draws a scatter density plot in which the density of the points is estimated either with Pareto density estimation (PDE) [Ultsch, 2005] or with "SDH" [Eilers/Goeman, 2004]. The densities are then clustered with DDCAL [Lux/Rinderle-Ma, 2023] to assign the colors, as proposed by [Brinkmann et al., 2023].

Usage

DensityScatter.DDCAL(X, Y, nClusters = 12, Plotter = "native",
  SDHorPDE = TRUE, LimitShownPoints = FALSE,
  Marginals = FALSE, na.rm = TRUE, pch, Size,
  xlab = "x", ylab = "y", main = "", lwd = 2,
  xlim = NULL, ylim = NULL, Polygon, BW = TRUE, Silent = FALSE, ...)

Arguments

`X`	Numeric vector [1:n], first feature (for x axis values)
`Y`	Numeric vector [1:n], second feature (for y axis values)
`nClusters`	(Optional) Integer defining the number of clusters (colors) used for finding a hard color transition, default is 12.
`Plotter`	(Optional) String, name of the plotting backend to use. Possible values are: "`native`", "`plotly`", or "`ggplot2`". Default is "`native`"
`SDHorPDE`	(Optional) Boolean, if TRUE SDH is used to calculate the density, if FALSE PDE is used. Default is TRUE
`LimitShownPoints`	(Optional) Boolean, if TRUE only a subsample of the points, chosen for visualization with `SampleScatter`, is plotted; if FALSE all points are plotted. Default is FALSE
`Marginals`	(Optional) Boolean, if TRUE the marginal distributions of X and Y will be plotted together with the 2D density of X and Y. Default is FALSE
`na.rm`	(Optional) Boolean, if TRUE non-finite values will be removed. Default is TRUE
`pch`	(Optional) Scalar or character. Indicates the shape of data points, see `plot` function, `symbol` argument in plotly package, or the `shape` argument in ggplot2 package, default is `20` for `native` and for `ggplot2`, and `0` for `plotly`
`Size`	(Optional) Scalar, size of data points in plot, default is `1` for `native`, `6` for `plotly`, and `3` for `ggplot2`
`xlab`	(Optional) String, title of the x axis. Default: "x", see `plot()` function, or similar functionality in plotly or ggplot2
`ylab`	(Optional) String, title of the y axis. Default: "y", see `plot()` function, or similar functionality in plotly or ggplot2
`main`	(Optional) Character, title of the plot.
`lwd`	(Optional) Scalar, thickness of the lines used for the marginal distributions (only needed if `Marginals=TRUE`), see `plot()`. Default = 2
`xlim`	(Optional) Numeric vector, min and max of the x values to be plotted
`ylim`	(Optional) Numeric vector, min and max of the y values to be plotted
`Polygon`	(Optional) [1:p,1:2] numeric matrix of x and y coordinates defining a polygon, which is drawn in magenta
`BW`	(Optional) Boolean, only relevant for `Plotter="ggplot2"`: if TRUE a white background is used, if FALSE the typical ggplot2 background is used. Not needed if `Plotter="native"`. Default is TRUE
`Silent`	(Optional) Boolean, if TRUE no messages will be printed, default is FALSE
`...`	Further plot arguments
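
The requirement that `X` and `Y` are paired, together with `na.rm`, can be illustrated in base R. The sketch below is an assumption about what removing non-finite values amounts to for paired data, not the package's internal code: a case is dropped when either coordinate is NA, NaN, or infinite, so the remaining vectors stay aligned.

```r
# Toy paired data with some non-finite entries (illustrative values only)
x <- c(0.5, NA, 1.2, Inf, -0.3, 2.0)
y <- c(1.0, 0.7, NaN, 0.4, -1.1, 2.2)

# Keep a case only if both of its coordinates are finite
keep <- is.finite(x) & is.finite(y)
x_clean <- x[keep]
y_clean <- y[keep]

# The cleaned vectors are still paired case by case
xy <- cbind(x_clean, y_clean)
print(xy)
```

Cleaning the data this way before the call makes explicit which cases end up in the plot.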

Details

The DensityScatter.DDCAL function estimates the density of the xy data and uses it as a z coordinate. Afterwards, xyz is plotted as a contour plot. The function assumes that X and Y are paired case by case, i.e., X[i] and Y[i] describe the same observation, so that a cbind(X, Y) operation is valid. The colors for the densities in the contour plot are calculated with DDCAL, which clusters the density values such that they are distributed evenly over low-variance clusters.
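
The two steps described above (density per point, then a discrete color class per density) can be sketched in base R. This is only an illustration under loud assumptions: the density is approximated by counts on a 2D grid instead of SDH or PDE, and the color classes come from equal-frequency quantile bins, which is a simple stand-in and not the DDCAL algorithm. All variable names are hypothetical.

```r
set.seed(1)
x <- c(rnorm(500, mean = 0), rnorm(500, mean = 2.5))
y <- c(rnorm(500, mean = 0), rnorm(500, mean = 2.5))

# Step 1 (stand-in for SDH/PDE): count the points in each cell of a grid
nbins <- 30
bx <- cut(x, breaks = nbins, labels = FALSE)
by <- cut(y, breaks = nbins, labels = FALSE)
counts <- table(factor(bx, levels = seq_len(nbins)),
                factor(by, levels = seq_len(nbins)))
# Density value of each point = count of the cell it falls into
dens <- as.numeric(counts[cbind(bx, by)])

# Step 2 (stand-in for DDCAL): equal-frequency classes of the densities
nClusters <- 12
brks <- unique(quantile(dens, probs = seq(0, 1, length.out = nClusters + 1)))
cls <- cut(dens, breaks = brks, include.lowest = TRUE, labels = FALSE)

# One color per class gives hard color transitions between density levels
pal <- grDevices::hcl.colors(length(brks) - 1, "viridis")
cols <- pal[cls]

if (interactive()) plot(x, y, col = cols, pch = 20, xlab = "x", ylab = "y")
```

Because the classes are discrete, neighboring density levels receive visibly different colors, which is the effect that the DDCAL-based coloring aims for with better-balanced clusters.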

If Plotter is "native", the returned plot handle is NULL because the base R function plot() is used.

For details on the returned density values, see SmoothedDensitiesXY (SDH) or PDEscatter (PDE), depending on the input parameter SDHorPDE.

Value

Returns an invisible list with

`DF`	[1:m,1:5] of `Density` values, `x` values, `y` values, `colors`, and the classification vector `Cls`. m=n if `LimitShownPoints=FALSE`; if `LimitShownPoints=TRUE`, m<n because a subsample is taken
`PlotHandle`	the plotting handle, either an object of plotly, ggplot2 or NULL depending on input parameter `Plotter`
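
Because the list is returned invisibly, it has to be assigned in order to be inspected. The following is a minimal sketch that uses only the arguments documented above; it assumes that the ScatterDensity and ggplot2 packages are installed and skips the call otherwise. The column names of `DF` are not assumed here, so `str()` is used to look at them.

```r
set.seed(42)
x <- c(rnorm(2000, mean = 0), rnorm(2000, mean = 2.5))
y <- c(rnorm(2000, mean = 0), rnorm(2000, mean = 2.5))

if (requireNamespace("ScatterDensity", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  # Assign the invisible return value to keep it
  res <- ScatterDensity::DensityScatter.DDCAL(x, y, Plotter = "ggplot2",
                                              Silent = TRUE)
  # Inspect the per-point table of densities, coordinates, colors, classes
  str(res$DF)
  # With LimitShownPoints = FALSE, one row per input point is expected
  print(nrow(res$DF) == length(x))
  # A plot object for "ggplot2"; NULL would be returned for "native"
  print(class(res$PlotHandle))
}
```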

Author(s)

Luca Brinkmann, Michael Thrun

References

[Ultsch, 2005] Ultsch, A.: Pareto density estimation: A density estimation for knowledge discovery, In Baier, D. & Wernecke, K. D. (Eds.), Innovations in classification, data science, and information systems, (Vol. 27, pp. 91-100), Berlin, Germany, Springer, 2005.

[Eilers/Goeman, 2004] Eilers, P. H., & Goeman, J. J.: Enhancing scatterplots with smoothed densities, Bioinformatics, Vol. 20(5), pp. 623-628, 2004.

[Lux/Rinderle-Ma, 2023] Lux, M. & Rinderle-Ma, S.: DDCAL: Evenly Distributing Data into Low Variance Clusters Based on Iterative Feature Scaling, Journal of Classification, Vol. 40, pp. 106-144, 2023.

[Brinkmann et al., 2023] Brinkmann, L., Stier, Q., & Thrun, M. C.: Computing Sensitive Color Transitions for the Identification of Two-Dimensional Structures, Proc. Data Science, Statistics & Visualisation (DSSV) and the European Conference on Data Analysis (ECDA), p.109, Antwerp, Belgium, July 5-7, 2023.

Examples

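# Load the package that provides DensityScatter.DDCAL
library(ScatterDensity)

# Create two bimodal distributions: x and y each mix two Gaussians
x1 <- rnorm(n = 7500, mean = 0, sd = 1)
y1 <- rnorm(n = 7500, mean = 0, sd = 1)
x2 <- rnorm(n = 7500, mean = 2.5, sd = 1)
y2 <- rnorm(n = 7500, mean = 2.5, sd = 1)
x <- c(x1, x2)
y <- c(y1, y2)

# Scatter density plot with the marginal distributions of x and y
DensityScatter.DDCAL(x, y, Marginals = TRUE)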

ScatterDensity documentation built on April 15, 2025, 5:09 p.m.