View source: R/confidence_ellipse.R
confidence_ellipse | R Documentation |
Compute the coordinate points of confidence ellipses at a specified confidence level.
confidence_ellipse(
  .data,
  x,
  y,
  .group_by = NULL,
  conf_level = 0.95,
  robust = FALSE,
  distribution = "normal"
)
.data | data frame or tibble. |
x | column name for the x-axis variable. |
y | column name for the y-axis variable. |
.group_by | column name for the grouping variable (NULL by default). |
conf_level | confidence level for the ellipse (0.95 by default). |
robust | optional (FALSE by default). |
distribution | optional ("normal" by default). |
The function computes the coordinates of the confidence ellipse based
on the specified confidence level and the provided data. It can handle both classical
and robust estimation methods, and it supports grouping by a factor variable.
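As an illustration of what "computing the coordinates" involves, here is a minimal base-R sketch of the classical case for two variables. It is an assumption about the general technique (sample mean, sample covariance, chi-square quantile, eigen decomposition), not a copy of the package's internal code; the helper name ellipse_points and its n_points argument are hypothetical.

```r
# Hypothetical helper: trace the boundary of a classical confidence ellipse.
# A sketch of the textbook construction, not the package's implementation.
ellipse_points <- function(x, y, conf_level = 0.95, n_points = 361) {
  X <- cbind(x, y)
  center <- unname(colMeans(X))            # classical location estimate
  S <- stats::cov(X)                       # classical covariance estimate
  q <- stats::qchisq(conf_level, df = 2)   # chi-square quantile, 2 dimensions
  eig <- eigen(S, symmetric = TRUE)        # principal axes of the ellipse
  theta <- seq(0, 2 * pi, length.out = n_points)
  unit_circle <- rbind(cos(theta), sin(theta))
  # Scale each axis by sqrt(q * eigenvalue), rotate, then shift to the center
  pts <- eig$vectors %*% (sqrt(q * eig$values) * unit_circle)
  data.frame(x = pts[1, ] + center[1], y = pts[2, ] + center[2])
}

# Simulated data, used only for illustration
set.seed(1)
x <- rnorm(200)
y <- 0.6 * x + rnorm(200, sd = 0.5)
ell <- ellipse_points(x, y)

# Each boundary point lies at squared Mahalanobis distance q from the center
d2 <- stats::mahalanobis(as.matrix(ell), colMeans(cbind(x, y)), stats::cov(cbind(x, y)))
```

Every point returned lies at the same squared Mahalanobis distance from the sample mean, which is what makes the curve a constant-probability contour under the normal model.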
The distribution parameter controls the statistical approach used to compute the
ellipse. The "normal" option uses the chi-square distribution quantile, which is
appropriate for very large samples. The "hotelling" option instead uses the
Hotelling's T² distribution quantile. This approach accounts for the uncertainty in
estimating both the mean and the covariance from sample data, producing larger
ellipses that better reflect sampling uncertainty. It is statistically more rigorous
for smaller samples, where parameter estimation uncertainty is higher.
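To make the size difference concrete, the sketch below compares the two squared-radius multipliers for a bivariate ellipse at several sample sizes. It assumes the textbook forms, namely the chi-square quantile with p degrees of freedom and the Hotelling's T² quantile written as p(n - 1)/(n - p) times an F quantile; the package's internal computation may differ in detail.

```r
# Squared-radius multipliers for a bivariate (p = 2) ellipse.
# Textbook formulas, assumed here rather than taken from the package source.
p <- 2
conf_level <- 0.95
n <- c(10, 30, 100, 1000)

# "normal": chi-square quantile, independent of sample size
q_normal <- stats::qchisq(conf_level, df = p)

# "hotelling": scaled F quantile, which depends on sample size
q_hotelling <- (p * (n - 1) / (n - p)) * stats::qf(conf_level, df1 = p, df2 = n - p)

comparison <- data.frame(
  n = n,
  normal = q_normal,
  hotelling = q_hotelling,
  ratio = q_hotelling / q_normal
)
print(comparison)
```

The ratio column stays above 1 and shrinks toward 1 as n grows, so the Hotelling ellipse is always the larger one, and the two options become nearly indistinguishable for large samples.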
The combination of distribution = "hotelling" and robust = TRUE offers the most
conservative and statistically rigorous approach, particularly recommended for
exploratory data analysis and for datasets that may not meet ideal statistical
assumptions. For very large samples, the default settings (distribution = "normal",
robust = FALSE) may be sufficient, as the differences between methods diminish with
increasing sample size.
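A short usage sketch of the two settings discussed above, using only the arguments shown in the usage section and the glass dataset shipped with the package. The call is guarded so that it does nothing when ConfidenceEllipse is not installed, and the structure of the returned data frames is inspected rather than assumed.

```r
# Guarded so the snippet is a no-op when the package is not installed.
if (requireNamespace("ConfidenceEllipse", quietly = TRUE)) {
  data("glass", package = "ConfidenceEllipse")

  # Default settings: chi-square quantile with classical estimates
  ellipse_default <- ConfidenceEllipse::confidence_ellipse(
    .data = glass, x = SiO2, y = Na2O
  )

  # Conservative settings: Hotelling's T-squared quantile with robust estimates
  ellipse_conservative <- ConfidenceEllipse::confidence_ellipse(
    .data = glass, x = SiO2, y = Na2O,
    distribution = "hotelling", robust = TRUE
  )

  # Inspect the returned coordinates instead of assuming column names
  str(ellipse_default)
  str(ellipse_conservative)
}
```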
A data frame containing the coordinate points of the ellipse.
Christian L. Goueguel
Raymaekers, J., Rousseeuw, P. J. (2019). Fast robust correlation for high dimensional data. Technometrics, 63(2), 184–198.
Brereton, R. G. (2016). Hotelling’s T-squared distribution, its relationship to the F distribution and its use in multivariate space. Journal of Chemometrics, 30(1), 18–21.
# Data
data("glass", package = "ConfidenceEllipse")
# Confidence ellipse
ellipse <- confidence_ellipse(.data = glass, x = SiO2, y = Na2O)
ellipse_grp <- confidence_ellipse(
  .data = glass,
  x = SiO2,
  y = Na2O,
  .group_by = glassType
)