| plot_cdf | R Documentation |
Plots empirical cumulative distribution functions (ECDFs) separately for each unique value of a grouping variable, with support for vectorized plotting parameters. If no grouping variable is provided, plots a single ECDF.
plot_cdf(
formula,
y2 = NULL,
data = NULL,
order = NULL,
show.ks = TRUE,
show.quantiles = TRUE,
...
)
formula |
Two possible uses (similar to
|
y2 |
Optional second variable, used when comparing two variables directly (e.g., plot_cdf(y1, y2)). |
data |
An optional data frame containing the variables in the formula.
If |
order |
Controls the order in which groups appear in the plot and legend.
Use |
show.ks |
Logical. If TRUE (default), shows Kolmogorov-Smirnov test results when there are exactly 2 groups. If FALSE, KS test results are not displayed. |
show.quantiles |
Logical. If TRUE (default), shows horizontal lines and results at the 25th, 50th, and 75th percentiles when there are exactly 2 groups. If FALSE, quantile lines and results are not displayed. |
... |
Additional arguments passed to plotting functions. Can be single values
(applied to all groups) or vectors (applied element-wise to each group).
Common parameters include |
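As a rough illustration of the scalar-versus-vector behavior described for `...`, the base R sketch below expands a plotting parameter to one value per group. The helper name expand_param is hypothetical and is not part of plot_cdf; the function's actual internals may differ.

```r
# Hypothetical helper (not part of plot_cdf): expand a plotting parameter
# so that there is exactly one value per group.
expand_param <- function(value, n_groups) {
  if (length(value) == 1L) {
    rep(value, n_groups)   # scalar: same value for every group
  } else if (length(value) == n_groups) {
    value                  # vector: one value per group, in order
  } else {
    stop("parameter must have length 1 or one value per group")
  }
}

groups <- c("A", "B", "C")
col <- expand_param("blue", length(groups))       # "blue" for all three groups
lwd <- expand_param(c(1, 2, 3), length(groups))   # a different width per group
```

Under this assumption, col = "blue" and col = c("red", "green", "blue") both end up as length-3 vectors that can be indexed by group position.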
Invisibly returns a list containing:
ecdfs: A list of ECDF function objects, one per group. Each can be
called as a function to compute cumulative probabilities (e.g., result$ecdfs[[1]](5)
returns P(X <= 5) for group 1).
ks_test: (Only when exactly 2 groups) The Kolmogorov-Smirnov test result
comparing the two distributions. Access p-value with result$ks_test$p.value.
quantile_regression_25: (Only when exactly 2 groups) Quantile regression
model for the 25th percentile.
quantile_regression_50: (Only when exactly 2 groups) Quantile regression
model for the 50th percentile (median).
quantile_regression_75: (Only when exactly 2 groups) Quantile regression
model for the 75th percentile.
warnings: Warnings captured during execution, if any.
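To make the ecdfs and ks_test components more concrete, here is a minimal sketch that uses only base R's ecdf() and ks.test(). It assumes, without confirmation from this page, that the returned objects behave like the ones these base functions produce; the variable names are illustrative.

```r
set.seed(1)
a <- rnorm(50, mean = 0)
b <- rnorm(50, mean = 1)

# ecdf() returns a function; calling it gives the share of values <= x
ecdf_a <- ecdf(a)
p_a <- ecdf_a(0.5)   # equivalent to mean(a <= 0.5)

# Two-sample Kolmogorov-Smirnov test; the result is a list of class "htest"
ks <- ks.test(a, b)
ks$p.value           # p-value of the test
ks$statistic         # D: largest vertical gap between the two ECDFs
```

If the assumption holds, result$ecdfs[[1]] plays the role of ecdf_a above and result$ks_test the role of ks.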
# Basic usage with single variable (no grouping)
y <- rnorm(100)
plot_cdf(y)
# Basic usage with formula syntax and grouping
group <- rep(c("A", "B", "C"), c(30, 40, 30))
plot_cdf(y ~ group)
# With custom colors (scalar - same for all)
plot_cdf(y ~ group, col = "blue")
# With custom colors (vector - different for each group)
plot_cdf(y ~ group, col = c("red", "green", "blue"))
# Multiple parameters
plot_cdf(y ~ group, col = c("red", "green", "blue"), lwd = c(1, 2, 3))
# With line type and line width
plot_cdf(y ~ group, col = c("red", "green", "blue"), lty = c(1, 2, 3), lwd = 2)
# Using data frame
df <- data.frame(value = rnorm(100), group = rep(c("A", "B"), 50))
plot_cdf(value ~ group, data = df)
plot_cdf(value ~ group, data = df, col = c("red", "blue"))
# Compare two vectors
y1 <- rnorm(50)
y2 <- rnorm(50, mean = 1)
plot_cdf(y1, y2)
# Formula syntax without data (variables evaluated from environment)
widgetness <- rnorm(100)
gender <- rep(c("M", "F"), 50)
plot_cdf(widgetness ~ gender)
# Using the returned object
df <- data.frame(value = c(rnorm(50, 0), rnorm(50, 1)), group = rep(c("A", "B"), each = 50))
result <- plot_cdf(value ~ group, data = df)
# Use ECDF to find P(X <= 0.5) for group A
result$ecdfs[[1]](0.5)
# Access KS test p-value
result$ks_test$p.value
# Summarize median quantile regression
summary(result$quantile_regression_50)
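The percentile comparison that show.quantiles displays can be approximated by hand. The sketch below uses base R's quantile() on each group rather than quantile regression, so it is not how plot_cdf computes its results, and the numbers may differ slightly from the fitted models because sample-quantile definitions vary.

```r
set.seed(2)
df <- data.frame(
  value = c(rnorm(50, 0), rnorm(50, 1)),
  group = rep(c("A", "B"), each = 50)
)

probs <- c(0.25, 0.50, 0.75)

# One row per group, one column per percentile
q <- t(sapply(split(df$value, df$group), quantile, probs = probs))

# Difference between groups (B minus A) at each percentile
q_diff <- q["B", ] - q["A", ]
q_diff
```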