quantileplot | R Documentation |
Creates a bivariate, smooth quantile plot. This is the central function of the quantileplot package. This plot visualizes estimates of the marginal density of the predictor, the conditional density of the outcome at selected values of the predictor, and smooth curves showing quantiles of the outcome as smooth functions of the predictor. This package is described in greater depth by Lundberg, Lee, and Stewart (2021), which is a generalization of Lundberg and Stewart (2020). The statistical core of the package relies on the methods of Fasiolo et al. (2020).
quantileplot(
  formula,
  data,
  weights = NULL,
  quantiles = c(0.1, 0.25, 0.5, 0.75, 0.9),
  slice_n = 7,
  show_ci = FALSE,
  quantile_notation = "legend",
  xlab = NULL,
  ylab = NULL,
  x_data_range = NULL,
  y_data_range = NULL,
  x_axis_range = NULL,
  y_axis_range = NULL,
  x_breaks = NULL,
  y_breaks = NULL,
  x_labels = ggplot2::waiver(),
  y_labels = ggplot2::waiver(),
  x_bw = NULL,
  y_bw = NULL,
  truncation_notation = "label",
  credibility_level = 0.95,
  uncertainty_draws = NULL,
  inverse_transformation = NULL,
  granularity = 512,
  second_formula = NULL,
  argGam = NULL,
  previous_fit = NULL,
  ...
)
formula |
A bivariate model formula (e.g. |
data |
Data frame containing the variables in |
weights |
String name for sampling weights, which are a column of |
quantiles |
Numeric vector containing quantiles to be estimated. Values should be between 0 and 1. |
slice_n |
Integer number of vertical slices (conditional densities of y given x) to be plotted. Default is 7. |
show_ci |
Logical, defaults to |
quantile_notation |
String, either |
xlab |
String x-axis title |
ylab |
String y-axis title |
x_data_range |
Numeric vector of length 2 containing the range of horizontal values to be plotted. Defaults to the range of the predictor variable in |
y_data_range |
Numeric vector of length 2 containing the range of vertical values to be plotted. Defaults to the range of the outcome variable in |
x_axis_range |
Numeric vector of length 2 for custom x-axis limits. This affects the plotting area but does not affect the data analyzed or displayed. To truncate the data, use |
y_axis_range |
Numeric vector of length 2 for custom y-axis limits. This affects the plotting area but does not affect the data analyzed or displayed. To truncate the data, use |
x_breaks |
Numeric vector of values for x-axis breaks. Alternatively, customize after producing the plot by modifying the resulting |
y_breaks |
Numeric vector of values for y-axis breaks. Alternatively, customize after producing the plot by modifying the resulting |
x_labels |
Vector of |
y_labels |
Vector of |
x_bw |
Numeric bandwidth for density estimation in the |
y_bw |
Numeric bandwidth for density estimation in the |
truncation_notation |
String, one of |
credibility_level |
Numeric probability value for credible intervals; defaults to 0.95 to produce 95 percent credible intervals. Only relevant if |
uncertainty_draws |
A whole number. If non-null, the number of simulated posterior draws to estimate for each smooth quantile curve. When used with the |
inverse_transformation |
A function of a scalar argument. Only used in the rare use case where the outcome has an extremely skewed distribution and the user wants to estimate the quantile curves on a transformed outcome, to be brought back to the original scale for the visualization. In that case, this argument is the function to convert from the transformed outcome back to the original scale. For instance, if the outcome in the model formula is |
granularity |
Integer number of points at which to evaluate each density. Defaults to 512, as in |
second_formula |
Model formula to allow the learning rate to change as a function of the predictor. This is passed to |
argGam |
Additional arguments to the GAM for model fitting. Passed to mqgam. |
previous_fit |
The result of a previous call to |
... |
Other arguments passed to |
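As a rough illustration of how a few of the arguments above fit together, the sketch below simulates data with the same recipe as the package example and then requests three quantile curves and five conditional-density slices with custom axis titles. The object names `sim` and `out` and the axis titles are hypothetical, and the call is guarded so that it only runs when the quantileplot package is installed.

```r
# Simulated data, following the recipe in the package example
set.seed(1)
x <- rbeta(1000, 1, 2)
y <- log(1 + 9 * x) * rbeta(1000, 1, 2)
sim <- data.frame(x = x, y = y)  # 'sim' is a hypothetical name

out <- NULL
if (requireNamespace("quantileplot", quietly = TRUE)) {
  library(quantileplot)
  out <- quantileplot(
    y ~ s(x),
    data = sim,
    quantiles = c(0.25, 0.5, 0.75),  # three curves instead of the default five
    slice_n = 5,                     # five vertical conditional-density slices
    xlab = "Predictor",              # hypothetical axis titles
    ylab = "Outcome"
  )
}
```

Arguments left unspecified keep the defaults shown in the usage section.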
An object of S3 class quantileplot, which supports summary(), print(), and plot() functions. The returned object has several elements.
plot is a ggplot2 object. This contains the most basic plot. The user can customize this output by passing additional layers to quantileplot.out$plot as they would for any ggplot2 object.
sim_curve_plots is a list of ggplot2 objects, one for each quantile curve, each showing the point estimate for the curve in black and a series of simulated posterior samples in gray.
densities is a list of length four. marginal and conditional are data frames containing the estimated marginal and conditional densities. x_bw and y_bw are the bandwidths used for Gaussian kernel density estimation.
curves is a data frame containing the estimated quantile curves.
mqgam.out is the output from the call to the mqgam function in the qgam package, which is used to estimate the quantile curves.
x_data_range and y_data_range are the horizontal and vertical ranges of the plot.
slice_x_values are the predictor values at which vertical conditional densities are estimated.
call is the user's call that produced these results.
arguments is a list of all the arguments to the function, including those specified by the user and those specified by defaults.
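The elements listed above can be inspected with ordinary list access. The following is a minimal sketch, assuming the quantileplot and ggplot2 packages are installed; the names `sim`, `fit`, and `titled_plot` and the plot title are hypothetical, and the element names are taken from the description above.

```r
# Simulated data, following the recipe in the package example
set.seed(2)
x <- rbeta(1000, 1, 2)
y <- log(1 + 9 * x) * rbeta(1000, 1, 2)
sim <- data.frame(x = x, y = y)  # 'sim' is a hypothetical name

fit <- NULL
if (requireNamespace("quantileplot", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  library(quantileplot)
  fit <- quantileplot(y ~ s(x), sim)

  # Customize the basic plot by adding a ggplot2 layer (hypothetical title)
  titled_plot <- fit$plot + ggplot2::ggtitle("Smooth quantile plot")

  # Inspect the estimated quantile curves and the bandwidths that were used
  print(head(fit$curves))
  print(fit$densities$x_bw)
  print(fit$densities$y_bw)

  # Predictor values at which the conditional densities were estimated
  print(fit$slice_x_values)
}
```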
Lundberg, Ian, Robin C. Lee, and Brandon M. Stewart. 2021. "The quantile plot: A visualization for bivariate population relationships." Working paper.
Lundberg, Ian, and Brandon M. Stewart. 2020. "Comment: Summarizing income mobility with multiple smooth quantiles instead of parameterized means." Sociological Methodology 50(1):96-111.
Fasiolo, Matteo, Simon N. Wood, Margaux Zaffran, Raphaël Nedellec, and Yannig Goude. 2020. "Fast calibrated additive quantile regression." Journal of the American Statistical Association.
x <- rbeta(1000, 1, 2)
y <- log(1 + 9 * x) * rbeta(1000, 1, 2)
data <- data.frame(x = x, y = y)
quantileplot(y ~ s(x), data)
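The inverse_transformation argument is described as a function that converts a transformed outcome back to its original scale. The sketch below shows one way this might be used with a right-skewed outcome: the quantile curves are fit to a log-transformed outcome and exp() maps them back for display. The data, the column name `income`, and the choice of a log transformation are all assumptions made for illustration, and the exact behavior should be checked against the package before relying on it.

```r
# Hypothetical right-skewed outcome; 'income' is an illustrative name
set.seed(3)
x <- rbeta(1000, 1, 2)
income <- exp(1 + x + rnorm(1000))
skewed <- data.frame(x = x, income = income)

# Function mapping the transformed (log) scale back to the original scale
back_to_original <- function(z) exp(z)

skewed_fit <- NULL
if (requireNamespace("quantileplot", quietly = TRUE)) {
  library(quantileplot)
  # Assumption: the transformation is written directly in the model formula
  skewed_fit <- quantileplot(
    log(income) ~ s(x),
    data = skewed,
    inverse_transformation = back_to_original
  )
}
```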