estimate_density: Density Estimation

View source: R/estimate_density.R

estimate_densityR Documentation

Density Estimation

Description

This function is a wrapper over different methods of density estimation. By default, it uses the base R density with by default uses a different smoothing bandwidth ("SJ") from the legacy default implemented the base R density function ("nrd0"). However, Deng and Wickham suggest that method = "KernSmooth" is the fastest and the most accurate.

Usage

estimate_density(x, ...)

## S3 method for class 'data.frame'
estimate_density(
  x,
  method = "kernel",
  precision = 2^10,
  extend = FALSE,
  extend_scale = 0.1,
  bw = "SJ",
  ci = NULL,
  select = NULL,
  by = NULL,
  rvar_col = NULL,
  ...
)

## S3 method for class 'brmsfit'
estimate_density(
  x,
  method = "kernel",
  precision = 2^10,
  extend = FALSE,
  extend_scale = 0.1,
  bw = "SJ",
  effects = "fixed",
  component = "conditional",
  parameters = NULL,
  ...
)

Arguments

x

Vector representing a posterior distribution, or a data frame of such vectors. Can also be a Bayesian model. bayestestR supports a wide range of models (see, for example, methods("hdi")) and not all of those are documented in the 'Usage' section, because methods for other classes mostly resemble the arguments of the .numeric or .data.framemethods.

...

Currently not used.

method

Density estimation method. Can be "kernel" (default), "logspline" or "KernSmooth".

precision

Number of points of density data. See the n parameter in density.

extend

Extend the range of the x axis by a factor of extend_scale.

extend_scale

Ratio of range by which to extend the x axis. A value of 0.1 means that the x axis will be extended by 1/10 of the range of the data.

bw

See the eponymous argument in density. Here, the default has been changed for "SJ", which is recommended.

ci

The confidence interval threshold. Only used when method = "kernel". This feature is experimental, use with caution.

select

Character vector of column names. If NULL (the default), all numeric variables will be selected. Other arguments from datawizard::extract_column_names() (such as exclude) can also be used.

by

Optional character vector. If not NULL and input is a data frame, density estimation is performed for each group (subsets) indicated by by. See examples.

rvar_col

A single character - the name of an rvar column in the data frame to be processed. See example in p_direction().

effects

Should results for fixed effects ("fixed", the default), random effects ("random") or both ("⁠all"⁠) be returned? Only applies to mixed models. May be abbreviated.

component

Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, etc. See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component - names may differ, depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):

  • component = "all" returns all possible parameters.

  • If component = "location", location parameters such as conditional, zero_inflated, smooth_terms, or instruments are returned (everything that are fixed or random effects - depending on the effects argument - but no auxiliary parameters).

  • For component = "distributional" (or "auxiliary"), components like sigma, dispersion, beta or precision (and other auxiliary parameters) are returned.

parameters

Regular expression pattern that describes the parameters that should be returned. Meta-parameters (like lp__ or prior_) are filtered by default, so only parameters that typically appear in the summary() are returned. Use parameters to select specific parameters for the output.

Model components

Possible values for the component argument depend on the model class. Following are valid options:

  • "all": returns all model components, applies to all models, but will only have an effect for models with more than just the conditional model component.

  • "conditional": only returns the conditional component, i.e. "fixed effects" terms from the model. Will only have an effect for models with more than just the conditional model component.

  • "smooth_terms": returns smooth terms, only applies to GAMs (or similar models that may contain smooth terms).

  • "zero_inflated" (or "zi"): returns the zero-inflation component.

  • "location": returns location parameters such as conditional, zero_inflated, or smooth_terms (everything that are fixed or random effects - depending on the effects argument - but no auxiliary parameters).

  • "distributional" (or "auxiliary"): components like sigma, dispersion, beta or precision (and other auxiliary parameters) are returned.

For models of class brmsfit (package brms), even more options are possible for the component argument, which are not all documented in detail here. See also ?insight::find_parameters.

Note

There is also a plot()-method implemented in the see-package.

References

Deng, H., & Wickham, H. (2011). Density estimation in R. Electronic publication.

Examples


library(bayestestR)

set.seed(1)
x <- rnorm(250, mean = 1)

# Basic usage
density_kernel <- estimate_density(x) # default method is "kernel"

hist(x, prob = TRUE)
lines(density_kernel$x, density_kernel$y, col = "black", lwd = 2)
lines(density_kernel$x, density_kernel$CI_low, col = "gray", lty = 2)
lines(density_kernel$x, density_kernel$CI_high, col = "gray", lty = 2)
legend("topright",
  legend = c("Estimate", "95% CI"),
  col = c("black", "gray"), lwd = 2, lty = c(1, 2)
)

# Other Methods
density_logspline <- estimate_density(x, method = "logspline")
density_KernSmooth <- estimate_density(x, method = "KernSmooth")
density_mixture <- estimate_density(x, method = "mixture")

hist(x, prob = TRUE)
lines(density_kernel$x, density_kernel$y, col = "black", lwd = 2)
lines(density_logspline$x, density_logspline$y, col = "red", lwd = 2)
lines(density_KernSmooth$x, density_KernSmooth$y, col = "blue", lwd = 2)
lines(density_mixture$x, density_mixture$y, col = "green", lwd = 2)

# Extension
density_extended <- estimate_density(x, extend = TRUE)
density_default <- estimate_density(x, extend = FALSE)

hist(x, prob = TRUE)
lines(density_extended$x, density_extended$y, col = "red", lwd = 3)
lines(density_default$x, density_default$y, col = "black", lwd = 3)

# Multiple columns
head(estimate_density(iris))
head(estimate_density(iris, select = "Sepal.Width"))

# Grouped data
head(estimate_density(iris, by = "Species"))
head(estimate_density(iris$Petal.Width, by = iris$Species))

# rstanarm models
# -----------------------------------------------
library(rstanarm)
model <- suppressWarnings(
  stan_glm(mpg ~ wt + gear, data = mtcars, chains = 2, iter = 200, refresh = 0)
)
head(estimate_density(model))

library(emmeans)
head(estimate_density(emtrends(model, ~1, "wt", data = mtcars)))

# brms models
# -----------------------------------------------
library(brms)
model <- brms::brm(mpg ~ wt + cyl, data = mtcars)
estimate_density(model)



DominiqueMakowski/bayestestR documentation built on March 29, 2025, 4:17 p.m.