dist_sample: Sampling distribution

View source: R/dist_sample.R

dist_sample    R Documentation

Sampling distribution

Description

[Stable]

The sampling distribution represents an empirical distribution based on observed samples. It is useful for bootstrapping, representing posterior distributions from Markov Chain Monte Carlo (MCMC) algorithms, or working with any empirical data where the parametric form is unknown. Unlike parametric distributions, the sampling distribution makes no assumptions about the underlying data-generating process and instead uses the sample itself to estimate distributional properties. The distribution can handle both univariate and multivariate samples.

Usage

dist_sample(x)

Arguments

x

A list of sampled values. For univariate distributions, each element should be a numeric vector. For multivariate distributions, each element should be a matrix where columns represent variables and rows represent observations.

Details

We recommend reading this documentation on pkgdown, which renders the math nicely: https://pkg.mitchelloharawild.com/distributional/reference/dist_sample.html

In the following, let X be a random variable with sample x_1, x_2, \ldots, x_n of size n.

Support: The observed range of the sample

Mean (univariate):

\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i

Mean (multivariate): Computed independently for each variable.

Variance (univariate):

s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2

Covariance (multivariate): The sample covariance matrix.
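
As a quick check of the mean, variance, and covariance formulas above, the following base R sketch computes them by hand and compares them with the corresponding stats functions. It does not call the package itself; the sample values and the names x and m are illustrative assumptions.

```r
# Illustrative univariate sample (values are arbitrary)
x <- c(2.1, 3.4, 1.9, 5.0, 4.2)
n <- length(x)

# Mean and unbiased variance, written out as in the formulas above
x_bar <- sum(x) / n
s2 <- sum((x - x_bar)^2) / (n - 1)

# Both agree with the base R implementations
stopifnot(isTRUE(all.equal(x_bar, mean(x))), isTRUE(all.equal(s2, var(x))))

# Illustrative bivariate sample: rows are observations, columns are variables
m <- cbind(a = x, b = c(10.5, 9.8, 11.2, 10.1, 9.9))
col_means <- colMeans(m)   # mean computed independently for each variable
cov_mat <- cov(m)          # sample covariance matrix (n - 1 denominator)
```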

Skewness (univariate):

g_1 = \frac{\sqrt{n} \sum_{i=1}^{n} (x_i - \bar{x})^3}{\left(\sum_{i=1}^{n} (x_i - \bar{x})^2\right)^{3/2}} \left(1 - \frac{1}{n}\right)^{3/2}
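
The skewness formula can be transcribed directly into base R. This is a minimal sketch of the expression above, not the package's source; the helper name sample_skewness is hypothetical.

```r
# Skewness as written in the formula above (hypothetical helper name)
sample_skewness <- function(x) {
  n <- length(x)
  d <- x - mean(x)   # deviations from the sample mean
  sqrt(n) * sum(d^3) / sum(d^2)^(3 / 2) * (1 - 1 / n)^(3 / 2)
}

sample_skewness(c(1, 2, 3, 4, 5))    # symmetric sample: 0
sample_skewness(c(1, 1, 1, 2, 10))   # long right tail: positive
```

Algebraically, this expression equals the third central moment (with denominator n) divided by the cube of the sample standard deviation (with denominator n - 1).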

Probability density function: Approximated numerically using kernel density estimation.
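
One way to evaluate a kernel density estimate at chosen points is to estimate it on a grid and interpolate. The sketch below uses stats::density() with its default Gaussian kernel and bandwidth. Those defaults, and the interpolation step, are assumptions made for illustration rather than a description of what the package does internally.

```r
set.seed(1)
x <- rnorm(200)   # illustrative sample

# Kernel density estimate on a grid (default kernel and bandwidth)
kde <- stats::density(x)

# Linearly interpolate the estimate at the points of interest
at <- c(-1, 0, 1)
dens_at <- stats::approx(kde$x, kde$y, xout = at)$y
dens_at
```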

Cumulative distribution function (univariate):

F(q) = \frac{1}{n} \sum_{i=1}^{n} I(x_i \leq q)

where I(\cdot) is the indicator function.
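
The univariate CDF above is the proportion of sample values at or below q. A minimal base R sketch, with an illustrative sample and a hypothetical helper name emp_cdf:

```r
x <- c(3, 1, 4, 1, 5, 9, 2, 6)   # illustrative sample

# Empirical CDF: proportion of sample values less than or equal to q
emp_cdf <- function(q, x) mean(x <= q)

emp_cdf(4, x)       # 5 of the 8 values are <= 4, so 0.625
stats::ecdf(x)(4)   # base R equivalent, also 0.625
```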

Cumulative distribution function (multivariate):

F(\mathbf{q}) = \frac{1}{n} \sum_{i=1}^{n} I(\mathbf{x}_i \leq \mathbf{q})

where the inequality is applied element-wise.
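
For the multivariate case, an observation contributes to the sum only when every coordinate is at or below the matching element of q. The following base R sketch illustrates this with a small, made-up bivariate sample; the names m, q, and below are assumptions for illustration.

```r
# Illustrative bivariate sample: rows are observations, columns are variables
m <- rbind(c( 0.1,  9.5),
           c( 0.5,  8.0),
           c(-1.2, 10.4),
           c( 0.2,  8.9))
q <- c(0.3, 9)

# TRUE for rows where all coordinates are <= the corresponding element of q
below <- apply(m, 1, function(row) all(row <= q))

joint_cdf <- mean(below)
joint_cdf   # only the row (0.2, 8.9) qualifies, so 0.25
```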

Quantile function (univariate): The sample quantile, computed using the specified quantile type (see stats::quantile()).

Quantile function (multivariate): Marginal quantiles are computed independently for each variable.
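
The two quantile definitions above can be sketched with stats::quantile() alone. The sample data and variable names are illustrative assumptions, and the code shows base R behaviour rather than the package's own method.

```r
set.seed(1)
x <- rnorm(100)                         # illustrative univariate sample
m <- cbind(x = x, y = rnorm(100, 10))   # illustrative bivariate sample

# Univariate: sample quantile; `type` selects one of the nine algorithms
q_default <- stats::quantile(x, probs = 0.4, names = FALSE)   # type = 7
q_type1 <- stats::quantile(x, probs = 0.4, type = 1, names = FALSE)

# Multivariate: one quantile per column, ignoring dependence between variables
q_marginal <- apply(m, 2, stats::quantile, probs = 0.4, names = FALSE)
```

Type 1 inverts the empirical CDF and therefore always returns an observed value, whereas the default type 7 interpolates between order statistics.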

Random generation: Bootstrap sampling with replacement from the empirical sample.
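
Bootstrap sampling with replacement can be sketched in base R as follows. The data are illustrative. Resampling whole rows in the matrix case is an assumption made here so that variables stay paired within each draw.

```r
set.seed(42)
x <- c(2.1, 3.4, 1.9, 5.0, 4.2)   # illustrative observed sample

# Bootstrap draw: 10 values sampled from x with replacement
draws <- sample(x, size = 10, replace = TRUE)

# Matrix sample: resample row indices so each draw keeps its variables together
m <- cbind(a = x, b = c(10.5, 9.8, 11.2, 10.1, 9.9))
idx <- sample(nrow(m), size = 10, replace = TRUE)
row_draws <- m[idx, , drop = FALSE]
```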

See Also

stats::density(), stats::quantile(), stats::cov()

Examples

library(distributional)

# Univariate numeric samples
dist <- dist_sample(x = list(rnorm(100), rnorm(100, 10)))

dist
mean(dist)
variance(dist)
skewness(dist)
generate(dist, 10)

density(dist, 1)

# Multivariate numeric samples
dist <- dist_sample(x = list(cbind(rnorm(100), rnorm(100, 10))))
dimnames(dist) <- c("x", "y")

dist
mean(dist)
variance(dist)
generate(dist, 10)
quantile(dist, 0.4) # Returns the marginal quantiles
cdf(dist, matrix(c(0.3, 9), nrow = 1))


distributional documentation built on June 11, 2026, 9:07 a.m.