| dist_sample | R Documentation |
The sampling distribution represents an empirical distribution based on observed samples. It is useful for bootstrapping, representing posterior distributions from Markov Chain Monte Carlo (MCMC) algorithms, or working with any empirical data where the parametric form is unknown. Unlike parametric distributions, the sampling distribution makes no assumptions about the underlying data-generating process and instead uses the sample itself to estimate distributional properties. The distribution can handle both univariate and multivariate samples.
dist_sample(x)
| x | A list of sampled values. For univariate distributions, each element should be a numeric vector. For multivariate distributions, each element should be a matrix where columns represent variables and rows represent observations. |
We recommend reading this documentation on pkgdown, which renders the math nicely: https://pkg.mitchelloharawild.com/distributional/reference/dist_sample.html
In the following, let X be a random variable with sample x_1, x_2, \ldots, x_n of size n.
Support: The observed range of the sample
Mean (univariate):
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
Mean (multivariate): Computed independently for each variable.
Variance (univariate):
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
Covariance (multivariate): The sample covariance matrix.
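The mean, variance, and covariance formulas above can be checked by hand against base R. The following is a minimal sketch using only base R; the samples `x` and `m` are hypothetical data made up for illustration, and the code is not the package's internal implementation.

```r
set.seed(1)

# Hypothetical univariate sample
x <- rnorm(50)
n <- length(x)

x_bar <- sum(x) / n                    # sample mean
s2 <- sum((x - x_bar)^2) / (n - 1)     # sample variance with the n - 1 divisor

all.equal(x_bar, mean(x))
all.equal(s2, var(x))

# Hypothetical bivariate sample: rows are observations, columns are variables
m <- cbind(a = rnorm(50), b = rnorm(50, 10))

col_means <- colMeans(m)               # mean computed separately per variable
centered <- sweep(m, 2, col_means)     # subtract each column's mean
S <- crossprod(centered) / (nrow(m) - 1)

all.equal(S, cov(m))
```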
Skewness (univariate):
g_1 = \frac{\sqrt{n} \sum_{i=1}^{n} (x_i - \bar{x})^3}{\left(\sum_{i=1}^{n} (x_i - \bar{x})^2\right)^{3/2}} \left(1 - \frac{1}{n}\right)^{3/2}
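As a rough illustration, the skewness formula above can be transcribed directly into base R. The helper name `sample_skewness` is hypothetical and is not part of the package; the sketch only mirrors the formula as written.

```r
# Hypothetical helper: a direct transcription of the skewness formula
sample_skewness <- function(x) {
  n <- length(x)
  d <- x - mean(x)                           # deviations from the sample mean
  g <- sqrt(n) * sum(d^3) / sum(d^2)^(3 / 2)
  g * (1 - 1 / n)^(3 / 2)                    # small-sample factor from the formula
}

sample_skewness(c(1, 2, 3, 4, 5))   # symmetric sample, so the result is 0
sample_skewness(c(1, 1, 1, 10))     # long right tail, so the result is positive
```

Algebraically, this quantity equals the third central moment (divisor n) divided by the cube of the sample standard deviation (divisor n - 1).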
Probability density function: Approximated numerically using kernel density estimation.
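One way to obtain a density value at a single point from a kernel density estimate is to interpolate the grid returned by stats::density(). The sketch below uses only base R; whether the package evaluates the density in exactly this way is an assumption, and the sample `x` is hypothetical.

```r
set.seed(1)
x <- rnorm(200)                      # hypothetical sample

kde <- stats::density(x)             # Gaussian kernel and default bandwidth

# Linearly interpolate the estimated density curve at the point of interest
dens_at_zero <- stats::approx(kde$x, kde$y, xout = 0)$y
dens_at_zero
```

The result depends on the kernel and bandwidth, so it is an approximation rather than an exact property of the sample.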
Cumulative distribution function (univariate):
F(q) = \frac{1}{n} \sum_{i=1}^{n} I(x_i \leq q)
where I(\cdot) is the indicator function.
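The univariate empirical CDF is simply the proportion of sampled values at or below q. A minimal base R sketch follows; `emp_cdf` is a hypothetical helper name used only for illustration.

```r
x <- c(2, 4, 4, 7, 9)                  # hypothetical sample

# Proportion of observations less than or equal to q
emp_cdf <- function(q, x) mean(x <= q)

emp_cdf(4, x)          # 3 of the 5 values are <= 4, giving 0.6
stats::ecdf(x)(4)      # base R's empirical CDF gives the same value
```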
Cumulative distribution function (multivariate):
F(\mathbf{q}) = \frac{1}{n} \sum_{i=1}^{n} I(\mathbf{x}_i \leq \mathbf{q})
where the inequality is applied element-wise.
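For the multivariate case, an observation counts only if every one of its coordinates is at or below the corresponding coordinate of q. The sketch below shows this in base R; `emp_cdf_mv` and the matrix `m` are hypothetical and not taken from the package.

```r
# Hypothetical sample: 4 observations (rows) of 2 variables (columns)
m <- rbind(c(0, 9), c(1, 11), c(-1, 10), c(2, 8))

emp_cdf_mv <- function(q, m) {
  below <- sweep(m, 2, q, FUN = "<=")   # element-wise comparison, column by column
  mean(apply(below, 1, all))            # share of rows where all coordinates pass
}

emp_cdf_mv(c(1, 10), m)   # rows 1 and 3 qualify, giving 0.5
```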
Quantile function (univariate): The sample quantile, computed using
the specified quantile type (see stats::quantile()).
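The quantile type matters most for small samples. The sketch below calls stats::quantile() directly on a hypothetical sample to compare two of its types; type 7 is the default of stats::quantile(), and no claim is made here about which type dist_sample uses unless one is specified.

```r
x <- c(1, 2, 3, 4, 10)   # hypothetical sample

# Type 7: linear interpolation between order statistics
q7 <- stats::quantile(x, probs = 0.4, type = 7, names = FALSE)

# Type 1: inverse of the empirical CDF, always returns an observed value
q1 <- stats::quantile(x, probs = 0.4, type = 1, names = FALSE)

q7   # 2.6
q1   # 2
```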
Quantile function (multivariate): Marginal quantiles are computed independently for each variable.
Random generation: Bootstrap sampling with replacement from the empirical sample.
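Bootstrap sampling with replacement can be sketched with base R's sample(). The sample `x` here is hypothetical, and the snippet illustrates the idea rather than the package's internals.

```r
set.seed(42)
x <- rnorm(100)                                  # hypothetical observed sample

# Draw 10 values from the observed sample, allowing repeats
draws <- sample(x, size = 10, replace = TRUE)

length(draws)
all(draws %in% x)   # every draw is one of the observed values
```

Because draws come only from the observed values, generated data never falls outside the observed range of the sample.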
stats::density(), stats::quantile(), stats::cov()
library(distributional)

# Univariate numeric samples
dist <- dist_sample(x = list(rnorm(100), rnorm(100, 10)))
dist
mean(dist)
variance(dist)
skewness(dist)
generate(dist, 10)
density(dist, 1)
# Multivariate numeric samples
dist <- dist_sample(x = list(cbind(rnorm(100), rnorm(100, 10))))
dimnames(dist) <- c("x", "y")
dist
mean(dist)
variance(dist)
generate(dist, 10)
quantile(dist, 0.4) # Returns the marginal quantiles
cdf(dist, matrix(c(0.3, 9), nrow = 1))