# describe_distribution: Describe a distribution

In datawizard: Easy Data Wrangling

## Description

This function describes a distribution by a set of indices (e.g., measures of centrality, dispersion, range, skewness, kurtosis).

## Usage

```r
describe_distribution(x, ...)

## S3 method for class 'numeric'
describe_distribution(
  x,
  centrality = "mean",
  dispersion = TRUE,
  iqr = TRUE,
  range = TRUE,
  quartiles = FALSE,
  ci = NULL,
  iterations = 100,
  threshold = 0.1,
  verbose = TRUE,
  ...
)

## S3 method for class 'factor'
describe_distribution(x, dispersion = TRUE, range = TRUE, verbose = TRUE, ...)

## S3 method for class 'data.frame'
describe_distribution(
  x,
  centrality = "mean",
  dispersion = TRUE,
  iqr = TRUE,
  range = TRUE,
  quartiles = FALSE,
  include_factors = FALSE,
  ci = NULL,
  iterations = 100,
  threshold = 0.1,
  verbose = TRUE,
  ...
)
```

## Arguments

| Argument | Description |
|---|---|
| `x` | A numeric vector. |
| `...` | Additional arguments to be passed to or from methods. |
| `centrality` | The point-estimates (centrality indices) to compute. Character (vector) or list with one or more of these options: `"median"`, `"mean"`, `"MAP"` or `"all"`. |
| `dispersion` | Logical, if `TRUE`, computes indices of dispersion related to the estimate(s) (`SD` and `MAD` for `mean` and `median`, respectively). |
| `iqr` | Logical, if `TRUE`, the interquartile range is calculated (based on `stats::IQR()`, using `type = 6`). |
| `range` | Return the range (min and max). |
| `quartiles` | Return the first and third quartiles (25th and 75th percentiles). |
| `ci` | Confidence Interval (CI) level. Default is `NULL`, i.e. no confidence intervals are computed. If not `NULL`, confidence intervals are based on bootstrap replicates (see `iterations`). If `centrality = "all"`, the bootstrapped confidence interval refers to the first centrality index (which is typically the median). |
| `iterations` | The number of bootstrap replicates for computing confidence intervals. Only applies when `ci` is not `NULL`. |
| `threshold` | For `centrality = "trimmed"` (i.e. trimmed mean), indicates the fraction (0 to 0.5) of observations to be trimmed from each end of the vector before the mean is computed. |
| `verbose` | Toggle warnings and messages. |
| `include_factors` | Logical, if `TRUE`, factors are included in the output. For factors, only the columns for range (first and last factor levels), n, and missing contain information. |
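To make the arguments above more concrete, the following base-R sketch computes the kinds of indices they refer to. It is an illustration only, not the package's implementation: the variable names are hypothetical, the quartiles use the default `quantile()` type (an assumption), and the percentile bootstrap shown here is a generic technique that may differ from how `describe_distribution()` derives its intervals.

```r
set.seed(123)
x <- rnorm(100)

# Centrality with its matching dispersion index
mean_sd    <- c(mean = mean(x), sd = sd(x))
median_mad <- c(median = median(x), mad = mad(x))

# Interquartile range using quantile type 6, as named in the `iqr` argument
iqr6 <- IQR(x, type = 6)

# stats::IQR() is the difference between the 75th and 25th percentiles
iqr6_check <- unname(diff(quantile(x, probs = c(0.25, 0.75), type = 6)))

# Range and quartiles (default quantile type; an assumption for illustration)
rng <- range(x)
q   <- quantile(x, probs = c(0.25, 0.75))

# Trimmed mean: drop a fraction of observations from each end (cf. `threshold`)
trimmed <- mean(x, trim = 0.1)

# Generic percentile bootstrap for the mean (cf. `ci` and `iterations`)
iterations <- 100
boot_means <- replicate(iterations, mean(sample(x, replace = TRUE)))
ci95 <- quantile(boot_means, probs = c(0.025, 0.975))
```

With only 100 replicates the interval bounds are fairly noisy; raising `iterations` stabilizes them at the cost of computation time.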

## Value

A data frame with columns that describe the properties of the variables.

## Note

A `plot()` method is also implemented in the see package.

## Examples

```r
describe_distribution(rnorm(100))

data(iris)
describe_distribution(iris)
describe_distribution(iris, include_factors = TRUE, quartiles = TRUE)
```
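As a further sketch, the calls below combine several of the documented arguments. They assume that datawizard is installed, along with any package it needs for bootstrapping. The object names `med` and `boot_res` are hypothetical, and no output is shown because the exact columns may vary between package versions.

```r
library(datawizard)

# Bootstrap replicates are random, so fix the seed for reproducible intervals
set.seed(42)

# Median-based description of a single numeric vector, with quartiles added
med <- describe_distribution(
  iris$Sepal.Length,
  centrality = "median",
  quartiles = TRUE
)
print(med)

# Data frame method with bootstrapped 95% confidence intervals;
# factors are left out because include_factors defaults to FALSE
boot_res <- describe_distribution(iris, ci = 0.95, iterations = 200)
print(boot_res)
```

Both calls return a data frame, so the results can be filtered or exported like any other data frame.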

datawizard documentation built on Oct. 4, 2021, 9:07 a.m.