Description Usage Arguments Details Value Author(s) References Examples

This function fits 32 different continuous distributions by (weighted) NLS to the histogram of Monte Carlo simulation results as obtained by `propagate`

or any other vector containing large-scale observations. Finally, the fits are sorted by ascending `BIC`

.

1 2 |

`object` |
an object of class 'propagate' or a vector containing observations. |

`nbin` |
the number of bins in the histogram. |

`weights` |
numeric or logical. Either a numeric vector of weights, or if |

`verbose` |
logical. If |

`brute` |
complexity of the brute force approach. See 'Details'. |

`plot` |
if |

`distsel` |
a vector of distribution numbers to select from the complete cohort as listed below, e.g. |

`...` |
other parameters to be passed to the plots. |

Fits the following 32 distributions using (weighted) residual sum-of-squares as the minimization criterion for `nls.lm`

:

1) Normal distribution (`dnorm`

) => https://en.wikipedia.org/wiki/Normal_distribution

2) Skewed-normal distribution (`propagate:::dsn`

) => https://en.wikipedia.org/wiki/Skew_normal_distribution

3) Generalized normal distribution (`propagate:::dgnorm`

) => https://en.wikipedia.org/wiki/Generalized_normal_distribution

4) Log-normal distribution (`dlnorm`

) => https://en.wikipedia.org/wiki/Log-normal_distribution

5) Scaled and shifted t-distribution (`propagate:::dst`

) => GUM 2008, Chapter 6.4.9.2.

6) Logistic distribution (`dlogis`

) => https://en.wikipedia.org/wiki/Logistic_distribution

7) Uniform distribution (`dunif`

) => https://en.wikipedia.org/wiki/Uniform_distribution_(continuous)

8) Triangular distribution (`propagate:::dtriang`

) => https://en.wikipedia.org/wiki/Triangular_distribution

9) Trapezoidal distribution (`propagate:::dtrap`

) => https://en.wikipedia.org/wiki/Trapezoidal_distribution

10) Curvilinear Trapezoidal distribution (`propagate:::dctrap`

) => GUM 2008, Chapter 6.4.3.1

11) Gamma distribution (`dgamma`

) => https://en.wikipedia.org/wiki/Gamma_distribution

12) Inverse Gamma distribution (`propagate:::dinvgamma`

) => https://en.wikipedia.org/wiki/Inverse-gamma_distribution

13) Cauchy distribution (`dcauchy`

) => https://en.wikipedia.org/wiki/Cauchy_distribution

14) Laplace distribution (`propagate:::dlaplace`

) => https://en.wikipedia.org/wiki/Laplace_distribution

15) Gumbel distribution (`propagate:::dgumbel`

) => https://en.wikipedia.org/wiki/Gumbel_distribution

16) Johnson SU distribution (`propagate:::dJSU`

) => https://en.wikipedia.org/wiki/Johnson_SU_distribution

17) Johnson SB distribution (`propagate:::dJSB`

) => https://www.mathwave.com/articles/johnson_sb_distribution.html

18) Three-parameter Weibull distribution (`propagate:::dweibull2`

) => https://en.wikipedia.org/wiki/Weibull_distribution

19) Two-parameter beta distribution (`dbeta2`

) => https://en.wikipedia.org/wiki/Beta_distribution#Two_parameters_2

20) Four-parameter beta distribution (`propagate:::dbeta2`

) => https://en.wikipedia.org/wiki/Beta_distribution#Four_parameters_2

21) Arcsine distribution (`propagate:::darcsin`

) => https://en.wikipedia.org/wiki/Arcsine_distribution

22) Von Mises distribution (`propagate:::dmises`

) => https://en.wikipedia.org/wiki/Von_Mises_distribution

23) Inverse Gaussian distribution (`propagate:::dinvgauss`

) => https://en.wikipedia.org/wiki/Inverse_Gaussian_distribution

24) Generalized Extreme Value distribution (`propagate:::dgevd`

) => https://en.wikipedia.org/wiki/Generalized_extreme_value_distribution

25) Rayleigh distribution (`propagate:::drayleigh`

) => https://en.wikipedia.org/wiki/Rayleigh_distribution

26) Chi-square distribution (`dchisq`

) => https://en.wikipedia.org/wiki/Chi-squared_distribution

27) Exponential distribution (`dexp`

) => https://en.wikipedia.org/wiki/Exponential_distribution

28) F-distribution (`df`

) => https://en.wikipedia.org/wiki/F-distribution

29) Burr distribution (`dburr`

) => https://en.wikipedia.org/wiki/Burr_distribution

30) Chi distribution (`dchi`

) => https://en.wikipedia.org/wiki/Chi_distribution

31) Inverse Chi-square distribution (`dinvchisq`

) => https://en.wikipedia.org/wiki/Inverse-chi-squared_distribution

32) Cosine distribution (`dcosine`

) => https://en.wikipedia.org/wiki/Raised_cosine_distribution

All distributions are fitted with a brute force approach, in which the parameter space is extended over three orders of magnitude *(0.1, 1, 10)\times β_i* when `brute = "fast"`

, or five orders *(0.01, 0.1, 1, 10, 100)\times β_i* when `brute = "slow"`

. Approx. 20-90s are needed to fit for the fast version, depending mainly on the number of bins.

The goodness-of-fit (GOF) is calculated with `BIC`

from the (weighted) log-likelihood of the fit:

*\rm{ln}(L) = 0.5 \cdot ≤ft(-N \cdot ≤ft(\rm{ln}(2π) + 1 + \rm{ln}(N) - ∑_{i=1}^n log(w_i) + \rm{ln}≤ft(∑_{i=1}^n w_i \cdot x_i^2\right) \right) \right)*

*\rm{BIC} = - 2\rm{ln}(L) + (N - k)ln(N)*

with *x_i* = the residuals from the NLS fit, *N* = the length of the residual vector, *k* = the number of parameters of the fitted model and *w_i* = the weights.

In contrast to some other distribution fitting softwares (i.e. Easyfit, Mathwave) that use residual sum-of-squares/Anderson-Darling/Kolmogorov-Smirnov statistics as GOF measures, the application of BIC accounts for increasing number of parameters in the distribution fit and therefore compensates for overfitting. Hence, this approach is more similar to ModelRisk (Vose Software) and as employed in `fitdistr`

of the 'MASS' package.
Another application is to identify a possible distribution for the raw data prior to using Monte Carlo simulations from this distribution. However, a decent number of observations should be at hand in order to obtain a realistic estimate of the proper distribution. See 'Examples'.

The code for the density functions can be found in file "distr-densities.R".

IMPORTANT: It can be feasible to set `weights = TRUE`

in order to give more weight to bins with low counts. See 'Examples'.
ALSO: Distribution fitting is highly sensitive to the number of defined histogram bins, so it is advisable to change this parameter and inspect if the order of fitted distributions remains stable.

A list with the following items:

`stat`

: the by BIC value ascendingly sorted distribution names, including RSS and MSE.

`fit`

: a list of the results from `nls.lm`

for each distribution model, also sorted ascendingly by BIC values.

`par`

: a list of the estimated parameters of the models in `fit`

.

`se`

: a list of the parameters' standard errors, calculated from the square root of the covariance matrices diagonals.

`dens`

: a list with all density function used for fitting, sorted as in `fit`

.

`bestfit`

: the best model in terms of lowest BIC.

`bestpar`

: the parameters of `bestfit`

.

`bestse`

: the parameters' standard errors of `bestfit`

.

`fitted`

: the fitted values of `bestfit`

.

`residuals`

: the residuals of `bestfit`

.

Andrej-Nikolai Spiess

Continuous univariate distributions, Volume 1.

Johnson NL, Kotz S and Balakrishnan N.

*Wiley Series in Probability and Statistics, 2.ed* (2004).

A guide on probability distributions.

R-forge distributions core team.

http://dutangc.free.fr/pub/prob/probdistr-main.pdf.

Univariate distribution relationships.

Leemis LM and McQueston JT.

*The American Statistician* (2008), **62**: 45-53.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 | ```
## Not run:
## Linear example, small error
## => Normal distribution.
EXPR1 <- expression(x + 2 * y)
x <- c(5, 0.01)
y <- c(1, 0.01)
DF1 <- cbind(x, y)
RES1 <- propagate(expr = EXPR1, data = DF1, type = "stat",
do.sim = TRUE, verbose = TRUE)
fitDistr(RES1)
## Ratio example, larger error
## => Gamma distribution.
EXPR2 <- expression(x/2 * y)
x <- c(5, 0.1)
y <- c(1, 0.02)
DF2 <- cbind(x, y)
RES2 <- propagate(expr = EXPR2, data = DF2, type = "stat",
do.sim = TRUE, verbose = TRUE)
fitDistr(RES2)
## Exponential example, large error
## => Log-Normal distribution.
EXPR3 <- expression(x^(2 * y))
x <- c(5, 0.1)
y <- c(1, 0.1)
DF3 <- cbind(x, y)
RES3 <- propagate(expr = EXPR3, data = DF3, type = "stat",
do.sim = TRUE, verbose = TRUE)
fitDistr(RES3)
## Rectangular input distributions result
## in Curvilinear Trapezoidal output distribution.
A <- runif(100000, 20, 25)
B <- runif(100000, 3, 3.5)
DF4 <- cbind(A, B)
EXPR4 <- expression(A + B)
RES4 <- propagate(EXPR4, data = DF4, type = "sim",
use.cov = FALSE, do.sim = TRUE)
fitDistr(RES4)
## Fitting with 1/counts as weights.
EXPR5 <- expression(x + 2 * y)
x <- c(5, 0.05)
y <- c(1, 0.05)
DF5 <- cbind(x, y)
RES5 <- propagate(expr = EXPR5, data = DF5, type = "stat",
do.sim = TRUE, verbose = TRUE, weights = TRUE)
fitDistr(RES5)
## Using only selected distributions.
EXPR6 <- expression(x + sin(y))
x <- c(5, 0.1)
y <- c(1, 0.2)
DF6 <- cbind(x, y)
RES6 <- propagate(expr = EXPR6, data = DF6, type = "stat",
do.sim = TRUE)
fitDistr(RES6, distsel = c(1:10, 15, 28))
## End(Not run)
``` |

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.