# fitdistr: Maximum-likelihood Fitting of Univariate Distributions (in MASS: Support Functions and Datasets for Venables and Ripley's MASS)

 fitdistr R Documentation

## Maximum-likelihood Fitting of Univariate Distributions

### Description

Maximum-likelihood fitting of univariate distributions, allowing parameters to be held fixed if desired.

### Usage

``````r
fitdistr(x, densfun, start, ...)
``````

### Arguments

- `x`: A numeric vector of length at least one containing only finite values.
- `densfun`: Either a character string or a function returning a density evaluated at its first argument. Distributions `"beta"`, `"cauchy"`, `"chi-squared"`, `"exponential"`, `"gamma"`, `"geometric"`, `"log-normal"`, `"lognormal"`, `"logistic"`, `"negative binomial"`, `"normal"`, `"Poisson"`, `"t"` and `"weibull"` are recognised, case being ignored.
- `start`: A named list giving the parameters to be optimized with initial values. It can be omitted for some of the named distributions and must be omitted for others (see Details).
- `...`: Additional parameters, either for `densfun` or for `optim`. In particular, it can be used to specify bounds via `lower` or `upper` or both. If arguments of `densfun` (or the density function corresponding to a character-string specification) are included they will be held fixed.

### Details

For the Normal, log-Normal, geometric, exponential and Poisson distributions the closed-form MLEs (and exact standard errors) are used, and `start` should not be supplied.
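As an illustration of the closed-form case, the sketch below compares the Normal fit with estimates computed by hand. The simulated data and the names `sd_mle`, `est` and `se` are assumptions made for this example, and the standard-error expressions are the usual Fisher-information formulas, which are assumed here to be what `fitdistr` reports.

```r
library(MASS)

set.seed(1)
x <- rnorm(200, mean = 10, sd = 2)   # simulated data, for illustration only
fit <- fitdistr(x, "normal")         # note: no `start` for a closed-form fit

n <- length(x)
# The MLE of sd divides by n rather than n - 1, so it differs slightly from sd(x).
sd_mle <- sqrt(mean((x - mean(x))^2))

# Hand-computed estimates and standard errors for comparison.
est <- c(mean = mean(x), sd = sd_mle)
se  <- c(mean = sd_mle / sqrt(n), sd = sd_mle / sqrt(2 * n))

all.equal(fit$estimate, est)
all.equal(fit$sd, se)
```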

For all other distributions, direct optimization of the log-likelihood is performed using `optim`. The estimated standard errors are taken from the observed information matrix, calculated by a numerical approximation. For one-dimensional problems the Nelder-Mead method is used and for multi-dimensional problems the BFGS method, unless arguments named `lower` or `upper` are supplied (when `L-BFGS-B` is used) or `method` is supplied explicitly.
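The numerical route can be imitated by hand, which may help in seeing where the standard errors come from. The following is a minimal sketch, not the code `fitdistr` runs internally: the function `negloglik`, the moment-based starting values and the choice of `parscale` are all assumptions of this example.

```r
library(MASS)

set.seed(123)
x <- rgamma(100, shape = 5, rate = 0.1)

# Negative log-likelihood of the gamma model; par = c(shape, rate).
negloglik <- function(par, x) {
  -sum(dgamma(x, shape = par[1], rate = par[2], log = TRUE))
}

# Moment-based starting values (an assumption of this sketch).
start <- c(shape = mean(x)^2 / var(x), rate = mean(x) / var(x))

# Bounds are supplied, so L-BFGS-B is the appropriate optim method.
res <- optim(start, negloglik, x = x, method = "L-BFGS-B",
             lower = c(1e-3, 1e-3), hessian = TRUE,
             control = list(parscale = c(1, start[["rate"]])))

# optim minimises the negative log-likelihood, so its Hessian approximates
# the observed information; inverting it gives the variance-covariance matrix.
vc <- solve(res$hessian)
se <- sqrt(diag(vc))

fit <- fitdistr(x, "gamma")
rbind(by_hand = res$par, fitdistr = coef(fit))
rbind(by_hand = se, fitdistr = fit$sd)
```

The two rows of each comparison should agree to within optimizer tolerance, since both approaches maximise the same likelihood.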

For the `"t"` named distribution the density is taken to be the location-scale family with location `m` and scale `s`.
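To make the location-scale parametrization concrete, the sketch below refits the `"t"` model with `df` held fixed and recomputes the log-likelihood from the density `dt((x - m)/s, df)/s` at the fitted values. The variable names `fit_t`, `m_hat`, `s_hat` and `ll` are introduced only for this example.

```r
library(MASS)

set.seed(123)
x2 <- rt(250, df = 9)

# df is an argument of the density, so passing it via ... holds it fixed.
fit_t <- fitdistr(x2, "t", df = 9)
m_hat <- coef(fit_t)[["m"]]
s_hat <- coef(fit_t)[["s"]]

# Log-likelihood of the location-scale t density at the fitted parameters.
ll <- sum(log(dt((x2 - m_hat) / s_hat, df = 9) / s_hat))

all.equal(ll, fit_t$loglik)
```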

For the following named distributions, reasonable starting values will be computed if `start` is omitted or only partially specified: `"cauchy"`, `"gamma"`, `"logistic"`, `"negative binomial"` (parametrized by `mu` and `size`), `"t"` and `"weibull"`. Note that these starting values may not be good enough if the fit is poor: in particular they are not resistant to outliers unless the fitted distribution is long-tailed.

There are `print`, `coef`, `vcov` and `logLik` methods for class `"fitdistr"`.
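A short sketch of these methods in use follows. The Weibull data mirror the Examples section; the Wald-type interval at the end is an illustrative addition that relies on a normal approximation, not something `fitdistr` computes itself.

```r
library(MASS)

set.seed(123)
x3 <- rweibull(100, shape = 4, scale = 100)
fit <- fitdistr(x3, "weibull")

coef(fit)      # parameter estimates (same as fit$estimate)
vcov(fit)      # estimated variance-covariance matrix
logLik(fit)    # log-likelihood, with the number of parameters attached
AIC(fit)       # works because a logLik method is available

# Standard errors recovered from the variance-covariance matrix.
se <- sqrt(diag(vcov(fit)))

# Approximate 95% Wald intervals (normal approximation; illustration only).
cbind(lower = coef(fit) - 1.96 * se, upper = coef(fit) + 1.96 * se)
```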

### Value

An object of class `"fitdistr"`, a list with four components:

- `estimate`: the parameter estimates,
- `sd`: the estimated standard errors,
- `vcov`: the estimated variance-covariance matrix, and
- `loglik`: the log-likelihood.

### Note

Numerical optimization cannot work miracles: please note the comments in `optim` on scaling data. If the fitted parameters are far away from one, consider re-fitting specifying the control parameter `parscale`.
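One way to follow this advice is sketched below: fit once, then re-fit starting from the first estimates and passing them as `parscale` through `control`, which `fitdistr` forwards to `optim`. Using the first-fit estimates as the scale is a choice made for this example, not a rule; the names `fit1` and `fit2` are likewise hypothetical.

```r
library(MASS)

set.seed(123)
x <- rgamma(100, shape = 5, rate = 0.1)

# First fit: the rate estimate is well below one, so the problem is poorly scaled.
fit1 <- fitdistr(x, dgamma, list(shape = 1, rate = 0.1), lower = 0.001)

# Re-fit from the first estimates, telling optim the typical size of each parameter.
fit2 <- fitdistr(x, dgamma, start = as.list(coef(fit1)), lower = 0.001,
                 control = list(parscale = unname(coef(fit1))))

rbind(first = coef(fit1), rescaled = coef(fit2))
```

If the two rows differ noticeably, the first fit had probably not converged well and the rescaled one is the safer result to report.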

### References

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.

### Examples

``````r
library(MASS)  # provides fitdistr() and rnegbin()
## avoid spurious accuracy
op <- options(digits = 3)
set.seed(123)
x <- rgamma(100, shape = 5, rate = 0.1)
fitdistr(x, "gamma")
## now do this directly with more control.
fitdistr(x, dgamma, list(shape = 1, rate = 0.1), lower = 0.001)

set.seed(123)
x2 <- rt(250, df = 9)
fitdistr(x2, "t", df = 9)
## allow df to vary: not a very good idea!
fitdistr(x2, "t")
## now do fixed-df fit directly with more control.
mydt <- function(x, m, s, df) dt((x-m)/s, df)/s
fitdistr(x2, mydt, list(m = 0, s = 1), df = 9, lower = c(-Inf, 0))

set.seed(123)
x3 <- rweibull(100, shape = 4, scale = 100)
fitdistr(x3, "weibull")

set.seed(123)
x4 <- rnegbin(500, mu = 5, theta = 4)
fitdistr(x4, "Negative Binomial")
options(op)
``````

MASS documentation built on May 4, 2023, 9:07 a.m.