mledist  R Documentation 
Fit of univariate distributions using maximum likelihood for censored or non censored data.
mledist(data, distr, start = NULL, fix.arg = NULL, optim.method = "default",
lower = -Inf, upper = Inf, custom.optim = NULL, weights = NULL, silent = TRUE,
gradient = NULL, checkstartfix=FALSE, ...)
data 
A numeric vector for non censored data or
a dataframe of two columns respectively named left and right, describing each observed value as an interval, for censored data. 
distr 
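A character string "name" naming a distribution for which the corresponding density function dname and the corresponding distribution function pname must be defined. 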
start 
A named list giving the initial values of parameters of the named distribution or a function of data computing initial values and returning a named list. This argument may be omitted (default) for some distributions for which reasonable starting values are computed (see details). 
fix.arg 
An optional named list giving the values of fixed parameters of the named distribution or a function of data computing (fixed) parameter values and returning a named list. Parameters with fixed value are thus NOT estimated by this maximum likelihood procedure. 
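optim.method 
"default" (see details) or an optimization method passed to optim. 
lower 
Left bounds on the parameters for the "L-BFGS-B" method (see optim). 
upper 
Right bounds on the parameters for the "L-BFGS-B" method (see optim). 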
custom.optim 
a function carrying the MLE optimisation (see details). 
weights 
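An optional vector of weights to be used in the fitting process.
Should be NULL or a numeric vector with strictly positive integers (typically the number of occurrences of each observation). If non-NULL, weighted MLE is used, otherwise ordinary MLE. 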
silent 
A logical to remove or show warnings when bootstrapping. 
gradient 
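A function to return the gradient of the log-likelihood for the "BFGS", "CG" and "L-BFGS-B" methods. If it is NULL, a finite-difference approximation will be used (see details). 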
checkstartfix 
A logical to test starting and fixed values. Do not change it. 
... 
further arguments passed to the optim, constrOptim or custom.optim function. 
This function is not intended to be called directly but is internally called in fitdist and bootdist when used with the maximum likelihood method and fitdistcens and bootdistcens.
It is assumed that the distr argument specifies the distribution by the probability density function and the cumulative distribution function (d, p). The quantile function and the random generator function (q, r) may be needed by other functions such as mmedist, qmedist, mgedist, fitdist, fitdistcens, bootdistcens and bootdist.
For the following named distributions, reasonable starting values will be computed if start is omitted (i.e. NULL): "norm", "lnorm", "exp", "pois", "cauchy", "gamma", "logis", "nbinom" (parametrized by mu and size), "geom", "beta" and "weibull" from the stats package; "invgamma", "llogis", "invweibull", "pareto1", "pareto", "lgamma", "trgamma" and "invtrgamma" from the actuar package.
Note that these starting values may not be good enough if the fit is poor.
The function uses a closed-form formula to fit the uniform distribution.
If start is a list, then it should be a named list with the same names as in the d,p,q,r functions of the chosen distribution.
If start is a function of data, then the function should return a named list with the same names as in the d,p,q,r functions of the chosen distribution.
The mledist function allows the user to set fixed values for some parameters.
As for start, if fix.arg is a list, then it should be a named list with the same names as in the d,p,q,r functions of the chosen distribution.
If fix.arg is a function of data, then the function should return a named list with the same names as in the d,p,q,r functions of the chosen distribution.
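As an illustration of the naming rule above, the following base R sketch builds starting values as a function of data and checks that the names of the returned list match the arguments of the density function. The helper name start_gamma and the moment-based formulas are assumptions made for this example; they are not the starting values computed internally by mledist.

```r
# Hypothetical helper: moment-based starting values for a gamma distribution,
# returned as a named list whose names match the arguments of dgamma.
start_gamma <- function(x) {
  m <- mean(x)
  v <- var(x)
  list(shape = m^2 / v, rate = m / v)
}

set.seed(1234)
x <- rgamma(50, shape = 2, rate = 3)
st <- start_gamma(x)

# every name in the list should be an argument of the density function
ok <- all(names(st) %in% names(formals(dgamma)))
```

A function of this shape could be supplied as the start argument; the same naming check applies to a list or function given as fix.arg.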
When custom.optim=NULL (the default), maximum likelihood estimations of the distribution parameters are computed with the R base optim or constrOptim.
If no finite bounds (lower=-Inf and upper=Inf) are supplied, optim is used with the method specified by optim.method.
Note that optim.method="default" means optim.method="Nelder-Mead" for distributions with at least two parameters and optim.method="BFGS" for distributions with only one parameter.
If finite bounds are supplied (among lower and upper) and gradient != NULL, constrOptim is used.
If finite bounds are supplied (among lower and upper) and gradient == NULL, constrOptim is used when optim.method="Nelder-Mead"; optim is used when optim.method="L-BFGS-B" or "Brent"; in any other case, an error is raised (same behavior as constrOptim).
When errors are raised by optim, it is a good idea to start by tracing the optimization process by adding control=list(trace=1, REPORT=1).
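The two optim cases described above can be imitated with base R alone. The sketch below is not the source code of mledist: the objective nll_gamma is a hand-written negative log-likelihood (a hypothetical name), minimized once without bounds by "Nelder-Mead" and once with finite lower bounds by "L-BFGS-B".

```r
# Sketch only: a hand-written negative log-likelihood minimized with optim.
set.seed(1234)
x <- rgamma(200, shape = 2, rate = 3)

# negative log-likelihood of the gamma distribution (hypothetical helper)
nll_gamma <- function(par, obs) {
  -sum(dgamma(obs, shape = par[1], rate = par[2], log = TRUE))
}

# no finite bounds: Nelder-Mead, as for distributions with two parameters;
# warnings are suppressed because the simplex may visit invalid parameters
fit_nm <- suppressWarnings(
  optim(par = c(1, 1), fn = nll_gamma, obs = x, method = "Nelder-Mead")
)

# finite lower bounds and no gradient: L-BFGS-B
fit_lb <- optim(par = c(1, 1), fn = nll_gamma, obs = x,
                method = "L-BFGS-B", lower = c(1e-6, 1e-6))
```

Both fits should reach essentially the same minimum; comparing fit_nm$value and fit_lb$value is a quick way to detect an optimizer that stopped early.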
If custom.optim is not NULL, then the user-supplied function is used instead of the R base optim. The custom.optim function must have (at least) the following arguments: fn for the function to be optimized and par for the initialized parameters.
Internally the function to be optimized will also have other arguments, such as obs with observations and ddistname with distribution name for non censored data (beware of potential conflicts with optional arguments of custom.optim). It is assumed that custom.optim carries out a MINIMIZATION.
Finally, it should return at least the following components: par for the estimate, convergence for the convergence code, value for fn(par), hessian, counts for the number of calls (function and gradient) and message (default to NULL) for the error message when custom.optim raises an error; see the returned value of optim.
See examples in fitdist and fitdistcens.
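A minimal sketch of a function with the interface described above, written as a thin wrapper around optim. The name my_optim and the choice of the "BFGS" method are assumptions made for this example; the wrapper is checked here on a hand-written negative log-likelihood rather than through mledist.

```r
# Hypothetical wrapper: accepts fn and par, forwards any other argument to
# the objective through optim, minimizes, and returns the required components.
my_optim <- function(fn, par, ...) {
  res <- optim(par = par, fn = fn, ..., method = "BFGS", hessian = TRUE)
  list(par = res$par, convergence = res$convergence, value = res$value,
       hessian = res$hessian, counts = res$counts, message = res$message)
}

# stand-alone check on an exponential negative log-likelihood
set.seed(1234)
x <- rexp(100, rate = 2)
nll_exp <- function(par, obs) -sum(dexp(obs, rate = par, log = TRUE))

# warnings are suppressed because the line search may try negative rates
res <- suppressWarnings(my_optim(fn = nll_exp, par = 1, obs = x))
```

For the exponential distribution the maximum likelihood estimate of the rate is 1/mean(x), which gives a simple check of res$par.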
Optionally, a vector of weights can be used in the fitting process.
By default (when weights=NULL), ordinary MLE is carried out, otherwise the specified weights are used to balance the log-likelihood contributions.
It is not yet possible to take into account weights in functions plotdist, plotdistcens, plot.fitdist, plot.fitdistcens, cdfcomp, cdfcompcens, denscomp, ppcomp, qqcomp, gofstat, descdist, bootdist, bootdistcens and mgedist (developments planned in the future).
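The effect of weights on the log-likelihood can be sketched in base R. This is an illustration of the principle, not the internal code of mledist: each observation contributes its weight times its log-density, so integer weights behave like numbers of occurrences. The data and the helper name wnll_pois are made up for this example.

```r
# Sketch of a weighted negative log-likelihood for the Poisson distribution.
vals <- c(0, 1, 2, 3, 4)
cnt  <- c(5, 9, 7, 3, 1)   # hypothetical number of occurrences of each value

wnll_pois <- function(lambda, obs, weights) {
  -sum(weights * dpois(obs, lambda = lambda, log = TRUE))
}

# weighted fit on the distinct values
fit_w <- optimize(wnll_pois, interval = c(1e-6, 10), obs = vals, weights = cnt)

# ordinary Poisson MLE (the sample mean) on the expanded data
mle_expanded <- mean(rep(vals, times = cnt))
```

The weighted estimate fit_w$minimum should agree with mle_expanded up to the tolerance of optimize.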
NB: if your data values are particularly small or large, a scaling may be needed before the optimization process. See Example (7).
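As a complement to the note on scaling, the following base R sketch shows how estimates obtained on rescaled data can be mapped back to the original scale. It assumes a location-scale family (here the normal distribution), for which both parameters scale linearly with the data; the scaling factor k and the helper nll_norm are hypothetical, and the fit uses optim directly rather than mledist.

```r
set.seed(1234)
x <- rnorm(100, mean = 1e-4, sd = 2e-4)   # particularly small values
k <- 1e4                                   # hypothetical scaling factor
y <- x * k                                 # rescaled data

# negative log-likelihood of the normal distribution (hypothetical helper)
nll_norm <- function(par, obs) {
  -sum(dnorm(obs, mean = par[1], sd = par[2], log = TRUE))
}

# fit on the rescaled data; warnings are suppressed because the simplex
# may visit a negative standard deviation
fit <- suppressWarnings(optim(c(mean(y), sd(y)), nll_norm, obs = y))

# back-transform: mean and sd both scale linearly with the data
est <- fit$par / k
```

For other families the back-transformation depends on how each parameter reacts to a change of scale, so it has to be worked out case by case.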
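mledist returns a list with the following components:
estimate 
the parameter estimates. 
convergence 
an integer code for the convergence of optim/constrOptim, as defined below, or defined by the user in the user-supplied optimization function. 0 indicates successful convergence. 1 indicates that the iteration limit of optim has been reached. 10 indicates degeneracy of the Nelder-Mead simplex. 100 indicates that optim encountered an internal error. 
value 
the minimal value reached for the criterion to minimize. 
hessian 
a symmetric matrix computed by optim as an estimate of the Hessian at the solution found, or computed in the user-supplied optimization function. It is used in fitdist to estimate standard errors. 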
optim.function 
the name of the optimization function used for maximum likelihood. 
optim.method 
when optim is used, the name of the algorithm used, the field method of the custom.optim function otherwise. 
fix.arg 
the named list giving the values of parameters of the named distribution
that must be kept fixed rather than estimated by maximum likelihood, or NULL. 
fix.arg.fun 
the function used to set the value of fix.arg, or NULL. 
weights 
the vector of weights used in the estimation process, or NULL. 
counts 
A two-element integer vector giving the number of calls
to the log-likelihood function and its gradient respectively.
This excludes those calls needed to compute the Hessian, if requested,
and any calls to the log-likelihood function to compute a finite-difference
approximation to the gradient. 
optim.message 
A character string giving any additional information
returned by the optimizer, or NULL. 
loglik 
the log-likelihood value. 
method 
"closed formula" if appropriate, otherwise NULL. 
Marie-Laure Delignette-Muller and Christophe Dutang.
Venables WN and Ripley BD (2002), Modern applied statistics with S. Springer, New York, pp. 435-446.
Delignette-Muller ML and Dutang C (2015), fitdistrplus: An R Package for Fitting Distributions. Journal of Statistical Software, 64(4), 1-34.
mmedist, qmedist, mgedist, fitdist, fitdistcens for other estimation methods, optim, constrOptim for optimization routines, bootdistcens and bootdist for bootstrap, and llplot for plotting the (log)likelihood.
# (1) basic fit of a normal distribution with maximum likelihood estimation
#
set.seed(1234)
x1 <- rnorm(n=100)
mledist(x1,"norm")
# (2) defining your own distribution functions, here for the Gumbel distribution
# for other distributions, see the CRAN task view dedicated to probability distributions
dgumbel <- function(x,a,b) 1/b*exp((a-x)/b)*exp(-exp((a-x)/b))
mledist(x1,"gumbel",start=list(a=10,b=5))
# (3) fit of a discrete distribution (Poisson)
#
set.seed(1234)
x2 <- rpois(n=30,lambda = 2)
mledist(x2,"pois")
# (4) fit a finite-support distribution (beta)
#
set.seed(1234)
x3 <- rbeta(n=100,shape1=5, shape2=10)
mledist(x3,"beta")
# (5) fit frequency distributions on USArrests dataset.
#
x4 <- USArrests$Assault
mledist(x4, "pois")
mledist(x4, "nbinom")
# (6) fit a continuous distribution (Gumbel) to censored data.
#
data(fluazinam)
log10EC50 <- log10(fluazinam)
# definition of the Gumbel distribution
dgumbel <- function(x,a,b) 1/b*exp((a-x)/b)*exp(-exp((a-x)/b))
pgumbel <- function(q,a,b) exp(-exp((a-q)/b))
qgumbel <- function(p,a,b) a-b*log(-log(p))
mledist(log10EC50,"gumbel",start=list(a=0,b=2),optim.method="Nelder-Mead")
# (7) scaling problem
# the simulated dataset (below) has particularly small values,
# hence without scaling (10^0),
# the optimization raises an error. The for loop shows how scaling by 10^i
# for i=1,...,6 makes the fitting procedure work correctly.
set.seed(1234)
x2 <- rnorm(100, 1e-4, 2e-4)
for(i in 6:0)
    cat(i, try(mledist(x2*10^i, "cauchy")$estimate, silent=TRUE), "\n")
# (17) small example for the zero-modified geometric distribution
#
dzmgeom <- function(x, p1, p2) p1 * (x == 0) + (1-p1)*dgeom(x-1, p2) #pdf
x2 <- c(2, 4, 0, 40, 4, 21, 0, 0, 0, 2, 5, 0, 0, 13, 2) #simulated dataset
initp1 <- function(x) list(p1=mean(x == 0)) #init as MLE
mledist(x2, "zmgeom", fix.arg=initp1, start=list(p2=1/2))