fitdistcens  R Documentation 
Fits a univariate distribution to censored data by maximum likelihood.
fitdistcens(censdata, distr, start=NULL, fix.arg=NULL,
  keepdata = TRUE, keepdata.nb=100, ...)

## S3 method for class 'fitdistcens'
print(x, ...)

## S3 method for class 'fitdistcens'
plot(x, ...)

## S3 method for class 'fitdistcens'
summary(object, ...)

## S3 method for class 'fitdistcens'
logLik(object, ...)

## S3 method for class 'fitdistcens'
vcov(object, ...)

## S3 method for class 'fitdistcens'
coef(object, ...)
censdata 
A dataframe of two columns respectively named 
distr 
A character string 
start 
A named list giving the initial values of parameters of the named distribution.
This argument may be omitted for some distributions for which reasonable
starting values are computed (see the 'details' section of 
fix.arg 
An optional named list giving the values of parameters of the named distribution that must be kept fixed rather than estimated by maximum likelihood. 
x 
an object of class 
object 
an object of class 
keepdata 
a logical. If 
keepdata.nb 
When 
... 
further arguments to be passed to generic functions,
to the function 
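As an illustration of the censdata argument, the sketch below builds a small data frame by hand and fits a normal distribution to it. It assumes the fitdistrplus convention of two columns named left and right, with NA in left for left-censored observations, NA in right for right-censored observations, and equal values in both columns for non-censored ones. The numbers and the names toy and fit_toy are made up for the example.

```r
library(fitdistrplus)

# Made-up observations (assumption: not taken from any real study).
# Row 1 is left-censored, rows 4 and 7 are right-censored,
# rows 3 and 8 are interval-censored, the others are exactly observed.
toy <- data.frame(
  left  = c(NA,  1.2, 2.0, 3.1, 2.5, 1.8, 4.0, 2.2, 2.9, 1.5),
  right = c(1.0, 1.2, 2.6, NA,  2.5, 1.8, NA,  3.0, 2.9, 1.5)
)

# Fit a normal distribution by maximum likelihood; start is omitted
# because default starting values are available for "norm".
fit_toy <- fitdistcens(toy, "norm")
print(fit_toy)
```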
Maximum likelihood estimates of the distribution parameters are computed using the function mledist.
By default, direct optimization of the loglikelihood is performed using optim, with the "Nelder-Mead" method for distributions characterized by more than one parameter and the "BFGS" method for distributions characterized by only one parameter.
The algorithm used in optim can be chosen, or another optimization function can be specified, using the ... argument (see mledist for details).
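To make the idea of direct optimization of a censored loglikelihood concrete, here is a minimal base R sketch for a normal distribution. It is not the code used by mledist: the data, the function name negloglik_norm and the log-sd parametrization are assumptions chosen for the illustration. Exact observations contribute a density, left- and right-censored ones a tail probability, and interval-censored ones a difference of distribution function values.

```r
# Made-up censored sample (left/right convention, NA marks the open side).
cens_data <- data.frame(
  left  = c(NA,  1.2, 2.0, 3.1, 2.5, 1.8, 4.0, 2.2, 2.9, 1.5),
  right = c(1.0, 1.2, 2.6, NA,  2.5, 1.8, NA,  3.0, 2.9, 1.5)
)

# Negative loglikelihood; par = c(mean, log(sd)) so that sd stays positive.
negloglik_norm <- function(par, cens_data) {
  mu <- par[1]
  sigma <- exp(par[2])
  l <- cens_data$left
  r <- cens_data$right
  exact <- !is.na(l) & !is.na(r) & l == r
  lcens <- is.na(l)
  rcens <- is.na(r)
  icens <- !is.na(l) & !is.na(r) & l < r
  ll <- sum(dnorm(l[exact], mu, sigma, log = TRUE)) +
    sum(pnorm(r[lcens], mu, sigma, log.p = TRUE)) +
    sum(pnorm(l[rcens], mu, sigma, lower.tail = FALSE, log.p = TRUE)) +
    sum(log(pnorm(r[icens], mu, sigma) - pnorm(l[icens], mu, sigma)))
  -ll
}

# Two parameters, so Nelder-Mead is used, as in the default described above.
res <- optim(c(2, 0), negloglik_norm, cens_data = cens_data,
             method = "Nelder-Mead")
c(mean = res$par[1], sd = exp(res$par[2]))
```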
start may be omitted (i.e. NULL) for some classic distributions (see the 'details' section of mledist).
Note that when errors are raised by optim, it is a good idea to start by tracing the optimization process, adding control=list(trace=1, REPORT=1) in the ... argument.
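A short sketch of the tracing advice, using the fluazinam data shipped with the package. The control list is passed through ... and is expected to reach optim; the object name fit_traced is an assumption made for the example.

```r
library(fitdistrplus)

data(fluazinam)
log10EC50 <- log10(fluazinam)

# control is forwarded to optim through the ... argument, so the
# optimizer reports its progress while the logistic fit is computed.
fit_traced <- fitdistcens(log10EC50, "logis",
                          control = list(trace = 1, REPORT = 1))
```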
The function is not able to fit a uniform distribution.
With the parameter estimates, the function returns the loglikelihood and the standard errors of the estimates, calculated from the Hessian at the solution found by optim or by the user-supplied function passed to mledist.
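The link between the Hessian and the standard errors can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's own code: the data and the function nll are made up, and the Hessian is the numerical one returned by optim for a negative loglikelihood, so its inverse approximates the variance-covariance matrix of the estimates.

```r
# Made-up censored sample (left/right convention, NA marks the open side).
cens_data <- data.frame(
  left  = c(NA,  1.2, 2.0, 3.1, 2.5, 1.8, 4.0, 2.2, 2.9, 1.5),
  right = c(1.0, 1.2, 2.6, NA,  2.5, 1.8, NA,  3.0, 2.9, 1.5)
)

# Negative loglikelihood of a normal distribution, par = c(mean, sd).
nll <- function(par, cens_data) {
  mu <- par[1]
  sigma <- par[2]
  if (sigma <= 0) return(1e10)  # crude guard against invalid sd values
  l <- cens_data$left
  r <- cens_data$right
  contrib <- ifelse(is.na(l), pnorm(r, mu, sigma),          # left-censored
             ifelse(is.na(r), 1 - pnorm(l, mu, sigma),      # right-censored
             ifelse(l == r, dnorm(l, mu, sigma),            # exact
                    pnorm(r, mu, sigma) - pnorm(l, mu, sigma))))  # interval
  -sum(log(contrib))
}

opt <- optim(c(2, 1), nll, cens_data = cens_data,
             method = "Nelder-Mead", hessian = TRUE)

vc <- solve(opt$hessian)   # approximate variance-covariance matrix
se <- sqrt(diag(vc))       # standard errors of mean and sd
cr <- cov2cor(vc)          # correlation matrix of the estimates
```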
By default (keepdata = TRUE), the object returned by fitdistcens contains the data set given in input.
When dealing with large datasets, we can remove the original dataset from the output by setting keepdata = FALSE. In such a case, only keepdata.nb points (at most) are kept, by randomly subsampling keepdata.nb - 4 points from the dataset and adding the component-wise minimum and maximum.
If combined with bootdistcens, be aware that the bootstrap is performed on the subset randomly selected in fitdistcens. Currently, the graphical comparison of multiple fits is not available in this framework.
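A brief sketch of the keepdata mechanism on the salinity data shipped with the package. The value keepdata.nb = 50 and the name fit_small are arbitrary choices for the illustration; the stored subset is drawn at random, so its rows differ from one call to the next.

```r
library(fitdistrplus)

data(salinity)
log10LC50 <- log10(salinity)

# Fit on the full data set, but keep only a subsample in the returned object.
fit_small <- fitdistcens(log10LC50, "norm",
                         keepdata = FALSE, keepdata.nb = 50)

nrow(log10LC50)            # number of observations used for the fit
nrow(fit_small$censdata)   # number of observations stored in the object
```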
A weighted version of the estimation process is available for method = "mle" by using weights=.... See the corresponding man page for details.
It is not yet possible to take weights into account in the functions plotdistcens, plot.fitdistcens and cdfcompcens (developments planned in the future).
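A minimal sketch of a weighted fit, assuming that weights is forwarded through ... to mledist and that one positive integer weight per row is expected. The weights below are made up and carry no meaning for the fluazinam data; the call may emit a warning about how starting values are computed.

```r
library(fitdistrplus)

data(fluazinam)
log10EC50 <- log10(fluazinam)

# Made-up integer weights, one per observation (assumption for illustration).
w <- rep(c(1, 2), length.out = nrow(log10EC50))

fit_w <- fitdistcens(log10EC50, "norm", weights = w)
summary(fit_w)
```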
fitdistcens returns an object of class "fitdistcens", a list with the following components:
estimate 
the parameter estimates. 
method 
the character string coding for the fitting method :
only 
sd 
the estimated standard errors. 
cor 
the estimated correlation matrix, 
vcov 
the estimated variance-covariance matrix, 
loglik 
the loglikelihood. 
aic 
the Akaike information criterion. 
bic 
the so-called BIC or SBC (Schwarz Bayesian criterion). 
censdata 
the censored data set. 
distname 
the name of the distribution. 
fix.arg 
the named list giving the values of parameters of the named distribution
that must be kept fixed rather than estimated by maximum likelihood or

fix.arg.fun 
the function used to set the value of 
dots 
the list of further arguments passed in ... to be used in 
convergence 
an integer code for the convergence of

discrete 
always 
weights 
the vector of weights used in the estimation process or 
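The aic and bic components can be related to loglik with the usual definitions. The sketch below recomputes them by hand for an unweighted fit, assuming that the number of observations used for BIC is the number of rows of the censored data set; the names k, n, aic_manual and bic_manual are introduced for the example.

```r
library(fitdistrplus)

data(fluazinam)
log10EC50 <- log10(fluazinam)
fln <- fitdistcens(log10EC50, "norm")

k <- length(fln$estimate)   # number of estimated parameters
n <- nrow(log10EC50)        # number of observations

aic_manual <- -2 * fln$loglik + 2 * k
bic_manual <- -2 * fln$loglik + log(n) * k

# Compare with the values stored in the fitted object.
all.equal(aic_manual, fln$aic)
all.equal(bic_manual, fln$bic)
```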
Generic functions:
print
The print of a "fitdistcens" object shows a few details about the fitting method and the fitted distribution.
summary
The summary provides the parameter estimates of the fitted distribution, the loglikelihood, AIC and BIC statistics, the standard errors of the parameter estimates and the correlation matrix between parameter estimates.
plot
The plot of an object of class "fitdistcens" returned by fitdistcens uses the function plotdistcens.
logLik
Extracts the estimated loglikelihood from the "fitdistcens" object.
vcov
Extracts the estimated variance-covariance matrix from the "fitdistcens" object (only available when method = "mle").
coef
Extracts the fitted coefficients from the "fitdistcens" object.
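A short usage sketch of the extractor functions on a fit of the fluazinam data. The comparison with the cor component assumes that it is derived from the variance-covariance matrix, which is the usual relationship between the two.

```r
library(fitdistrplus)

data(fluazinam)
log10EC50 <- log10(fluazinam)
fln <- fitdistcens(log10EC50, "norm")

coef(fln)     # fitted parameters (mean and sd)
logLik(fln)   # loglikelihood at the estimates
vcov(fln)     # variance-covariance matrix of the estimates

# Correlation matrix rebuilt from vcov, to compare with fln$cor.
cor_from_vcov <- cov2cor(vcov(fln))
cor_from_vcov
```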
Marie-Laure Delignette-Muller and Christophe Dutang.
Venables WN and Ripley BD (2002), Modern applied statistics with S. Springer, New York, pp. 435-446.
Delignette-Muller ML and Dutang C (2015), fitdistrplus: An R Package for Fitting Distributions. Journal of Statistical Software, 64(4), 1-34.
See fitdistrplus for an overview of the package.
See also plotdistcens, optim, mledist, fitdist, and quantile.fitdistcens, another generic function to calculate quantiles from the fitted distribution.
library(fitdistrplus)

# (1) Fit of a lognormal distribution to bacterial contamination data
#
data(smokedfish)
fitsf <- fitdistcens(smokedfish, "lnorm")
summary(fitsf)
# default plot using the Wang technique (see ?plotdistcens for details)
plot(fitsf)
# plot using the Turnbull algorithm (see ?plotdistcens for details)
# with confidence intervals for the empirical distribution
plot(fitsf, NPMLE = TRUE, NPMLE.method = "Turnbull", Turnbull.confint = TRUE)
# basic plot using intervals and points (see ?plotdistcens for details)
plot(fitsf, NPMLE = FALSE)
# plot of the same fit using the Turnbull algorithm in logscale
cdfcompcens(fitsf, main = "bacterial contamination fits",
    xlab = "bacterial concentration (CFU/g)", ylab = "F",
    addlegend = FALSE, lines01 = TRUE, xlogscale = TRUE, xlim = c(1e-2, 1e2))
# zoom on large values of F
cdfcompcens(fitsf, main = "bacterial contamination fits",
    xlab = "bacterial concentration (CFU/g)", ylab = "F",
    addlegend = FALSE, lines01 = TRUE, xlogscale = TRUE,
    xlim = c(1e-2, 1e2), ylim = c(0.4, 1))

# (2) Fit of a normal distribution on acute toxicity values
# of fluazinam (in decimal logarithm) for
# macroinvertebrates and zooplancton, using maximum likelihood estimation
# to estimate what is called a species sensitivity distribution
# (SSD) in ecotoxicology
#
data(fluazinam)
log10EC50 <- log10(fluazinam)
fln <- fitdistcens(log10EC50, "norm")
fln
summary(fln)
plot(fln)

# (3) defining your own distribution functions, here for the Gumbel distribution
# for other distributions, see the CRAN task view dedicated to
# probability distributions
#
dgumbel <- function(x, a, b) 1/b*exp((a-x)/b)*exp(-exp((a-x)/b))
pgumbel <- function(q, a, b) exp(-exp((a-q)/b))
qgumbel <- function(p, a, b) a-b*log(-log(p))
fg <- fitdistcens(log10EC50, "gumbel", start = list(a = 1, b = 1))
summary(fg)
plot(fg)

# (4) comparison of fits of various distributions
#
fll <- fitdistcens(log10EC50, "logis")
summary(fll)
cdfcompcens(list(fln, fll, fg), legendtext = c("normal", "logistic", "gumbel"),
    xlab = "log10(EC50)")

# (5) how to change the optimisation method?
#
fitdistcens(log10EC50, "logis", optim.method = "Nelder-Mead")
fitdistcens(log10EC50, "logis", optim.method = "BFGS")
fitdistcens(log10EC50, "logis", optim.method = "SANN")

# (6) custom optimisation function - example with the genetic algorithm
#
# wrap the genoud function of the rgenoud package
mygenoud <- function(fn, par, ...)
{
    require(rgenoud)
    res <- genoud(fn, starting.values = par, ...)
    standardres <- c(res, convergence = 0)
    return(standardres)
}
# call fitdistcens with a 'custom' optimization function
fit.with.genoud <- fitdistcens(log10EC50, "logis", custom.optim = mygenoud, nvars = 2,
    Domains = cbind(c(0, 0), c(5, 5)), boundary.enforcement = 1,
    print.level = 1, hessian = TRUE)
summary(fit.with.genoud)

# (7) estimation of the mean of a normal distribution
# by maximum likelihood with the standard deviation fixed at 1 using the argument fix.arg
#
flnb <- fitdistcens(log10EC50, "norm", start = list(mean = 1), fix.arg = list(sd = 1))

# (8) Fit of a lognormal distribution on acute toxicity values of fluazinam for
# macroinvertebrates and zooplancton, using maximum likelihood estimation
# to estimate what is called a species sensitivity distribution
# (SSD) in ecotoxicology, followed by estimation of the 5 percent quantile value of
# the fitted distribution (which is called the 5 percent hazardous concentration, HC5,
# in ecotoxicology) and estimation of other quantiles.
data(fluazinam)
log10EC50 <- log10(fluazinam)
fln <- fitdistcens(log10EC50, "norm")
quantile(fln, probs = 0.05)
quantile(fln, probs = c(0.05, 0.1, 0.2))

# (9) Fit of a lognormal distribution on 72-hour acute salinity tolerance (LC50 values)
# of riverine macroinvertebrates using maximum likelihood estimation
data(salinity)
log10LC50 <- log10(salinity)
fln <- fitdistcens(log10LC50, "norm")
plot(fln)