gofstat  R Documentation 
Computes goodness-of-fit statistics for parametric distributions fitted to the same non-censored data set.
gofstat(f, chisqbreaks, meancount, discrete, fitnames=NULL)

## S3 method for class 'gofstat.fitdist'
print(x, ...)
f 
An object of class "fitdist", output of the function fitdist, or a list of "fitdist" objects.
chisqbreaks 
A numeric vector defining the breaks of the cells used to compute the Chi-squared
statistic. If omitted, these breaks are automatically computed from the data
so that each cell contains roughly the same number of observations, roughly equal to the argument
meancount, or slightly more if there are ties.
meancount 
The mean number of observations expected per cell when defining the breaks
of the cells used to compute the Chi-squared statistic. This argument is ignored
if the breaks are given directly in the argument chisqbreaks.
discrete 
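If TRUE, the Chi-squared statistic and information criteria are computed. If missing, discrete is taken from the first object of class "fitdist" of the list f.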
fitnames 
A vector defining the names of the fits. 
x 
An object of class "gofstat.fitdist".
... 
Further arguments to be passed to generic functions. 
Goodness-of-fit statistics are computed. The Chi-squared statistic is computed using cells defined
by the argument chisqbreaks or cells automatically defined from the data, in order
to reach roughly the same number of observations per cell, roughly equal to the argument
meancount, or slightly more if there are some ties.
Cells are defined from the empirical distribution (the data), and not from the
theoretical distribution, so that Chi-squared values obtained
with different distributions fitted to the same data set can be compared.
If chisqbreaks and meancount are both omitted, meancount
is fixed in order to obtain roughly (4n)^{2/5} cells,
with n the length of the data set (Vose, 2000).
The Chi-squared statistic is not computed if the program fails
to define enough cells because the data set is too small. When the Chi-squared statistic is computed,
and if the degrees of freedom (number of cells - number of parameters - 1) of the corresponding distribution
are strictly positive, the p-value of the Chi-squared test is returned.
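The cell rule and the degrees of freedom described above can be illustrated with a short base R sketch. It is not the internal code of gofstat: the simulated Poisson data, the use of empirical quantiles as breaks, and all variable names are assumptions made for illustration only.

```r
# Illustrative sketch only; gofstat() may define its cells differently.
set.seed(1)
x <- rpois(200, lambda = 3)            # hypothetical count data
n <- length(x)
lambda_hat <- mean(x)                  # maximum likelihood estimate (1 parameter)

# Rule of thumb quoted from Vose (2000): roughly (4n)^(2/5) cells
ncells <- round((4 * n)^(2 / 5))
meancount <- n / ncells                # implied mean number of observations per cell

# Assumption: breaks taken from empirical quantiles; ties collapse some cells
probs <- seq_len(ncells - 1) / ncells
breaks <- unique(as.numeric(quantile(x, probs = probs, type = 1)))
cuts <- c(-Inf, breaks, Inf)

# Observed and theoretical counts in the cells (a, b]
obs <- as.vector(table(cut(x, cuts)))
expct <- n * diff(ppois(cuts, lambda_hat))

chisq <- sum((obs - expct)^2 / expct)
df <- length(obs) - 1 - 1              # number of cells - number of parameters - 1
pvalue <- if (df > 0) pchisq(chisq, df = df, lower.tail = FALSE) else NA
```

Because the breaks come from the data rather than from the fitted distribution, the same cells could be reused to compare another distribution fitted to x.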
For continuous distributions, Kolmogorov-Smirnov, Cramer-von Mises and Anderson-Darling statistics are also computed, as defined by Stephens (1986).
An approximate Kolmogorov-Smirnov test is performed by assuming that the distribution parameters are known. The critical value defined by Stephens (1986) for a completely specified distribution is used to decide whether to reject the distribution at the significance level 0.05. Because of this approximation, the result of the test (the decision to reject the distribution or not) is returned only for data sets with more than 30 observations. Note that this approximate test may be too conservative.
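A minimal base R sketch of such an approximate test follows. The fixed Weibull parameters stand in for fitted estimates treated as known, and the constant 1.358 with its sample-size correction is the 5% point commonly quoted for a fully specified distribution; both are assumptions here and should be checked against Stephens (1986) rather than taken as the package's code.

```r
# Illustrative sketch only; parameter values and the critical value are assumptions.
set.seed(42)
x <- rweibull(100, shape = 2, scale = 1)   # hypothetical data
n <- length(x)
s <- sort(x)
i <- seq_len(n)

# Theoretical CDF at the ordered observations, parameters treated as known
Fth <- pweibull(s, shape = 2, scale = 1)

# Kolmogorov-Smirnov statistic: largest gap between empirical and theoretical CDF
ks <- max(max(i / n - Fth), max(Fth - (i - 1) / n))

# Assumed 5% critical value for a completely specified distribution
crit <- 1.358 / (sqrt(n) + 0.12 + 0.11 / sqrt(n))

# Decision reported only for more than 30 observations, as described above
decision <- if (n > 30) ifelse(ks > crit, "rejected", "not rejected") else NA
```

When the parameters are in fact estimated from the same data, the statistic tends to be smaller than under a fully specified distribution, which is why the text warns that the test may be too conservative.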
For data sets with more than 5 observations and for distributions for
which the test is described by Stephens (1986) for maximum likelihood estimations
("exp", "cauchy", "gamma" and "weibull"),
the Cramer-von Mises and Anderson-Darling tests are performed as described by Stephens (1986).
Those tests take into
account the fact that the parameters are not known but estimated from the data by maximum likelihood.
The result is the
decision to reject or not the distribution at the significance level 0.05. Those tests are available
only for maximum likelihood estimations.
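For orientation, the two statistics can be computed in base R from their usual textbook definitions. This sketch uses an exponential distribution fitted by maximum likelihood to simulated data (both assumptions); it does not reproduce the critical values tabulated by Stephens (1986), so it yields statistics only, not test decisions.

```r
# Illustrative sketch only; not the internal code of gofstat().
set.seed(7)
x <- rexp(50, rate = 2)                 # hypothetical data
n <- length(x)
rate_hat <- 1 / mean(x)                 # maximum likelihood estimate of the rate

s <- sort(x)
i <- seq_len(n)
Fth <- pexp(s, rate = rate_hat)         # fitted CDF at the ordered observations

# Cramer-von Mises statistic W^2
cvm <- 1 / (12 * n) + sum((Fth - (2 * i - 1) / (2 * n))^2)

# Anderson-Darling statistic A^2
ad <- -n - mean((2 * i - 1) * (log(Fth) + log(1 - rev(Fth))))
```

The Anderson-Darling statistic weights the tails more heavily than the Cramer-von Mises statistic, which is visible in the logarithmic terms.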
Only recommended statistics are automatically printed, i.e.
Cramer-von Mises, Anderson-Darling and Kolmogorov statistics for continuous distributions and
Chi-squared statistics for discrete ones ("binom", "nbinom", "geom", "hyper" and "pois").
Results of the tests are not printed but stored in the output of the function.
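Since the test decisions are stored rather than printed, they have to be read from the returned object. The sketch below assumes the fitdistrplus package is installed and uses simulated gamma data; the component names are those listed in this page's description of the returned value.

```r
# Illustrative sketch; runs only if fitdistrplus is available.
if (requireNamespace("fitdistrplus", quietly = TRUE)) {
  library(fitdistrplus)
  set.seed(123)
  x <- rgamma(100, shape = 2, rate = 1)   # hypothetical data
  fit <- fitdist(x, "gamma")              # maximum likelihood fit by default
  g <- gofstat(fit)

  print(g)            # shows only the recommended statistics
  g$kstest            # stored decision of the approximate Kolmogorov-Smirnov test
  g$cvmtest           # stored decision of the Cramer-von Mises test
  g$adtest            # stored decision of the Anderson-Darling test
  g$chisqpvalue       # Chi-squared p-value, not printed for continuous fits
}
```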
gofstat returns an object of class "gofstat.fitdist" with the following components:
chisq 
a named vector with the Chi-squared statistics or NULL if not computed 
chisqbreaks 
common breaks used to define cells in the Chi-squared statistic 
chisqpvalue 
a named vector with the p-values of the Chi-squared statistic or NULL if not computed 
chisqdf 
a named vector with the degrees of freedom of the Chi-squared distribution or NULL if not computed 
chisqtable 
a table with observed and theoretical counts used for the Chi-squared calculations 
cvm 
a named vector with the Cramer-von Mises statistics or NULL if not computed 
cvmtest 
a named vector with the decisions of the Cramer-von Mises test or NULL if not computed 
ad 
a named vector with the Anderson-Darling statistics or NULL if not computed 
adtest 
a named vector with the decisions of the Anderson-Darling test or NULL if not computed 
ks 
a named vector with the Kolmogorov-Smirnov statistics or NULL if not computed 
kstest 
a named vector with the decisions of the Kolmogorov-Smirnov test or NULL if not computed 
aic 
a named vector with the values of Akaike's Information Criterion. 
bic 
a named vector with the values of the Bayesian Information Criterion. 
discrete 
the input argument or the automatic definition by the function from the first
object of class "fitdist" of the list in input. 
nbfit 
Number of fits in argument. 
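As a reminder of what the aic and bic components measure, the standard definitions of the two criteria can be evaluated by hand in base R. The exponential model and the simulated data are assumptions chosen to keep the sketch short; it is not taken from the package.

```r
# Illustrative sketch of the standard AIC and BIC definitions.
set.seed(99)
x <- rexp(80, rate = 0.5)               # hypothetical data
n <- length(x)
k <- 1                                  # number of estimated parameters
rate_hat <- 1 / mean(x)                 # maximum likelihood estimate

loglik <- sum(dexp(x, rate = rate_hat, log = TRUE))
aic <- -2 * loglik + 2 * k              # Akaike's Information Criterion
bic <- -2 * loglik + log(n) * k         # Bayesian Information Criterion
```

Lower values indicate a better trade-off between fit and number of parameters when comparing distributions fitted to the same data.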
Marie-Laure Delignette-Muller and Christophe Dutang.
Cullen AC and Frey HC (1999), Probabilistic techniques in exposure assessment. Plenum Press, USA, pp. 81-155.
Stephens MA (1986), Tests based on edf statistics. In Goodness-of-fit techniques (D'Agostino RB and Stephens MA, eds), Marcel Dekker, New York, pp. 97-194.
Venables WN and Ripley BD (2002), Modern applied statistics with S. Springer, New York, pp. 435-446.
Vose D (2000), Risk analysis, a quantitative guide. John Wiley & Sons Ltd, Chichester, England, pp. 99-143.
Delignette-Muller ML and Dutang C (2015), fitdistrplus: An R Package for Fitting Distributions. Journal of Statistical Software, 64(4), 1-34.
fitdist.
library(fitdistrplus)

# (1) fit of two distributions to the serving size data
# by maximum likelihood estimation
# and comparison of goodness-of-fit statistics
#
data(groundbeef)
serving <- groundbeef$serving
(fitg <- fitdist(serving, "gamma"))
gofstat(fitg)
(fitln <- fitdist(serving, "lnorm"))
gofstat(fitln)
gofstat(list(fitg, fitln))

# (2) fit of two discrete distributions to toxocara data
# and comparison of goodness-of-fit statistics
#
data(toxocara)
number <- toxocara$number
fitp <- fitdist(number, "pois")
summary(fitp)
plot(fitp)
fitnb <- fitdist(number, "nbinom")
summary(fitnb)
plot(fitnb)
gofstat(list(fitp, fitnb), fitnames = c("Poisson", "negbin"))

# (3) Use of Chi-squared results in addition to
# recommended statistics for continuous distributions
#
set.seed(1234)
x4 <- rweibull(n = 1000, shape = 2, scale = 1)
# fit of the good distribution
f4 <- fitdist(x4, "weibull")
# fit of a bad distribution
f4b <- fitdist(x4, "cauchy")
gofstat(list(f4, f4b), fitnames = c("Weibull", "Cauchy"))