Fit distributions via linear moments

Description

Fit several distributions via linear moments (L-moments), then plot a histogram with the distribution densities, or the ecdf with cumulative probabilities. Goodness of fit values are returned as well. This is the main fitting function; it calls distLgof and either distLgofPlot or distLplot.

Usage

distLfit(dat, datname, speed = TRUE, ks = FALSE, selection = NULL,
  gofProp = 1, weightc = NA, truncate = 0,
  threshold = berryFunctions::quantileMean(dat, truncate), gofComp = FALSE,
  progbars = length(dat) > 200, time = TRUE, plot = TRUE, cdf = FALSE,
  legargs = NULL, histargs = NULL, quiet = FALSE, ssquiet = quiet, ...)

Arguments

dat

Vector with values

datname

Character string for main, xlab etc. DEFAULT: internal substitute(dat)

speed

If TRUE, several distributions are omitted, for the reasons shown in lmomco::dist.list(). DEFAULT: TRUE

ks

Include ks.test results in dlf$gof? Computing is much faster when FALSE. DEFAULT: FALSE

selection

Selection of distributions. Character vector with types as in lmom2par. Overrides speed. DEFAULT: NULL

gofProp

Upper proportion (0 to 1) of dat used to compute goodness of fit (dist / ecdf). This makes it possible to focus on the tail of the distribution. DEFAULT: 1

weightc

Named custom weights for each distribution, see distLgof. DEFAULT: NA

truncate

Number between 0 and 1. Censored peak-over-threshold (POT) fitting as in distLquantile: fit to the highest values only, truncating the lower proportion of dat. Probabilities are adjusted accordingly. DEFAULT: 0

threshold

POT cutoff value. If you want correct percentiles, set this only via truncate, see Details of q_gpd. DEFAULT: berryFunctions::quantileMean(dat, truncate)

gofComp

If TRUE, plots a comparison of the ranks of different GOF-methods and sets plot to FALSE. DEFAULT: FALSE

progbars

Show progress bars for each loop? DEFAULT: TRUE if length(dat) > 200

time

Report execution time via message? DEFAULT: TRUE

plot

Should a histogram with densities be plotted? DEFAULT: TRUE

cdf

If TRUE, plot the cumulative distribution function instead of the probability density. DEFAULT: FALSE

legargs

List of arguments passed to legend except for legend and col. DEFAULT: NULL

histargs

List of arguments passed to hist except for x, breaks, col, xlim, freq. DEFAULT: NULL

quiet

Suppress notes? DEFAULT: FALSE

ssquiet

Suppress sample size notes? DEFAULT: quiet

...

Further arguments passed to distLplot if they are accepted there, else passed to lines, like lty, type, pch, ...
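
How truncate, threshold and the probability adjustment relate can be sketched in base R. This is a minimal illustration and not the extremeStat code: stats::quantile stands in for berryFunctions::quantileMean, adjust_prob is a hypothetical helper, and the rescaling (p - truncate) / (1 - truncate) is an assumption about what "adjusted accordingly" means.

```r
# Sketch only: base R, not the extremeStat implementation.
set.seed(1)
dat <- rexp(500, rate = 0.1)   # illustrative sample
truncate <- 0.8                # drop the lowest 80 percent of the values

# Threshold as an empirical quantile. The package default uses
# berryFunctions::quantileMean; stats::quantile is a stand-in here.
threshold <- unname(quantile(dat, probs = truncate))
tail_dat  <- dat[dat >= threshold]

# Assumed rescaling (hypothetical helper): a probability p of the full
# sample corresponds to (p - truncate) / (1 - truncate) in the tail sample.
adjust_prob <- function(p, truncate) (p - truncate) / (1 - truncate)

p_full <- c(0.9, 0.99)
p_tail <- adjust_prob(p_full, truncate)
```

Under this assumption, a quantile requested for the full data is read from the tail fit at the rescaled probability, which is why setting threshold independently of truncate would break the correspondence.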

Details

Parameters are fitted via lmom2par in the package lmomco.
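
To illustrate the idea behind fitting via L-moments, the following base R sketch computes the first three sample L-moments from probability weighted moments and derives Gumbel parameters from them. The function names sample_lmom and fit_gumbel are hypothetical and written for this illustration; in the package this work is delegated to lmomco, which supports many more distributions.

```r
# Sketch only: hand-written L-moment fit, not the lmomco implementation.

# First three sample L-moments via unbiased probability weighted moments.
sample_lmom <- function(x) {
  x <- sort(x)
  n <- length(x)
  i <- seq_len(n)
  b0 <- mean(x)
  b1 <- sum((i - 1) / (n - 1) * x) / n
  b2 <- sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
  c(l1 = b0, l2 = 2 * b1 - b0, l3 = 6 * b2 - 6 * b1 + b0)
}

# Gumbel parameters from the first two L-moments:
# scale = l2 / log(2), location = l1 - gamma * scale (gamma: Euler's constant).
fit_gumbel <- function(lm) {
  euler_gamma <- -digamma(1)
  scale <- unname(lm["l2"]) / log(2)
  c(location = unname(lm["l1"]) - euler_gamma * scale, scale = scale)
}

lm_demo  <- sample_lmom(1:5)      # symmetric toy data, so l3 is zero
par_demo <- fit_gumbel(lm_demo)
```

Because L-moments are linear combinations of the ordered values, they react less strongly to single extreme observations than conventional moments do.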

Value

List as explained in extremeStat.

Author(s)

Berry Boessenkool, berry-b@gmx.de, Sept 2014 + July 2015

See Also

distLgof, distLplot. fevd in the package extRemes, fitdistr in the package MASS.

Examples

data(annMax)
# basic usage on real data (annual discharge maxima in Austria)
dlf <- distLfit(annMax)
str(dlf, max.lev=2)
distLprint(dlf)

# arguments that can be passed:
distLfit(annMax, lty=2, col=3, legargs=list(lwd=3), main="booh!")
set.seed(42)
dlf_b <- distLfit(rbeta(100, 5, 2), nbest=10, legargs=c(x="left"))
distLplot(dlf_b, selection=c("gpa", "glo", "gev", "wak"))
distLplot(dlf_b, selection=c("gpa", "glo", "gev", "wak"), order=TRUE)
distLplot(dlf_b, coldist=c("orange",3:6), lty=1:3) # lty is recycled
distLplot(dlf_b, cdf=TRUE)
distLplot(dlf_b, cdf=TRUE, histargs=list(do.points=FALSE), sel="nor")


# Goodness of Fit is computed by RMSE, see first example of ?distLgof

# logarithmic axes:
set.seed(1)
y <- 10^rnorm(100, mean=2, sd=0.3) # if you use 1e4, distLgof will be much slower
hist(y, breaks=20)
berryFunctions::logHist(y, col=8)
dlf <- distLfit(log10(y), breaks=50)
distLplot(dlf, breaks=50, log=TRUE)

## Not run: 
# this takes a while, as it tries to fit all 30 distributions:
d_all <- distLfit(annMax, gofProp=1, speed=FALSE, plot=FALSE) # 35 sec
distLprint(d_all)
distLplot(d_all, nbest=22, coldist=grey(1:22/29), xlim=c(20,140))
distLplot(d_all, nbest=22, histargs=list(ylim=c(0,0.04)), xlim=c(20,140))
d_all$gof

## End(Not run)
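
The RMSE-based goodness of fit mentioned in the examples, restricted to an upper proportion of the data as with gofProp, can be sketched in base R. The helper rmse_upper is hypothetical and only approximates the idea (RMSE between the fitted CDF and the ecdf on the largest values); see ?distLgof for what the package actually computes.

```r
# Sketch only: base R approximation, not the distLgof implementation.

# RMSE between a fitted CDF and the ecdf, using only the upper
# proportion gofProp of the sorted data (hypothetical helper).
rmse_upper <- function(dat, cdf_fun, gofProp = 1) {
  x     <- sort(dat)
  x_top <- x[x >= quantile(x, probs = 1 - gofProp)]
  emp   <- ecdf(dat)(x_top)    # empirical probabilities from the full sample
  theo  <- cdf_fun(x_top)      # probabilities from the candidate distribution
  sqrt(mean((theo - emp)^2))
}

set.seed(42)
dat <- rnorm(200, mean = 50, sd = 10)

# Two candidates: a sensible normal fit and a deliberately shifted one.
cdf_good <- function(q) pnorm(q, mean = mean(dat), sd = sd(dat))
cdf_bad  <- function(q) pnorm(q, mean = mean(dat) + 15, sd = sd(dat))

rmse_good <- rmse_upper(dat, cdf_good, gofProp = 0.2)
rmse_bad  <- rmse_upper(dat, cdf_bad,  gofProp = 0.2)
```

A smaller RMSE indicates a closer match in the selected tail; ranking candidate distributions by this value is the kind of comparison stored in dlf$gof.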