loglogSurv: Survival function plots for checking the tail behaviour of...

View source: R/FitTail.R

Description

The log-log survival functions are designed for checking the tails of a single response variable (no explanatory variables should be involved). There are three different functions:

a) the function loglogSurv1(), which plots the right tail of the empirical log-log survival function against log(log(y)), where y is the variable of interest. The coefficients of a linear fit to the plot can be used as estimates for Type I tails (see Chapter 17 in Rigby et al. (2019) for the definition of the different types of tails).

b) the function loglogSurv2(), which plots the right tail of the empirical log-log survival function against log(y). The coefficients of a linear fit to the plot can be used as estimates for Type II tails.

c) the function loglogSurv3(), which plots the (left or right) tail of the empirical log-log survival function against y. The coefficients of a linear fit to the plot can be used as estimates for Type III tails.

The function loglogSurv() combines all the above functions.

The function logSurv() is designed for exploring the heavy tails of a single response variable. It plots the empirical log-survival function of the right tail of the distribution, or the empirical log-cdf function of the left tail, against log(y) for a specified tail probability. It then fits a linear, a quadratic and an exponential curve to the points of the plot. For distributions defined on the positive real line, a good linear fit indicates a Pareto-type tail, a good quadratic fit a log-normal-type tail and a good exponential fit a Weibull-type tail. Note that this function is only appropriate for investigating rather heavy tails, and it is not as good as loglogSurv() at discriminating between different types of tails. The function logSurv0() plots the points but does not fit the curves.
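The link between the three curve shapes and the three tail types can be seen by writing the log-survival function in terms of x = log(y). The following is a short sketch using the standard textbook parameterisations (the symbols y_m, alpha, lambda, k, mu and sigma are introduced here for illustration and are not taken from the package):

```latex
\begin{align*}
\text{Pareto:}\quad
  & S(y) = \left(\frac{y_m}{y}\right)^{\alpha}
  && \Rightarrow\quad \log S(y) = \alpha \log y_m - \alpha x
  && \text{(linear in } x\text{)} \\
\text{Weibull:}\quad
  & S(y) = \exp\!\left\{-\left(\frac{y}{\lambda}\right)^{k}\right\}
  && \Rightarrow\quad \log S(y) = -\lambda^{-k} \exp(k x)
  && \text{(exponential in } x\text{)} \\
\text{Log-normal:}\quad
  & S(y) = 1 - \Phi\!\left(\frac{x - \mu}{\sigma}\right)
  && \Rightarrow\quad \log S(y) \approx -\frac{(x - \mu)^{2}}{2\sigma^{2}}
  \ \text{ for large } y
  && \text{(quadratic in } x \text{, to leading order)}
\end{align*}
```

The log-normal line uses only the leading term of the normal tail approximation, so the quadratic shape is an asymptotic statement rather than an exact identity.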

The function loglogplot() plots the empirical log-survival function of all the data against log(y). The function ECDF() calculates the empirical cumulative distribution function. It is similar to ecdf() but divides by n+1 rather than n, the number of observations.
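The effect of the n+1 divisor can be illustrated with a small base R sketch. The helper ecdf_nplus1() below is hypothetical and written only for this illustration; it is not the gamlss source of ECDF(), it ignores the weights argument, and it assumes there are no tied observations.

```r
# Hypothetical helper: an ECDF that divides by n + 1 instead of n.
# It follows the description above; it is not the gamlss implementation.
ecdf_nplus1 <- function(y) {
  y <- sort(y)
  n <- length(y)
  # Right-continuous step function jumping by 1 / (n + 1) at each observation
  stepfun(y, (0:n) / (n + 1))
}

y  <- c(2.5, 0.7, 4.1, 1.3)
F1 <- ecdf_nplus1(y)   # divisor n + 1
F0 <- ecdf(y)          # base R, divisor n

F1(max(y))   # 4 / 5, so the survival estimate 1 - F stays positive
F0(max(y))   # 1, so log(1 - F) would be -Inf at the largest observation
```

With the n divisor the largest observation has an estimated survival probability of zero, which cannot be shown on a log scale; this is one plausible motivation for the n+1 variant when plotting log-survival functions.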

Usage

loglogSurv(y, prob = 0.9, print = TRUE, title = NULL, lcol = gray(0.1),
           ltype = 1, plot = TRUE, ...)

loglogSurv1(y, prob = 0.9, print = TRUE, title = NULL, lcol = gray(0.1),
            ltype = 1, ...)

loglogSurv2(y, prob = 0.9, print = TRUE, title = NULL, lcol = gray(0.1),
            ltype = 1, ...)

loglogSurv3(y, prob = 0.9, print = TRUE, title = NULL, lcol = gray(0.1),
            ltype = 1, ...)

logSurv(y, prob = 0.9, tail = c("right", "left"), plot = TRUE,
        lines = TRUE, print = TRUE, title = NULL, lcol = c(gray(0.1),
        gray(0.2), gray(0.3)), ltype = c(1, 2, 3), ...)

logSurv0(y, prob = 0.9, tail = c("right", "left"), plot = TRUE,
         title = NULL, ...)

ECDF(y, weights = NULL)

loglogplot(y, nplus1 = TRUE, ...)

Arguments

y

a vector, the variable of interest

prob

the probability defining the tail. The default is 0.90, which means 10% for the "right" tail and 90% for the "left" tail

tail

which tail needs checking: the right (default) or the left

plot

whether to plot; the default is TRUE

print

whether to print the coefficients; the default is TRUE

title

a title to use if one different from the default is needed

lcol

The line colour in the plot

lines

whether to plot the fitted lines

ltype

The line type in the plot

nplus1

whether to divide by n+1 or n when calculating the ecdf

weights

prior weights for ECDF()

...

for extra arguments passed to the plot command

Details

The functions loglogSurv1(), loglogSurv2() and loglogSurv3() take the upper part of an ordered variable, create its empirical survival function, and plot the log-log of this function against log(log(y)), log(y) and y, respectively. They then fit a line to the plot. The coefficients of the line can be interpreted as parameters determining the behaviour of the tail. The function loglogSurv() fits all three models and displays the best.
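The steps just described can be sketched in base R. This is a minimal illustration under stated assumptions, not the package code: the data are simulated, the empirical CDF uses the n+1 divisor, the log-log survival function is taken to be log(-log(S)), ordinary lm() stands in for whatever fitting routine the package uses, and "best" is read as the smallest error sum of squares.

```r
set.seed(123)
# Simulated positive data (an assumption, for illustration only)
y <- rweibull(2000, shape = 1.5, scale = 10)
n <- length(y)

# Empirical CDF with divisor n + 1, evaluated at each observation
Fhat <- rank(y) / (n + 1)

# Keep the upper 10% of the sample, as with prob = 0.9
keep  <- y > quantile(y, probs = 0.9)
ytail <- y[keep]

# Empirical log-log survival function, taken here as log(-log(S))
lls <- log(-log(1 - Fhat[keep]))

# One straight-line fit per tail type
fits <- list(
  typeI   = lm(lls ~ log(log(ytail))),
  typeII  = lm(lls ~ log(ytail)),
  typeIII = lm(lls ~ ytail)
)

# Compare the error sums of squares; the smallest marks the best line
sse <- sapply(fits, function(m) sum(resid(m)^2))
print(sse)
names(which.min(sse))
```

Note that the Type I regressor log(log(y)) is only defined for y > 1, so the sketch relies on the tail values being larger than 1.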

The function logSurv() takes the upper (or lower) part of an ordered variable and plots the log empirical survival function against log(y). It also displays three curves, i) linear, ii) quadratic and iii) exponential, to help determine what kind of tail relationship exists. A plot of the log empirical survival function against log(y) is often called the "log-log plot" in the literature.
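A rough base R sketch of the right-tail case follows. It is an illustration only: the Pareto-type data are simulated, alpha is a made-up parameter, lm() is used for the fits, and only the linear and quadratic curves are shown (the exponential curve is left out because its exact form in the package is not stated here).

```r
set.seed(42)
# Simulated Pareto-type data by inverse transform (illustration only):
# S(y) = y^(-alpha) for y >= 1, with a hypothetical alpha = 2
alpha <- 2
y <- runif(5000)^(-1 / alpha)
n <- length(y)

# Empirical CDF with divisor n + 1, then the upper 10% of the sample
Fhat <- rank(y) / (n + 1)
keep <- y > quantile(y, probs = 0.9)
x    <- log(y[keep])
logS <- log(1 - Fhat[keep])

# Linear and quadratic fits of log-survival against log(y)
lin  <- lm(logS ~ x)
quad <- lm(logS ~ x + I(x^2))

coef(lin)   # for Pareto-type data the slope is expected to be near -alpha
c(linear = sum(resid(lin)^2), quadratic = sum(resid(quad)^2))
```

To see the picture, plot(x, logS) followed by abline(lin) draws the points and the fitted line. When the quadratic term barely reduces the error sum of squares, the linear (Pareto-type) description of the tail is a reasonable reading.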

The function loglogplot() plots the whole log empirical survival function against log(y) (not just the tail). The function ECDF() calculates the step function of the empirical cumulative distribution function.

More details can be found in Chapter 17 of Rigby et al. (2019), an older version of which can be found at https://www.gamlss.com/.

Value

The functions create plots.

Author(s)

Bob Rigby, Mikis Stasinopoulos and Vlassios Voudouris

References

Rigby, R. A. and Stasinopoulos, D. M. (2005) Generalized additive models for location, scale and shape (with discussion), Appl. Statist., 54, part 3, pp. 507-554.

Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found in https://www.gamlss.com/.

Stasinopoulos, D. M. and Rigby, R. A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R, Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.

Stasinopoulos, D. M., Rigby, R. A., Heller, G., Voudouris, V., and De Bastiani, F. (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.

(see also https://www.gamlss.com/).

Examples

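library(gamlss)  # also loads gamlss.data, which provides film90
data(film90)
y <- film90$lborev1
# the three tail-type plots, stacked in one column
op <- par(mfrow = c(3, 1))
loglogSurv1(y)
loglogSurv2(y)
loglogSurv3(y)
par(op)
# fit all three types and display the best
loglogSurv(y)

# log-survival plot of the right tail with the fitted curves
logSurv(y)

# log-survival plot of the whole sample
loglogplot(y)

plot(ECDF(y), main = "ECDF")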

Example output

Loading required package: splines
Loading required package: gamlss.data

Attaching package: 'gamlss.data'

The following object is masked from 'package:datasets':

    sleep

Loading required package: gamlss.dist
Loading required package: MASS
Loading required package: nlme
Loading required package: parallel
 **********   GAMLSS Version 5.1-3  ********** 
For more on GAMLSS look at http://www.gamlss.org/
Type gamlssNews() to see new features/changes/bug fixes.

coefficients -27.68429 27.28962 
error sum of squares 0.1028886 
coefficients -26.027 9.447261 
error sum of squares 0.0929632 
coefficients -8.179461 0.5252454 
error sum of squares 0.0749298 
Linear regression coefficients 
          Intercept      slope  Error SS
type I   -27.684292  27.289616   0.10289
type II  -26.026996   9.447261   0.09296
type III  -8.179461   0.525245   0.07493
Estimates for parameters k 
          k:2,4,6 k:1,3,5
type I   9.48e-13   27.29
type II  4.97e-12    9.45
type III 2.80e-04    0.53

Family:  c("NO", "Normal") 
Fitting method: RS() 

Call:  gamlss(formula = newY ~ x3, trace = FALSE) 

Mu Coefficients:
(Intercept)           x3  
    -8.1795       0.5252  
Sigma Coefficients:
(Intercept)  
     -4.295  

 Degrees of Freedom for the fit: 3 Residual Deg. of Freedom   400 
Global Deviance:     -2318.16 
            AIC:     -2312.16 
            SBC:     -2300.17 

gamlss documentation built on March 31, 2021, 5:10 p.m.