# loglogSurv: Survival function plots for checking the tail behaviour of... In gamlss: Generalised Additive Models for Location Scale and Shape

## Description

The log-log survival functions are designed for checking the tails of a single response variable (no explanatory variables should be involved). There are three different functions:

a) the function `loglogSurv1()`, which plots the right tail of the empirical log-log survival function against `log(log(y))`, where `y` is the variable of interest. The coefficients of a linear fit to the plot can be used as estimates for Type I tails (see Chapter 17 in Rigby et al. (2019) for the definition of the different types of tails).

b) the function `loglogSurv2()`, which plots the right tail of the empirical log-log survival function against `log(y)`. The coefficients of a linear fit to the plot can be used as estimates for Type II tails.

c) the function `loglogSurv3()`, which plots the (left or right) tail of the empirical log-log survival function against `y`. The coefficients of a linear fit to the plot can be used as estimates for Type III tails.

The function `loglogSurv()` combines all the above functions.

The function `logSurv()` is designed for exploring the heavy tails of a single response variable. It plots the empirical log-survival function of the right tail of the distribution, or the empirical log-cdf function of the left tail, against `log(y)` for a specified probability of the tail. It then fits a linear, a quadratic and an exponential curve to the points of the plot. For distributions defined on the positive real line, a good linear fit would indicate a Pareto type tail, a good quadratic fit a log-normal type tail and a good exponential fit a Weibull type tail. Note that this function is appropriate only for investigating rather heavy tails, and it is not as good as `loglogSurv()` at discriminating between different types of tails. The function `logSurv0()` produces the plot but does not fit the curves.

The function `loglogplot()` plots the empirical log-survival function of all the data against `log(y)`. The function `ECDF()` calculates the empirical cumulative distribution function. It is similar to `ecdf()` but divides by `n+1` rather than `n`, the number of observations.
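As a rough illustration of the `n+1` convention, the base-R sketch below compares `stats::ecdf()` with a rank-based estimate divided by `n + 1`. It does not call `ECDF()` itself; the simulated data and the variable names are assumptions made for the example, and ties are not handled.

```r
# Base-R sketch: empirical CDF with denominator n versus n + 1.
# Simulated data, for illustration only (continuous, so no ties).
set.seed(1)
y <- rexp(10)
n <- length(y)

F_n  <- ecdf(y)(y)          # stats::ecdf: rank divided by n
F_n1 <- rank(y) / (n + 1)   # rank divided by n + 1

# With denominator n the largest observation gets F = 1, so the
# log of the survival function 1 - F is -Inf at that point.
log(1 - max(F_n))

# With denominator n + 1 every survival value stays strictly positive.
all(is.finite(log(1 - F_n1)))
```

One practical consequence is that the log-survival plots described here remain defined for the largest observation when the `n + 1` denominator is used.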

## Usage

```
loglogSurv(y, prob = 0.9, print = TRUE, title = NULL, lcol = gray(0.1),
           ltype = 1, plot = TRUE, ...)

loglogSurv1(y, prob = 0.9, print = TRUE, title = NULL, lcol = gray(0.1),
            ltype = 1, ...)

loglogSurv2(y, prob = 0.9, print = TRUE, title = NULL, lcol = gray(0.1),
            ltype = 1, ...)

loglogSurv3(y, prob = 0.9, print = TRUE, title = NULL, lcol = gray(0.1),
            ltype = 1, ...)

logSurv(y, prob = 0.9, tail = c("right", "left"), plot = TRUE,
        lines = TRUE, print = TRUE, title = NULL,
        lcol = c(gray(0.1), gray(0.2), gray(0.3)),
        ltype = c(1, 2, 3), ...)

logSurv0(y, prob = 0.9, tail = c("right", "left"), plot = TRUE,
         title = NULL, ...)

ECDF(y, weights = NULL)

loglogplot(y, nplus1 = TRUE, ...)
```

## Arguments

| Argument | Description |
|---|---|
| `y` | a vector, the variable of interest |
| `prob` | the probability used to define the tail. The default is 0.90, which means 10% for the "right" tail and 90% for the "left" tail |
| `tail` | which tail needs checking: the right (default) or the left |
| `plot` | whether to plot, with default `TRUE` |
| `print` | whether to print the coefficients, with default `TRUE` |
| `title` | a title to use instead of the default |
| `lcol` | the line colour in the plot |
| `lines` | whether to plot the fitted lines |
| `ltype` | the line type in the plot |
| `nplus1` | whether to divide by `n+1` or `n` when calculating the ecdf |
| `weights` | prior weights for `ECDF()` |
| `...` | extra arguments for the plot command |

## Details

The functions `loglogSurv1()`, `loglogSurv2()` and `loglogSurv3()` take the upper part of an ordered variable, create its empirical survival function, and plot the log-log of this function against `log(log(y))`, `log(y)` and `y`, respectively. They then fit a line to the plot. The coefficients of the line can be interpreted as parameters determining the behaviour of the tail. The function `loglogSurv()` fits all three models and displays the best.
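The steps above can be sketched in base R as follows. This is not the package code: the way the upper part is selected (observations above the `prob` empirical quantile), the `n + 1` denominator, the use of `lm()` and the simulated Weibull data are all assumptions made for illustration.

```r
# Base-R sketch of the three log-log survival fits (illustrative only).
set.seed(123)
y    <- rweibull(2000, shape = 1.5, scale = 10)  # simulated data
prob <- 0.9
n    <- length(y)

F_hat   <- rank(y) / (n + 1)            # empirical cdf, n + 1 denominator
in_tail <- y > quantile(y, probs = prob)  # assumed rule for the upper part
ty      <- y[in_tail]
newY    <- log(-log(1 - F_hat[in_tail]))  # log-log of the survival function

# One straight-line fit per x-scale (log(log(y)) requires y > 1).
fit1 <- lm(newY ~ log(log(ty)))  # scale used by loglogSurv1()
fit2 <- lm(newY ~ log(ty))       # scale used by loglogSurv2()
fit3 <- lm(newY ~ ty)            # scale used by loglogSurv3()

# Error sum of squares of each fit; a smaller value means a straighter plot.
sse <- c(typeI   = sum(resid(fit1)^2),
         typeII  = sum(resid(fit2)^2),
         typeIII = sum(resid(fit3)^2))
sse
coef(fit2)
```

For a Weibull variable, log(-log S(y)) is exactly linear in `log(y)` with slope equal to the shape parameter, so the `log(y)` scale would be expected to give a nearly straight plot here.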

The function `logSurv()` takes the upper (or lower) part of an ordered variable and plots the log empirical survival function against `log(y)`. It also displays three curves, i) linear, ii) quadratic and iii) exponential, to help determine what kind of tail relationship exists. Plotting the log empirical survival function against `log(y)` is often called the "log-log plot" in the literature.
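A minimal base-R sketch of the idea for the right tail is given below. It is not the `logSurv()` implementation: the simulated Pareto data, the quantile-based tail selection and the `lm()` fits are assumptions, and the exponential curve is left out because its exact form is not given here.

```r
# Base-R sketch: log survival against log(y), with linear and quadratic fits.
set.seed(42)
alpha <- 2
y <- runif(5000)^(-1 / alpha)   # Pareto(alpha) with minimum 1 (inverse transform)
n <- length(y)

S_hat   <- 1 - rank(y) / (n + 1)        # empirical survival function
in_tail <- y > quantile(y, probs = 0.9)  # assumed rule for the upper 10%
lx <- log(y[in_tail])
ls <- log(S_hat[in_tail])

lin  <- lm(ls ~ lx)              # straight line: Pareto-type tail
quad <- lm(ls ~ lx + I(lx^2))    # quadratic: log-normal-type tail

plot(lx, ls, xlab = "log(y)", ylab = "log(S(y))")
lines(sort(lx), fitted(lin)[order(lx)],  lty = 1)
lines(sort(lx), fitted(quad)[order(lx)], lty = 2)

coef(lin)   # for Pareto data the slope should be roughly -alpha
```

Because the Pareto survival function is S(y) = y^(-alpha) for y > 1, log S(y) is linear in `log(y)`, which is why a good linear fit points to a Pareto type tail.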

The function `loglogplot()` plots the whole log empirical survival function against `log(y)` (not just the tail). The function `ECDF()` calculates the step function of the empirical cumulative distribution function.

More details can be found in Chapter 17 of Rigby et al. (2019), an older version of which can be found at https://www.gamlss.com/.

## Value

The functions create plots.

## Author(s)

Bob Rigby, Mikis Stasinopoulos and Vlassios Voudouris

## References

Rigby, R. A. and Stasinopoulos, D. M. (2005) Generalized additive models for location, scale and shape (with discussion). Appl. Statist., 54, part 3, pp. 507-554.

Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found in https://www.gamlss.com/.

Stasinopoulos, D. M. and Rigby, R. A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.

Stasinopoulos, D. M., Rigby, R. A., Heller, G., Voudouris, V., and De Bastiani, F. (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.

## Examples

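```
library(gamlss)   # also attaches gamlss.data, which provides film90
data(film90)
y <- film90$lborev1

# the three log-log survival plots, stacked in one device
op <- par(mfrow = c(3, 1))
loglogSurv1(y)
loglogSurv2(y)
loglogSurv3(y)
par(op)

# fit all three and display the best
loglogSurv(y)

# log-survival plot of the tail with fitted curves
logSurv(y)

# log-survival plot of all the data, and the empirical cdf
loglogplot(y)
plot(ECDF(y), main = "ECDF")
```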

### Example output

```
Loading required package: splines

Attaching package: 'gamlss.data'

The following object is masked from 'package:datasets':

sleep

**********   GAMLSS Version 5.1-3  **********
For more on GAMLSS look at http://www.gamlss.org/
Type gamlssNews() to see new features/changes/bug fixes.

coefficients -27.68429 27.28962
error sum of squares 0.1028886
coefficients -26.027 9.447261
error sum of squares 0.0929632
coefficients -8.179461 0.5252454
error sum of squares 0.0749298
Linear regression coefficients
          Intercept      slope  Error SS
type I   -27.684292  27.289616   0.10289
type II  -26.026996   9.447261   0.09296
type III  -8.179461   0.525245   0.07493
Estimates for parameters k
          k:2,4,6 k:1,3,5
type I   9.48e-13   27.29
type II  4.97e-12    9.45
type III 2.80e-04    0.53

Family:  c("NO", "Normal")
Fitting method: RS()

Call:  gamlss(formula = newY ~ x3, trace = FALSE)

Mu Coefficients:
(Intercept)           x3
    -8.1795       0.5252
Sigma Coefficients:
(Intercept)
     -4.295

Degrees of Freedom for the fit: 3 Residual Deg. of Freedom   400
Global Deviance:     -2318.16
AIC:     -2312.16
SBC:     -2300.17
```
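In the output above, the `k:2,4,6` column agrees numerically with `exp(Intercept)` and the `k:1,3,5` column with the rounded slope. The short check below reproduces that arithmetic from the printed coefficients; the reading of the first column as `exp(Intercept)` is inferred from the numbers and is an assumption, not a statement taken from the package source.

```r
# Coefficients copied from the printed "Linear regression coefficients" table.
intercept <- c("type I" = -27.684292, "type II" = -26.026996, "type III" = -8.179461)
slope     <- c("type I" =  27.289616, "type II" =   9.447261, "type III" =  0.525245)

# Assumed relationship: even-numbered k = exp(intercept), odd-numbered k = slope.
k_even <- exp(intercept)
k_odd  <- slope

signif(k_even, 3)   # compare with the k:2,4,6 column
round(k_odd, 2)     # compare with the k:1,3,5 column
```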

gamlss documentation built on March 31, 2021, 5:10 p.m.