# histSmo: Density estimation using the Poisson trick In gamlss: Generalised Additive Models for Location Scale and Shape

## Description

This set of functions use the old Poisson trick of discretising the data and then fitting a Poisson error model to the resulting frequencies (Lindsey, 1997). Here the model fitted is a smooth cubic spline curve. The result is a density estimator for the data.

## Usage

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15``` ```histSmo(y, lambda = NULL, df = NULL, order = 3, lower = NULL, upper = NULL, type = c("freq", "prob"), plot = FALSE, breaks = NULL, discrete = FALSE, ...) histSmoC(y, df = 10, lower = NULL, upper = NULL, type = c("freq", "prob"), plot = FALSE, breaks = NULL, discrete = FALSE, ...) histSmoO(y, lambda = 1, order = 3, lower = NULL, upper = NULL, type = c("freq", "prob"), plot = FALSE, breaks = NULL, discrete = FALSE, ...) histSmoP(y, lambda = NULL, df = NULL, order = 3, lower = NULL, upper = NULL, type = c("freq", "prob"), plot = FALSE, breaks = NULL, discrete = FALSE, ...) ```

## Arguments

 `y` the variable of interest `lambda` the smoothing parameter `df` the degrees of freedom `order` the order of the P-spline `lower` the lower limit of the y-variable `upper` the upper limit of the y-variable `type` the type of histogram `plot` whether to plot the resulting density estimator `breaks` the number of break points to be used in the histogram and consequently the number of observations in the Poisson fit `discrete` whether to treat the fitting density as a discrete distribution or not `...` further arguments passed to or from other methods.

## Details

Here are the methods used here:

i) The function `histSmoO()` uses Penalised discrete splines (Eilers, 2003). This function is appropriate when the smoothing parameter is fixed.

ii) The function `histSmoC()` uses smooth cubic splines and fits a Poison error model to the frequencies using the `cs()` additive function of GAMLSS. This function is appropriate if the effective degrees of freedom are fixed in the model.

iii) The function `histSmoP()` uses Penalised cubic splines (Eilers and Marx 1996). It is fitting a Poisson model to the frequencies using the `pb()` additive function of GAMLSS. This function is appropriate if automatic selection of the smoothing parameter is required.

iv) The function `histSmo()` combines all the above functions in the sense that if lambda is fixed it uses `histSmoO()`, if the df's are fixed it uses codehistSmoC() and if none of these is specified it uses `histSmoP()`.

## Value

Returns a `histSmo` S3 object. The object has the following components:

 `x` the middle points of the discretise data `counts` how many observation are on the discretise intervals `density` the density value for each discrete interval `hist` the `hist` object used to discretise the data `cdf` The resulting cumulative distribution function useful for calculating probabilities from the estimate density `nvcdf` The inverse cumulative distribution function `model` The fitted Poisson model only for `histSmoP()` and `histSmoC()`

## Author(s)

Mikis Stasinopoulos, Paul Eilers, Bob Rigby and Vlasios Voudouris

## References

Eilers, P. (2003). A perfect smoother. Analytical Chemistry, 75: 3631-3636.

Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties (with comments and rejoinder). Statist. Sci, 11, 89-121.

Lindsey, J.K. (1997) Applying Generalized Linear Models. New York: Springer-Verlag. ISBN 0-387-98218-3

Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.

Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found in https://www.gamlss.com/.

Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.

Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.

`pb`, `cs`

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51``` ```# creating data from Pareto 2 distribution set.seed(153) Y <- rPARETO2(1000) ## Not run: # getting the density histSmo(Y, lower=0, plot=TRUE) # more breaks a bit slower histSmo(Y, breaks=200, lower=0, plot=TRUE) # quick fit using lambda histSmoO(Y, lambda=1, breaks=200, lower=0, plot=TRUE) # or histSmo(Y, lambda=1, breaks=200, lower=0, plot=TRUE) # quick fit using df histSmoC(Y, df=15, breaks=200, lower=0,plot=TRUE) # or histSmo(Y, df=15, breaks=200, lower=0) # saving results m1<- histSmo(Y, lower=0, plot=T) plot(m1) plot(m1, "cdf") plot(m1, "invcdf") # using with a histogram library(MASS) truehist(Y) lines(m1, col="red") #--------------------------- # now gererate from SHASH distribution YY <- rSHASH(1000) m1<- histSmo(YY) # calculate Pr(YY>10) 1-m1\$cdf(10) # calculate Pr(-10

### Example output                ```Loading required package: splines

Attaching package: ‘gamlss.data’

The following object is masked from ‘package:datasets’:

sleep

**********   GAMLSS Version 5.2-0  **********
For more on GAMLSS look at https://www.gamlss.com/
Type gamlssNews() to see new features/changes/bug fixes.

Warning message:
In regularize.values(x, y, ties, missing(ties)) :
collapsing to unique 'x' values
Warning message:
In regularize.values(x, y, ties, missing(ties)) :
collapsing to unique 'x' values
 0.0125035
 0.9699186
```

gamlss documentation built on March 31, 2021, 5:10 p.m.