# histSmo: Density estimation using the Poisson trick

In gamlss: Generalised Additive Models for Location Scale and Shape

## Description

This set of functions uses the old Poisson trick of discretising the data and then fitting a Poisson error model to the resulting frequencies (Lindsey, 1997). Here the fitted model is a smooth cubic spline curve. The result is a density estimator for the data.
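The Poisson trick described above can be sketched in base R. This is a minimal illustration and not the code used inside gamlss: it assumes equal-width bins, uses simulated exponential data, and swaps in a natural cubic spline from `splines::ns()` inside `glm()` in place of the penalised smoothers used by the package. The names `y`, `h`, `mids`, `counts`, `width`, `fit` and `dens` are all illustrative.

```r
library(splines)  # ships with R; provides ns()

set.seed(1)
y <- rexp(1000)                      # illustrative data, not from the package

# Step 1: discretise the data into equal-width bins
h      <- hist(y, breaks = 50, plot = FALSE)
mids   <- h$mids                     # bin midpoints
counts <- h$counts                   # frequency in each bin
width  <- diff(h$breaks)[1]          # common bin width

# Step 2: fit a Poisson log-linear model that is smooth in the midpoints
# (assumption: ns() with 6 df stands in for the package's smoothers)
fit <- glm(counts ~ ns(mids, df = 6), family = poisson)

# Step 3: rescale the fitted frequencies so the curve integrates to one
dens <- fitted(fit) / (sum(fitted(fit)) * width)
```

Plotting `dens` against `mids` over a histogram of `y` should show a smooth curve tracking the bar heights. The number of bins controls how many observations enter the Poisson fit, which is the role the `breaks` argument plays in the functions documented here.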

## Usage

```
histSmo(y, lambda = NULL, df = NULL, order = 3, lower = NULL,
        upper = NULL, type = c("freq", "prob"),
        plot = FALSE, breaks = NULL,
        discrete = FALSE, ...)
histSmoC(y, df = 10, lower = NULL, upper = NULL, type = c("freq", "prob"),
         plot = FALSE, breaks = NULL,
         discrete = FALSE, ...)
histSmoO(y, lambda = 1, order = 3, lower = NULL, upper = NULL,
         type = c("freq", "prob"),
         plot = FALSE, breaks = NULL,
         discrete = FALSE, ...)
histSmoP(y, lambda = NULL, df = NULL, order = 3, lower = NULL,
         upper = NULL, type = c("freq", "prob"),
         plot = FALSE, breaks = NULL, discrete = FALSE, ...)
```

## Arguments

| Argument | Description |
|---|---|
| `y` | the variable of interest |
| `lambda` | the smoothing parameter |
| `df` | the degrees of freedom |
| `order` | the order of the P-spline |
| `lower` | the lower limit of the y-variable |
| `upper` | the upper limit of the y-variable |
| `type` | the type of histogram |
| `plot` | whether to plot the resulting density estimator |
| `breaks` | the number of break points to be used in the histogram and consequently the number of observations in the Poisson fit |
| `discrete` | whether to treat the fitted density as a discrete distribution or not |
| `...` | further arguments passed to or from other methods |

## Details

The functions use the following methods:

i) The function `histSmoO()` uses Penalised discrete splines (Eilers, 2003). This function is appropriate when the smoothing parameter is fixed.

ii) The function `histSmoC()` uses smooth cubic splines and fits a Poisson error model to the frequencies using the `cs()` additive function of GAMLSS. This function is appropriate if the effective degrees of freedom are fixed in the model.

iii) The function `histSmoP()` uses Penalised cubic splines (Eilers and Marx, 1996). It fits a Poisson model to the frequencies using the `pb()` additive function of GAMLSS. This function is appropriate if automatic selection of the smoothing parameter is required.

iv) The function `histSmo()` combines all the above functions: if `lambda` is fixed it uses `histSmoO()`, if `df` is fixed it uses `histSmoC()`, and if neither is specified it uses `histSmoP()`.
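The selection rule in iv) can be written out as a small sketch. The helper `choose_histSmo()` below is hypothetical and is not part of gamlss; it only returns the name of the function that the rule would pick. That `lambda` takes precedence when both `lambda` and `df` are supplied is an assumption based on the order in which the text lists the cases.

```r
# Hypothetical helper, not part of gamlss: returns the name of the fitting
# function that the selection rule described above would choose.
choose_histSmo <- function(lambda = NULL, df = NULL) {
  if (!is.null(lambda)) {
    "histSmoO"   # fixed smoothing parameter
  } else if (!is.null(df)) {
    "histSmoC"   # fixed effective degrees of freedom
  } else {
    "histSmoP"   # automatic selection of the smoothing parameter
  }
}

choose_histSmo(lambda = 1)
choose_histSmo(df = 15)
choose_histSmo()
```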

## Value

Returns a `histSmo` S3 object. The object has the following components:

| Component | Description |
|---|---|
| `x` | the middle points of the discretised data |
| `counts` | how many observations fall in each discretised interval |
| `density` | the density value for each discrete interval |
| `hist` | the `hist` object used to discretise the data |
| `cdf` | the resulting cumulative distribution function, useful for calculating probabilities from the estimated density |
| `invcdf` | the inverse cumulative distribution function |
| `model` | the fitted Poisson model, only for `histSmoP()` and `histSmoC()` |
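To illustrate how a cumulative distribution function and its inverse can be derived from midpoints and density values like those listed above, here is a rough base-R sketch. It is not the package's implementation: the inputs are a made-up grid with standard normal density values, and linear interpolation with `approxfun()` is an assumption. The names `cum_prob`, `cdf_fun` and `invcdf_fun` are illustrative.

```r
# Illustrative inputs: interval midpoints and density values on an
# equal-width grid (standard normal used purely as an example)
width   <- 0.1
x       <- seq(-3.95, 3.95, by = width)
density <- dnorm(x)

# Cumulative probability at the right-hand edge of each interval,
# rescaled so that the last value is exactly one
cum_prob <- cumsum(density * width)
cum_prob <- cum_prob / cum_prob[length(cum_prob)]

# Linear interpolation gives the cdf and, with the axes swapped, its inverse
cdf_fun    <- approxfun(x + width / 2, cum_prob, yleft = 0, yright = 1)
invcdf_fun <- approxfun(cum_prob, x + width / 2)

1 - cdf_fun(1)    # approximates Pr(X > 1)
invcdf_fun(0.5)   # approximates the median
```

In this sketch `invcdf_fun()` returns `NA` for probabilities below the first cumulative value, because `approxfun()` does not extrapolate by default.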

## Author(s)

Mikis Stasinopoulos, Paul Eilers, Bob Rigby and Vlasios Voudouris

## See Also

`pb`, `cs`

## Examples

# creating data from Pareto 2 distribution
library(gamlss)
set.seed(153)
Y <- rPARETO2(1000)
## Not run:
# getting the density
histSmo(Y, lower=0, plot=TRUE)
# more breaks a bit slower
histSmo(Y, breaks=200, lower=0, plot=TRUE)
# quick fit using lambda
histSmoO(Y, lambda=1, breaks=200, lower=0, plot=TRUE)
# or
histSmo(Y, lambda=1, breaks=200, lower=0, plot=TRUE)
# quick fit using df
histSmoC(Y, df=15, breaks=200, lower=0,plot=TRUE)
# or
histSmo(Y, df=15, breaks=200, lower=0)
# saving results
m1 <- histSmo(Y, lower=0, plot=TRUE)
plot(m1)
plot(m1, "cdf")
plot(m1, "invcdf")
# using with a histogram
library(MASS)
truehist(Y)
lines(m1, col="red")
#---------------------------
# now generate from SHASH distribution
YY <- rSHASH(1000)
m1 <- histSmo(YY)
# calculate Pr(YY>10)
1-m1$cdf(10)
# calculate Pr(-10

### Example output

```
Loading required package: splines

Attaching package: ‘gamlss.data’

The following object is masked from ‘package:datasets’:

sleep

**********   GAMLSS Version 5.2-0  **********
For more on GAMLSS look at https://www.gamlss.com/
Type gamlssNews() to see new features/changes/bug fixes.

Warning message:
In regularize.values(x, y, ties, missing(ties)) :
collapsing to unique 'x' values
Warning message:
In regularize.values(x, y, ties, missing(ties)) :
collapsing to unique 'x' values
[1] 0.0125035
[1] 0.9699186
```

