# logspline: Logspline Density Estimation

## Description

Fits a logspline density estimate, in which splines approximate the log-density, using the 1997 knot addition and deletion algorithm (logspline). The 1992 algorithm is available through the oldlogspline function.

## Usage

    logspline(x, lbound, ubound, maxknots = 0, knots, nknots = 0, penalty,
              silent = TRUE, mind = -1, error.action = 2)

## Arguments

| Argument | Description |
|---|---|
| x | data vector. The data needs to be uncensored. oldlogspline can deal with right-, left-, and interval-censored data. |
| lbound, ubound | lower/upper bound for the support of the density. For example, if there is a priori knowledge that the density equals zero to the left of 0 and has a discontinuity at 0, the user could specify lbound = 0. However, if the density is essentially zero near 0, one does not need to specify lbound. |
| maxknots | the maximum number of knots. The routine stops adding knots when this number of knots is reached. The method has an automatic rule for selecting maxknots if this parameter is not specified. |
| knots | ordered vector of values (that should cover the complete range of the observations), which forces the method to start with these knots. Overrules nknots. If knots is not specified, a default knot-placement rule is employed. |
| nknots | forces the method to start with nknots knots. The method has an automatic rule for selecting nknots if this parameter is not specified. |
| penalty | the parameter to be used in the AIC criterion. The method chooses the number of knots that minimizes -2 * loglikelihood + penalty * (number of knots - 1). The default is to use a penalty parameter of penalty = log(samplesize), as in BIC. The effect of this parameter is summarized in summary.logspline. |
| silent | should diagnostic output be printed? |
| mind | minimum distance, in order statistics, between knots. |
| error.action | how should logspline deal with non-convergence problems? Very rarely, in some extreme situations, logspline has convergence problems. The only two situations that I am aware of are when there is effectively a sharp bound but this bound was not specified, or when the data is severely rounded. logspline can deal with this in three ways. If error.action is 2, the same data is rerun with the slightly more stable, but less flexible, oldlogspline. The object is translated into a logspline object using oldlogspline.to.logspline, so this is almost invisible to the user. This is particularly useful when you run simulation studies, as the code can seamlessly continue. Only the lbound and ubound options are passed on to oldlogspline; other options revert to the default. If error.action is 1, a warning is printed and logspline returns nothing (but does not crash). This is useful if you run a simulation but do not want to revert to oldlogspline. If error.action is 0, the code crashes using the stop function. |
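As a rough illustration of the penalty rule described for the penalty argument, the following base-R sketch evaluates -2 * loglikelihood + penalty * (number of knots - 1) for a handful of candidate models. The helper name knot_criterion and the log-likelihood values are hypothetical, chosen only to show how a larger penalty tends to favor fewer knots; they do not come from a real fit.

```r
# Criterion from the argument description:
#   -2 * loglikelihood + penalty * (number of knots - 1)
# `knot_criterion` is a hypothetical helper, not part of the logspline package.
knot_criterion <- function(loglik, nknots, penalty) {
  -2 * loglik + penalty * (nknots - 1)
}

# Hypothetical candidate models for a sample of size 100 (made-up values).
nknots <- c(3, 4, 5, 6, 7)
loglik <- c(-150.0, -144.0, -142.5, -142.0, -141.8)
n <- 100

bic_like <- knot_criterion(loglik, nknots, penalty = log(n))  # default penalty
aic_like <- knot_criterion(loglik, nknots, penalty = 2)       # classical AIC

nknots[which.min(bic_like)]  # knots selected under penalty = log(n)
nknots[which.min(aic_like)]  # knots selected under penalty = 2
```

With these made-up numbers, the default penalty log(n) selects the 4-knot model, while penalty = 2 selects the 5-knot model.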

## Value

Object of the class logspline, that is intended as input for plot.logspline (summary plots), summary.logspline (fitting summary), dlogspline (densities), plogspline (probabilities), qlogspline (quantiles), rlogspline (random numbers from the fitted distribution).
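A minimal sketch of how the fitted object might be passed to these functions, assuming the logspline package is installed. The simulated data and the variable names are for illustration only.

```r
library(logspline)

set.seed(1)
y <- rnorm(200)          # simulated data, for illustration only
fit <- logspline(y)

summary(fit)             # fitting summary (summary.logspline)
plot(fit)                # summary plot (plot.logspline)

dens  <- dlogspline(c(-1, 0, 1), fit)   # density at three points
prob  <- plogspline(0, fit)             # estimated P(Y <= 0)
med   <- qlogspline(0.5, fit)           # estimated median
draws <- rlogspline(10, fit)            # 10 draws from the fitted density
```

In each call the evaluation points (or the number of draws) come first and the fitted object second.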

The object has the following members:

| Member | Description |
|---|---|
| call | the command that was executed. |
| nknots | the number of knots in the model that was selected. |
| coef.pol | coefficients of the polynomial part of the spline. The first coefficient is the constant term and the second is the linear term. |
| coef.kts | coefficients of the knots part of the spline. The k-th element is the coefficient of (x-t(k))^3_+ (where x^3_+ means the positive part of the third power of x, and t(k) means knot k). |
| knots | vector of the locations of the knots in the logspline model. |
| maxknots | the largest number of knots minus one considered during fitting (i.e. with maxknots = 6 the maximum number of knots is 5). |
| penalty | the penalty that was used. |
| bound | first element: 0 - lbound was -infinity, 1 it was something else; second element: lbound, if specified; third element: 0 - ubound was infinity, 1 it was something else; fourth element: ubound, if specified. |
| samples | the sample size. |
| logl | matrix with 3 columns. Column one: number of knots; column two: model fitted during addition (1) or deletion (2); column 3: log-likelihood. |
| range | range of the input data. |
| mind | minimum distance in order statistics between knots required during fitting (the actual minimum distance may be much larger). |
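The coef.pol, coef.kts and knots members describe the spline in a truncated power basis. The base-R sketch below evaluates constant term + linear term + the sum over knots of coef.kts[k] * (x-t(k))^3_+ directly. The function name eval_spline and the numeric values are hypothetical. Whether the stored coefficients already include the normalizing constant of the density is an assumption that should be checked against dlogspline before relying on this.

```r
# Evaluate coef.pol[1] + coef.pol[2] * x + sum_k coef.kts[k] * (x - knots[k])^3_+
# `eval_spline` is a hypothetical helper, not part of the logspline package.
eval_spline <- function(x, coef.pol, coef.kts, knots) {
  # (x - t(k))^3_+ : one row per x value, one column per knot
  tp <- pmax(outer(x, knots, "-"), 0)^3
  coef.pol[1] + coef.pol[2] * x + as.vector(tp %*% coef.kts)
}

# Made-up coefficients, for illustration only
knots    <- c(-1, 0, 1)
coef.pol <- c(-1, 0.5)
coef.kts <- c(-0.2, 0.4, -0.2)

vals <- eval_spline(c(-2, 0.5, 2), coef.pol, coef.kts, knots)
vals
```

To the left of the first knot every truncated term is zero, so only the constant and linear terms contribute there.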

## Author(s)

Charles Kooperberg clk@fredhutch.org.

## References

Charles Kooperberg and Charles J. Stone. Logspline density estimation for censored data (1992). Journal of Computational and Graphical Statistics, 1, 301–328.

Charles J. Stone, Mark Hansen, Charles Kooperberg, and Young K. Truong. The use of polynomial splines and their tensor products in extended linear modeling (with discussion) (1997). Annals of Statistics, 25, 1371–1470.

## Examples

    library(logspline)
    y <- rnorm(100)
    fit <- logspline(y)
    plot(fit)
    #
    # as (4 == length(c(-2, -1, 0, 1, 2)) - 1), this forces these initial knots,
    # and does no knot selection
    fit <- logspline(y, knots = c(-2, -1, 0, 1, 2), maxknots = 4, penalty = 0)
    #
    # the following example gives one of the rare cases where logspline
    # crashes, and this shows the use of error.action = 2.
    #
    set.seed(118)
    zz <- rnorm(300)
    zz[151:300] <- zz[151:300] + 5
    zz <- round(zz)
    fit <- logspline(zz)
    #
    # you could rerun this with
    # fit <- logspline(zz, error.action = 0)
    # or
    # fit <- logspline(zz, error.action = 1)

### Example output

    running with maximum degrees of freedom
    Warning messages:
    1: In logspline(zz) : too many knots beyond data
    2: In logspline(zz) : re-ran with oldlogspline
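For simulation studies, the error.action choices described in the arguments can also be handled explicitly. The sketch below assumes the logspline package is installed and uses error.action = 0, so that a failed fit raises an error through stop that tryCatch can intercept. The wrapper name safe_fit is hypothetical.

```r
library(logspline)

# Hypothetical wrapper: returns the fit, or NULL if logspline stops with an error.
safe_fit <- function(x, ...) {
  tryCatch(
    logspline(x, error.action = 0, ...),
    error = function(e) {
      message("logspline failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Same heavily rounded data as in the example above
set.seed(118)
zz <- rnorm(300)
zz[151:300] <- zz[151:300] + 5
zz <- round(zz)

fit <- safe_fit(zz)
if (is.null(fit)) {
  # Skip this replicate, or refit with the default error.action = 2,
  # which falls back to oldlogspline
  fit <- logspline(zz)
}
```

Catching the error yourself makes each failure visible in the simulation log, whereas the default error.action = 2 continues with only a warning.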

logspline documentation built on July 2, 2020, 4:04 a.m.