# smoothSplines: Estimate density from histogram In robCompositions: Compositional Data Analysis

 smoothSplines R Documentation

## Estimate density from histogram

### Description

Given raw (discretized) distributional observations, `smoothSplines` computes the density function that 'best' fits the data, as a trade-off between smoothness and least-squares fidelity, using B-spline basis functions.

### Usage

```r
smoothSplines(
  k,
  l,
  alpha,
  data,
  xcp,
  knots,
  weights = matrix(1, dim(data)[1], dim(data)[2]),
  num_points = 100,
  prior = "default",
  cores = 1,
  fast = 0
)
```

### Arguments

| Argument | Description |
|---|---|
| `k` | degree of the smoothing splines |
| `l` | order of the derivative in the penalization term |
| `alpha` | weight for penalization |
| `data` | an object of class "matrix" containing the data to be smoothed, row by row |
| `xcp` | vector of control points |
| `knots` | either a vector of knots for the splines or an integer giving the number of equispaced knots |
| `weights` | matrix of weights. If not given, all data points are weighted equally. |
| `num_points` | number of grid points at which the estimated density is evaluated |
| `prior` | prior used for zero replacement. This must be one of "perks", "jeffreys", "bayes_laplace", "sq" or "default" |
| `cores` | number of cores for parallel execution, if that option was enabled before installing the package |
| `fast` | 1 if maximal performance is required (print statements suppressed), 0 otherwise |

### Details

The original discretized densities are not smoothed directly; instead, the centred logratio (clr) transformation is applied first, to deal with the unit-integral constraint of density functions.
A constrained variational problem is then posed. This minimization problem for the optimal density is a compromise between staying close to the given data, at the corresponding `xcp`, and obtaining a smooth function. The non-smoothness measure takes into account the `l`th derivative, while the fidelity term is weighted by `alpha`.
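As an illustration only, a functional of this kind can be written as below, following the general form used in the cited reference. The notation is assumed here and does not come from this page: $s_k$ is a spline of degree `k` on an interval $[a, b]$, $y_i$ are the clr-transformed observations at the control points $x_i$ (`xcp`), and $w_i$ are the entries of `weights`.

$$
J_l(s_k) = \int_a^b \left[ s_k^{(l)}(x) \right]^2 \, dx + \alpha \sum_{i=1}^{n} w_i \left[ y_i - s_k(x_i) \right]^2,
\qquad \text{subject to} \quad \int_a^b s_k(x) \, dx = 0 .
$$

The first term measures non-smoothness through the `l`th derivative, the second is the fidelity term weighted by `alpha`, and the zero-integral constraint is the clr counterpart of the unit-integral constraint on densities.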
The solution is a natural spline. The vector of its coefficients is obtained as the minimum-norm solution of a linear system. The resulting splines can either be back-transformed to the original Bayes space of density functions (to provide smoothed counterparts for visualization and interpretation), or retained in the clr space for further statistical analysis.
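The clr step and its back-transformation can be sketched in base R for a single discretized density. This is a minimal illustration, not the package's internal code: the helper names `clr` and `clr_inv`, the toy data, and the equal bin width are all assumptions, and zero bins (which `smoothSplines` handles through `prior`) are not treated.

```r
# Hypothetical discretized density: 5 bins of width 1 (toy data, strictly positive)
width <- 1
dens  <- c(0.05, 0.20, 0.40, 0.25, 0.10)   # heights; sum(dens * width) == 1

# Discrete clr (assumed helper): log-density minus the mean log-density
clr <- function(d) log(d) - mean(log(d))

# Inverse (assumed helper): exponentiate, then rescale to unit integral
clr_inv <- function(z, width) {
  d <- exp(z)
  d / sum(d * width)
}

z    <- clr(dens)            # clr values sum to zero (up to rounding)
back <- clr_inv(z, width)    # recovers the original density
```

Smoothing is carried out on values such as `z`, where the constraint is a zero sum (zero integral) rather than a unit integral; `clr_inv` then corresponds to returning to the space of densities.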

### Value

An object of class `smoothSpl` containing, among others, the following components:

| Component | Description |
|---|---|
| `bspline` | each row is the vector of B-spline coefficients |
| `Y` | the values of the smoothed curve on the given grid |
| `Y_clr` | the values of the smoothed curve, in the clr setting, on the given grid |

### Author(s)

Alessia Di Blasi, Federico Pavone, Gianluca Zeni, Matthias Templ

### References

J. Machalova, K. Hron & G.S. Monti (2016): Preprocessing of centred logratio transformed density functions using smoothing splines. Journal of Applied Statistics, 43:8, 1419-1435.

### Examples

```r
# Sepal length of the first iris species (setosa)
SepalLengthCm <- iris$Sepal.Length
Species <- iris$Species

iris1 <- SepalLengthCm[iris$Species == levels(iris$Species)[1]]
h1 <- hist(iris1, nclass = 12, plot = FALSE)

# Bin midpoints are the control points; bin heights form one row of data
midx1 <- h1$mids
midy1 <- matrix(h1$density, nrow = 1, ncol = length(h1$density), byrow = TRUE)
knots <- 7
## Not run:
sol1 <- smoothSplines(k = 3, l = 2, alpha = 1000, data = midy1, xcp = midx1, knots = knots)
plot(sol1)

h1 <- hist(iris1, freq = FALSE, nclass = 12, xlab = "Sepal Length [cm]", main = "Iris setosa")
# black line: kernel method; red line: smoothSplines result
lines(density(iris1), col = "black", lwd = 1.5)
xx1 <- seq(sol1$Xcp[1], tail(sol1$Xcp, n = 1), length.out = sol1$NumPoints)
lines(xx1, sol1$Y[1, ], col = "red", lwd = 2)

## End(Not run)
```

robCompositions documentation built on Aug. 25, 2023, 5:13 p.m.