# create.basis: Create Basis Set for Functional Data Analysis In fda: Functional Data Analysis

## Description

Functional data analysis proceeds by selecting a finite basis set and fitting data to it. The current `fda` package supports fitting via least squares, penalized by lambda times the integral, over the (finite) support of the basis set, of the squared deviations from a linear differential operator.

## Details

The most commonly used basis in `fda` is probably B-splines. For periodic phenomena, Fourier bases are quite useful. A constant basis is provided to facilitate arithmetic with functional data objects. To restrict attention to solutions of certain differential equations, it may be useful to use a corresponding basis set such as exponential, monomial or power basis sets.

Power bases support negative and fractional powers, while monomial bases are restricted to nonnegative integer exponents.

The polygonal basis is essentially a B-spline of order 2, degree 1.
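The relationship between the polygonal basis and an order-2 B-spline basis can be checked numerically. The following is a minimal sketch, assuming the `fda` package is installed; the knot locations and evaluation points are arbitrary choices made for illustration.

```r
library(fda)

# Illustrative knot locations (an arbitrary choice for this sketch)
knots <- c(0, 0.2, 0.5, 0.7, 1)

# Polygonal basis and an order-2 B-spline basis built on the same knots
polyBasis <- create.polygonal.basis(argvals = knots)
bsplBasis <- create.bspline.basis(breaks = knots, norder = 2)

# Evaluate both bases at interior points and compare the two matrices
x <- seq(0.05, 0.95, by = 0.1)
polyMat <- unname(eval.basis(x, polyBasis))
bsplMat <- unname(eval.basis(x, bsplBasis))
maxDiff <- max(abs(polyMat - bsplMat))
print(maxDiff)
```

If the two bases agree, `maxDiff` should be zero up to floating-point rounding.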

The following summarizes arguments used by some or all of the current `create.basis` functions:

• `rangeval`: a vector of length 2 giving the lower and upper limits of the range of permissible values for the function argument.

For `bspline` bases, this can be inferred from range(breaks). For `polygonal` bases, this can be inferred from range(argvals). In all other cases, this defaults to 0:1.

• `nbasis`: an integer giving the number of basis functions.

This is not used for two of the `create.basis` functions: For `constant` this is 1, so there is no need to specify it. For `polygonal` bases, it is length(argvals), and again there is no need to specify it.

For `bspline` bases, if `nbasis` is not specified, it defaults to (length(breaks) + norder - 2) if `breaks` is provided. Otherwise, `nbasis` defaults to 20 for `bspline` bases.

For `exponential` bases, if `nbasis` is not specified, it defaults to length(ratevec) if `ratevec` is provided. Otherwise, in `fda_2.0.2`, `ratevec` defaults to 1, which makes `nbasis` = 1; in `fda_2.0.4`, `ratevec` will default to 0:1, so `nbasis` will then default to 2.

For `monomial` and `power` bases, if `nbasis` is not specified, it defaults to length(exponents) if `exponents` is provided. Otherwise, `nbasis` defaults to 2 for `monomial` and `power` bases. (Temporary exception: In `fda_2.0.2`, the default `nbasis` for `power` bases is 1. This will be increased to 2 in `fda_2.0.4`.)
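The `nbasis` defaults described above can be inspected on the returned basis objects. This is a small sketch, assuming the `fda` package is installed; the break points, exponents and argument values are illustrative only.

```r
library(fda)

# B-spline: 6 break points and order 4 give 6 + 4 - 2 = 8 basis functions
breaks <- seq(0, 1, length.out = 6)
bsplBasis <- create.bspline.basis(breaks = breaks, norder = 4)
nBspl <- bsplBasis$nbasis

# Monomial: nbasis is taken from length(exponents)
monoBasis <- create.monomial.basis(rangeval = c(0, 1), exponents = 0:2)
nMono <- monoBasis$nbasis

# Polygonal: nbasis equals length(argvals)
polyBasis <- create.polygonal.basis(argvals = c(0, 0.25, 0.5, 1))
nPoly <- polyBasis$nbasis

print(c(bspline = nBspl, monomial = nMono, polygonal = nPoly))
```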

In addition to `rangeval` and `nbasis`, all but `constant` bases have one or two parameters unique to that basis type or shared with one other:

• `bspline`: Argument `norder` = the order of the spline, which is one more than the degree of the polynomials used. This defaults to 4, which gives cubic splines.

Argument `breaks` = the locations of the break or join points; also called `knots`. This defaults to seq(rangeval[1], rangeval[2], length.out = nbasis-norder+2).

• `polygonal`: Argument `argvals` = the locations of the break or join points; also called `knots`. This defaults to seq(rangeval[1], rangeval[2], length.out = nbasis).

• `fourier`: Argument `period` defaults to diff(rangeval).

• `exponential`: Argument `ratevec`. In `fda_2.0.2`, this defaulted to 1. In `fda_2.0.3`, it will default to 0:1.

• `monomial`, `power`: Argument `exponents`. Default = 0:(nbasis-1). For `monomial` bases, `exponents` must be distinct nonnegative integers. For `power` bases, they must be distinct real numbers.
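The defaults for `breaks` and `period` listed above can be illustrated as follows. This sketch assumes the `fda` package is installed and relies on my understanding that, for B-spline bases, the `params` component of the basis object stores only the interior knots, and that for Fourier bases it stores the period; treat both as assumptions to verify against your installed version.

```r
library(fda)

# With nbasis = 7 and norder = 4 there are 7 - 4 + 2 = 5 equally spaced breaks
rangeval <- c(0, 1)
bsplBasis <- create.bspline.basis(rangeval, nbasis = 7, norder = 4)
defaultBreaks <- seq(rangeval[1], rangeval[2], length.out = 7 - 4 + 2)

# Assumption: params holds the interior knots (breaks without the endpoints)
interiorKnots <- as.numeric(bsplBasis$params)
print(interiorKnots)
print(defaultBreaks)

# Fourier basis: with no period supplied, it defaults to diff(rangeval)
fourBasis <- create.fourier.basis(c(0, 365), nbasis = 5)
defaultPeriod <- as.numeric(fourBasis$params)
print(defaultPeriod)
```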

Beginning with `fda_2.1.0`, the last 6 arguments for all the `create.basis` functions will be as follows; some but not all are available in the previous versions of `fda`:

• `dropind`: a vector of integers specifying the basis functions to be dropped, if any.

• `quadvals`: a matrix with two columns and a number of rows equal to the number of quadrature points for numerical evaluation of the penalty integral. The first column of `quadvals` contains the quadrature points, and the second column the quadrature weights. A minimum of 5 values is required for each inter-knot interval, and that is often enough. For Simpson's rule, these points are equally spaced, and the weights are proportional to 1, 4, 2, 4, ..., 2, 4, 1.

• `values`: a list of matrices with one row for each row of `quadvals` and one column for each basis function. The elements of the list correspond to the basis functions and their derivatives evaluated at the quadrature points contained in the first column of `quadvals`.

• `basisvalues`: a list of lists, allocated by code such as vector("list",1). This field is designed to avoid evaluating a basis system repeatedly at the same set of argument values. Each list within the vector corresponds to a specific set of argument values, and must have at least two components, which may be tagged as you wish. The first component in an element of the list vector contains the argument values. The second component contains a matrix of values of the basis functions evaluated at the arguments in the first component. The third and subsequent components, if present, contain matrices of values of their derivatives up to a maximum derivative order.

Whenever function `getbasismatrix` is called, it checks the first list in each row to see, first, if the number of argument values corresponds to the size of the first dimension, and, if this test succeeds, checks that all of the argument values match. This takes time, of course, but is much faster than re-evaluating the basis system. Even this time can be avoided by direct retrieval of the desired array.

For example, you might set up a vector of argument values called "evalargs" along with a matrix of basis function values for these argument values called "basismat". You might want to use tags like "args" and "values", respectively, for these. You would then assign them to `basisvalues` with code such as the following:

basisobj$basisvalues <- vector("list",1)

basisobj$basisvalues[[1]] <- list(args=evalargs, values=basismat)

• `names`: either a character vector of the same length as the number of basis functions or a simple stem used to construct such a vector.

For `bspline` bases, this defaults to paste('bspl', norder, '.', 1:nbreaks, sep='').

For other bases, there are crudely similar defaults.

• `axes`: an optional list used by selected `plot` functions to create custom `axes`. If this `axes` argument is not NULL, functions `plot.basisfd`, `plot.fd`, `plot.fdSmooth`, `plotfit.fd`, `plotfit.fdSmooth`, and `plot.Lfd` will create axes via `do.call(x$axes[[1]], x$axes[-1])`. The primary example of this is to create `CanadianWeather` plots using `list("axesIntervals")`.
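The Simpson's rule layout described for `quadvals` above can be sketched in base R. The helper `simpsonQuadvals` is hypothetical (it is not part of `fda`) and covers a single interval only; for a B-spline basis one such block would be needed for each inter-knot interval.

```r
# Hypothetical helper: Simpson's rule points and weights on [a, b].
# nquad must be odd so that the number of subintervals is even.
simpsonQuadvals <- function(a, b, nquad = 5) {
  stopifnot(nquad >= 3, nquad %% 2 == 1)
  pts <- seq(a, b, length.out = nquad)
  h <- (b - a) / (nquad - 1)
  # Weight pattern 1, 4, 2, 4, ..., 2, 4, 1
  wts <- rep(c(2, 4), length.out = nquad)
  wts[c(1, nquad)] <- 1
  # First column: quadrature points; second column: quadrature weights
  cbind(points = pts, weights = wts * h / 3)
}

quadvals <- simpsonQuadvals(0, 1, nquad = 5)
print(quadvals)

# Simpson's rule is exact for cubics, so this should reproduce the
# integral of x^3 over [0, 1], which is 1/4
approxIntegral <- sum(quadvals[, "weights"] * quadvals[, "points"]^3)
print(approxIntegral)
```

A matrix of this shape is what the `quadvals` argument expects: points in the first column and weights in the second.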

## Author(s)

J. O. Ramsay and Spencer Graves

## References

Ramsay, James O., and Silverman, Bernard W. (2006), Functional Data Analysis, 2nd ed., Springer, New York.

Ramsay, James O., and Silverman, Bernard W. (2002), Applied Functional Data Analysis, Springer, New York.

## See Also

`create.bspline.basis`, `create.constant.basis`, `create.exponential.basis`, `create.fourier.basis`, `create.monomial.basis`, `create.polygonal.basis`, `create.power.basis`