semimetric.NPFDA: Proximities between functional data (semi-metrics)
In fda.usc: Functional Data Analysis and Utilities for Statistical Computing

semimetric.NPFDA

R Documentation

Proximities between functional data (semi-metrics)

Description

Computes semi-metric distances between functional data, following Ferraty, F. and Vieu, P. (2006).

Usage

semimetric.deriv(
  fdata1,
  fdata2 = fdata1,
  nderiv = 1,
  nknot = ifelse(floor(ncol(DATA1)/3) > floor((ncol(DATA1) - nderiv - 4)/2),
    floor((ncol(DATA1) - nderiv - 4)/2), floor(ncol(DATA1)/3)),
  ...
)

semimetric.fourier(
  fdata1,
  fdata2 = fdata1,
  nderiv = 0,
  nbasis = ifelse(floor(ncol(DATA1)/3) > floor((ncol(DATA1) - nderiv - 4)/2),
    floor((ncol(DATA1) - nderiv - 4)/2), floor(ncol(DATA1)/3)),
  period = NULL,
  ...
)

semimetric.hshift(fdata1, fdata2 = fdata1, t = 1:ncol(DATA1), ...)

semimetric.mplsr(fdata1, fdata2 = fdata1, q = 2, class1, ...)

semimetric.pca(fdata1, fdata2 = fdata1, q = 1, ...)

Arguments

`fdata1`	Functional data 1 or curve 1. `DATA1` with dimension (`n1` x `m`), where `n1` is the number of curves and `m` is the number of points observed in each curve.
`fdata2`	Functional data 2 or curve 2. `DATA2` with dimension (`n2` x `m`), where `n2` is the number of curves and `m` is the number of points observed in each curve.
`nderiv`	Order of derivation, used in `semimetric.deriv` and `semimetric.fourier`
`nknot`	`semimetric.deriv`: number of interior knots (needed for defining the B-spline basis).
`...`	Further arguments passed to or from other methods.
`nbasis`	`semimetric.fourier`: size of the basis.
`period`	`semimetric.fourier`: period used for the Fourier expansion.
`t`	`semimetric.hshift`: vector which defines `t` (one can choose `1,2,...,nbt` where `nbt` is the number of points of the discretization)
`q`	If `semimetric.pca`: the retained number of principal components. If `semimetric.mplsr`: the retained number of factors.
`class1`	`semimetric.mplsr`: vector containing a categorical response which corresponds to class number for units stored in `DATA1`.
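
The default value of `nknot` and `nbasis` shown in Usage is an `ifelse()` expression that returns the smaller of two floors. The following base-R lines are a minimal sketch of that arithmetic; the helper name `default_size` is hypothetical (it is not part of fda.usc), and `m` stands for `ncol(DATA1)`.

```r
# Hypothetical helper (not part of fda.usc): reproduces the default
# expression used for `nknot` and `nbasis`, with m = ncol(DATA1).
default_size <- function(m, nderiv) {
  a <- floor(m / 3)
  b <- floor((m - nderiv - 4) / 2)
  ifelse(a > b, b, a)  # same value as min(a, b)
}

default_size(150, nderiv = 1)  # floor(150/3) = 50, floor(145/2) = 72 -> 50
default_size(10, nderiv = 2)   # floor(10/3) = 3,  floor(4/2) = 2    -> 2
```

For curves observed at few points, the second term caps the basis size so that the requested derivative order can still be fitted.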

Details

semimetric.deriv: approximates the L_2 metric between derivatives of the curves based on their B-spline representation. The derivative order is set with the argument nderiv.
semimetric.fourier: approximates the L_2 metric between derivatives of the curves based on their Fourier representation. The derivative order is set with the argument nderiv.
semimetric.hshift: computes the distance between curves taking into account a horizontal shift effect.
semimetric.mplsr: computes distance between curves based on the partial least squares method.
semimetric.pca: computes distance between curves based on the functional principal components analysis method.

In the semi-metric functions below, the functional data X is approximated by k_n elements of the Fourier, B-spline, PC or PLS basis: \hat{X_i} =\sum_{k=1}^{k_n}\nu_{k,i}\xi_k, where \nu_{k,i} are the coefficients of the expansion on the basis functions \left\{\xi_k\right\}_{k=1}^{\infty}.
The distance between the q-order derivatives of two curves X_{1} and X_2 is

d_{2}^{(q)}\left(X_1,X_2\right)_{k_n}=\sqrt{\frac{1}{T}\int_{T}\left(X_{1}^{(q)}(t)-X_{2}^{(q)}(t)\right)^2 dt}

where X_{i}^{(q)}\left(t\right) denotes the q-th derivative of X_i.
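
The distance between q-th derivatives can be illustrated numerically on a grid. The sketch below is an assumption-laden stand-in, not the package's method: it uses finite differences on an equally spaced grid and the trapezoidal rule, whereas `semimetric.deriv` differentiates a B-spline fit. The function name `l2_deriv_dist` is hypothetical.

```r
# Sketch: grid approximation of d_2^(q) using finite differences.
# Assumption: `tt` is an equally spaced grid shared by both curves.
l2_deriv_dist <- function(x1, x2, tt, q = 0) {
  d <- x1 - x2            # differentiation is linear, so work on X1 - X2
  h <- tt[2] - tt[1]
  for (i in seq_len(q)) {
    d <- diff(d) / h                           # first-order finite difference
    tt <- (tt[-length(tt)] + tt[-1]) / 2       # values now sit at midpoints
  }
  sq <- d^2
  # trapezoidal rule for the integral of the squared difference
  integral <- sum((sq[-length(sq)] + sq[-1]) / 2 * diff(tt))
  sqrt(integral / (max(tt) - min(tt)))         # 1/T normalisation
}

tt <- seq(0, 1, length.out = 201)
x1 <- sin(2 * pi * tt)
x2 <- sin(2 * pi * tt) + 0.5       # same shape, vertical offset of 0.5
l2_deriv_dist(x1, x2, tt, q = 0)   # about 0.5: the offset is visible
l2_deriv_dist(x1, x2, tt, q = 1)   # about 0: the derivatives coincide
```

The toy example shows why a derivative-based semi-metric is useful: a vertical offset between two curves disappears once q is at least 1.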

semimetric.deriv and semimetric.fourier use a B-spline and a Fourier approximation, respectively, for each curve; the derivatives are computed directly by differentiating the analytic form of the expansion (by default q=1 and q=0, respectively). semimetric.pca and semimetric.mplsr compute proximities between curves based on functional principal components analysis (FPCA) and functional partial least squares analysis (FPLS), respectively. FPCA and FPLS project the functional data onto a reduced-dimension space (q components). semimetric.mplsr additionally requires a response variable (the class1 argument).

d_{2}^{(q)}\left(X_1,X_2\right)_{k_n}\approx\sqrt{\sum_{k=1}^{k_n}\left(\nu_{k,1}-\nu_{k,2}\right)^2\left\|\xi_k^{(q)}\right\|^2}
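
For a principal component basis the approximation above reduces to a Euclidean distance between score vectors. The following base-R sketch illustrates the idea under stated assumptions: the components are estimated from the first sample only, both samples are centred with its column means, and no quadrature weights are applied. The function name `pca_semimetric` is hypothetical, its `q` is the number of retained components (as in the `q` argument), and `semimetric.pca` may differ in these details.

```r
# Sketch (base R): semi-metric from the first q principal component scores.
# Assumption: components come from X1 only; X1 and X2 are matrices with one
# curve per row, observed on the same grid.
pca_semimetric <- function(X1, X2 = X1, q = 1) {
  mu <- colMeans(X1)
  C1 <- sweep(X1, 2, mu)
  C2 <- sweep(X2, 2, mu)
  V <- svd(C1)$v[, seq_len(q), drop = FALSE]  # basis xi_1, ..., xi_q (m x q)
  S1 <- C1 %*% V                              # coefficients for X1 (n1 x q)
  S2 <- C2 %*% V                              # coefficients for X2 (n2 x q)
  # squared pairwise Euclidean distances between score vectors
  D2 <- outer(rowSums(S1^2), rowSums(S2^2), "+") - 2 * S1 %*% t(S2)
  sqrt(pmax(D2, 0))                           # guard against tiny negatives
}

set.seed(1)
tt <- seq(0, 1, length.out = 50)
sim_curve <- function() rnorm(1) * sin(2 * pi * tt) + rnorm(1) * cos(2 * pi * tt)
X1 <- t(replicate(6, sim_curve()))  # 6 curves x 50 points
X2 <- t(replicate(4, sim_curve()))  # 4 curves x 50 points
D <- pca_semimetric(X1, X2, q = 2)
dim(D)  # 6 4: one row per curve in X1, one column per curve in X2
```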

semimetric.hshift computes proximities between curves taking into account a horizontal shift effect.

d_{hshift}\left(X_1,X_2\right)=\min_{h\in\left[-mh,mh\right]}d_2(X_1(t),X_2(t+h))

where mh is the maximum horizontal shift allowed.
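
The minimisation over h can be sketched on a discrete grid. The code below is an illustration only, under these assumptions: both curves share an equally spaced grid, shifts are whole grid steps between -mh and mh, and the distance is the root mean square difference over the overlapping points. The function name `hshift_dist` is hypothetical and the implementation in `semimetric.hshift` may differ.

```r
# Sketch: horizontal-shift distance over integer grid shifts h in -mh..mh.
hshift_dist <- function(x1, x2, mh) {
  m <- length(x1)
  idx <- seq_len(m)
  best <- Inf
  for (h in -mh:mh) {
    keep <- idx + h >= 1 & idx + h <= m       # points where x2(t + h) exists
    d <- sqrt(mean((x1[keep] - x2[idx[keep] + h])^2))
    best <- min(best, d)
  }
  best
}

tt <- seq(0, 1, length.out = 101)
x1 <- exp(-(tt - 0.40)^2 / 0.005)  # bump centred at 0.40
x2 <- exp(-(tt - 0.45)^2 / 0.005)  # same bump, 5 grid steps to the right
hshift_dist(x1, x2, mh = 0)    # plain distance, no shift allowed
hshift_dist(x1, x2, mh = 10)   # close to 0: the shift h = 5 aligns the bumps
```

Allowing a larger mh can only decrease the distance, since the minimum is taken over a larger set of shifts.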

Value

Returns a matrix of proximities between the curves of two functional datasets.

Source

https://www.math.univ-toulouse.fr/~ferraty/SOFTWARES/NPFDA/

References

Ferraty, F. and Vieu, P. (2006). Nonparametric functional data analysis. Springer Series in Statistics, New York.

Ferraty, F. and Vieu, P. (2006). NPFDA in practice. Freely available online at https://www.math.univ-toulouse.fr/~ferraty/SOFTWARES/NPFDA/

Examples

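## Not run:  
# Phoneme data: semi-metrics between learning and test curves
library(fda.usc)
data(phoneme)
ind <- 1:100 # first 100 curves (2 groups)
mlearn <- phoneme$learn[ind, ]
mtest <- phoneme$test[ind, ]
n <- nrow(mlearn[["data"]])   # number of curves
np <- ncol(mlearn[["data"]])  # number of discretization points
# FPCA-based semi-metric with q = 1 (default) and q = 2 components
mdist1 <- semimetric.pca(mlearn, mtest)
mdist2 <- semimetric.pca(mlearn, mtest, q = 2)
# B-spline based semi-metric on the curves themselves (nderiv = 0)
mdist3 <- semimetric.deriv(mlearn, mtest, nderiv = 0)
# Fourier-based semi-metric on second derivatives with 21 basis functions
mdist4 <- semimetric.fourier(mlearn, mtest, nderiv = 2, nbasis = 21)
# Horizontal-shift semi-metric (commented out: slow)
# mdist5 <- semimetric.hshift(mlearn, mtest)
# PLS-based semi-metric with 5 factors; needs the class labels of mlearn
glearn <- phoneme$classlearn[ind]
mdist6 <- semimetric.mplsr(mlearn, mtest, 5, glearn)
# L_p metric for comparison
mdist0 <- metric.lp(mlearn, mtest)
# Hierarchical clustering on the PLS-based proximities
b <- as.dist(mdist6)
c2 <- hclust(b)
plot(c2)
memb <- cutree(c2, k = 2)
table(memb, phoneme$classlearn[ind])
 
## End(Not run)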

fda.usc documentation built on April 4, 2025, 4:35 a.m.