# semimetric.NPFDA: Proximities between functional data (semi-metrics) In fda.usc: Functional Data Analysis and Utilities for Statistical Computing


## Proximities between functional data (semi-metrics)

### Description

Computes semi-metric distances between functional data, following Ferraty, F. and Vieu, P. (2006).

### Usage

semimetric.deriv(
fdata1,
fdata2 = fdata1,
nderiv = 1,
nknot = ifelse(floor(ncol(DATA1)/3) > floor((ncol(DATA1) - nderiv - 4)/2),
floor((ncol(DATA1) - nderiv - 4)/2), floor(ncol(DATA1)/3)),
...
)

semimetric.fourier(
fdata1,
fdata2 = fdata1,
nderiv = 0,
nbasis = ifelse(floor(ncol(DATA1)/3) > floor((ncol(DATA1) - nderiv - 4)/2),
floor((ncol(DATA1) - nderiv - 4)/2), floor(ncol(DATA1)/3)),
period = NULL,
...
)

semimetric.hshift(fdata1, fdata2 = fdata1, t = 1:ncol(DATA1), ...)

semimetric.mplsr(fdata1, fdata2 = fdata1, q = 2, class1, ...)

semimetric.pca(fdata1, fdata2 = fdata1, q = 1, ...)


### Arguments

| Argument | Description |
|---|---|
| fdata1 | Functional data 1 or curve 1. DATA1 with dimension (n1 x m), where n1 is the number of curves and m is the number of points observed in each curve. |
| fdata2 | Functional data 2 or curve 2. DATA2 with dimension (n2 x m), where n2 is the number of curves and m is the number of points observed in each curve. |
| nderiv | Order of derivation, used in semimetric.deriv and semimetric.fourier. |
| nknot | semimetric.deriv: number of interior knots (needed for defining the B-spline basis). |
| ... | Further arguments passed to or from other methods. |
| nbasis | semimetric.fourier: size of the basis. |
| period | semimetric.fourier: period used for the Fourier expansion. |
| t | semimetric.hshift: vector which defines t (one can choose 1,2,...,nbt, where nbt is the number of points of the discretization). |
| q | semimetric.pca: the retained number of principal components. semimetric.mplsr: the retained number of factors. |
| class1 | semimetric.mplsr: vector containing a categorical response which corresponds to the class number for the units stored in DATA1. |

### Details

semimetric.deriv: approximates the L_2 metric between derivatives of the curves based on their B-spline representation. The order of the derivative is set with the argument nderiv.
semimetric.fourier: approximates the L_2 metric between derivatives of the curves based on their Fourier representation. The order of the derivative is set with the argument nderiv.
semimetric.hshift: computes the distance between curves taking into account a horizontal shift effect.
semimetric.mplsr: computes the distance between curves based on the partial least squares method.
semimetric.pca: computes the distance between curves based on the functional principal components analysis method.

In the following semi-metric functions, the functional data $X$ is approximated by $k_n$ elements of the Fourier, B-spline, PC or PLS basis, $\hat{X}_i = \sum_{k=1}^{k_n} \nu_{k,i} \xi_k$, where $\nu_{k,i}$ are the coefficients of the expansion on the basis functions $\left\{\xi_k\right\}_{k=1}^{\infty}$.
The distance between the q-th order derivatives of two curves $X_1$ and $X_2$ is

$$d_{2}^{(q)}\left(X_1,X_2\right)_{k_n}=\sqrt{\frac{1}{T}\int_{T}\left(X_{1}^{(q)}(t)-X_{2}^{(q)}(t)\right)^2 dt}$$

where $X_{i}^{(q)}\left(t\right)$ denotes the q-th derivative of $X_i$.
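The definition above can be checked numerically on discretized curves. The following base R sketch is only an illustration, not the fda.usc implementation: it replaces the analytic basis derivatives with finite differences and the integral with the trapezoid rule. The helper names fd_deriv and l2_deriv_dist are hypothetical.

```r
# Hypothetical helpers (not part of fda.usc).
# One finite-difference derivative of the curves stored in the rows of X,
# evaluated at the midpoints of the grid tt.
fd_deriv <- function(X, tt) {
  m <- ncol(X)
  dX <- sweep(X[, -1, drop = FALSE] - X[, -m, drop = FALSE], 2, diff(tt), "/")
  list(X = dX, tt = (tt[-1] + tt[-m]) / 2)
}

# Discretized d_2^(q): X1 is n1 x m, X2 is n2 x m, both observed on grid tt.
l2_deriv_dist <- function(X1, X2, tt, q = 0) {
  for (k in seq_len(q)) {                 # differentiate q times
    d1 <- fd_deriv(X1, tt)
    d2 <- fd_deriv(X2, tt)
    X1 <- d1$X; X2 <- d2$X; tt <- d1$tt
  }
  w <- diff(tt)                           # trapezoid weights
  len <- sum(w)                           # length of the interval T
  D <- matrix(0, nrow(X1), nrow(X2))
  for (i in seq_len(nrow(X1))) {
    for (j in seq_len(nrow(X2))) {
      f <- (X1[i, ] - X2[j, ])^2
      D[i, j] <- sqrt(sum(w * (f[-1] + f[-length(f)]) / 2) / len)
    }
  }
  D
}

# Toy curves: two curves in the first set, one in the second.
tt <- seq(0, 1, length.out = 101)
X1 <- rbind(sin(2 * pi * tt), cos(2 * pi * tt))
X2 <- rbind(sin(2 * pi * tt))
D0 <- l2_deriv_dist(X1, X2, tt, q = 0)    # distance between the curves
D1 <- l2_deriv_dist(X1, X2, tt, q = 1)    # distance between first derivatives
```

The result has one row per curve in the first set and one column per curve in the second set. Finite differences amplify noise, which is one reason a smooth basis representation is preferable for real data.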

The semimetric.deriv and semimetric.fourier functions use a B-spline and a Fourier approximation, respectively, of each curve; the derivatives are computed directly by differentiating the analytic form of the approximation, with defaults q=1 and q=0, respectively. The semimetric.pca and semimetric.mplsr functions compute proximities between curves based on functional principal components analysis (FPCA) and functional partial least squares analysis (FPLS), respectively. FPCA and FPLS project the functional data onto a reduced-dimensional space (q components). The semimetric.mplsr function requires a scalar response.

$$d_{2}^{(q)}\left(X_1,X_2\right)_{k_n}\approx\sqrt{\sum_{k=1}^{k_n}\left(\nu_{k,1}-\nu_{k,2}\right)^2\left\|\xi_k^{(q)}\right\|}$$
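For an orthonormal basis and q = 0, the coefficient-based approximation reduces to the Euclidean distance between the retained coefficient vectors. The base R sketch below illustrates this for a principal component basis. It is an assumption-laden illustration rather than the code of semimetric.pca, which may center or scale differently; the helper name pca_semimetric and its argument ncomp (the number of retained components, called q in semimetric.pca) are hypothetical.

```r
# Hypothetical helper (not part of fda.usc): PCA-type semi-metric.
# X1 (n1 x m) defines the principal component basis; X2 (n2 x m) is projected
# onto the same basis. ncomp is the number of retained components.
pca_semimetric <- function(X1, X2, ncomp = 1) {
  ctr <- colMeans(X1)
  # The leading right singular vectors of the centered X1 form the PC basis
  V <- svd(sweep(X1, 2, ctr), nu = 0, nv = ncomp)$v
  S1 <- sweep(X1, 2, ctr) %*% V           # scores (coefficients) of X1
  S2 <- sweep(X2, 2, ctr) %*% V           # scores (coefficients) of X2
  D <- matrix(0, nrow(X1), nrow(X2))
  for (k in seq_len(ncomp)) {
    # accumulate squared coefficient differences, component by component
    D <- D + outer(S1[, k], S2[, k], "-")^2
  }
  sqrt(D)
}

# Toy data: 5 random curves observed at 20 points.
set.seed(1)
X <- matrix(rnorm(5 * 20), 5, 20)
D_pca2 <- pca_semimetric(X, X, ncomp = 2) # 2 retained components
D_pca4 <- pca_semimetric(X, X, ncomp = 4) # all non-trivial components
```

With few components, two distinct curves can be at distance zero, which is what makes this a semi-metric rather than a metric. With all non-trivial components retained, the sketch recovers the ordinary Euclidean distance between the discretized curves.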

semimetric.hshift computes proximities between curves taking into account a horizontal shift effect.

$$d_{hshift}\left(X_1,X_2\right)=\min_{h\in\left[-mh,mh\right]}d_2(X_1(t),X_2(t+h))$$

where mh is the maximum horizontal shift allowed.
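The minimization over shifts can be sketched in base R as follows. This is a simplified illustration, not the semimetric.hshift implementation: it assumes an equally spaced grid, restricts h to whole grid steps, and compares the curves only where they overlap after shifting. The helper name hshift_dist and its argument max_shift (the bound mh, in grid steps) are hypothetical.

```r
# Hypothetical helper (not part of fda.usc): shift-tolerant distance between
# two curves x1 and x2 observed on the same equally spaced grid.
hshift_dist <- function(x1, x2, max_shift) {
  m <- length(x1)
  stopifnot(length(x2) == m, max_shift >= 0, max_shift < m)
  best <- Inf
  for (s in -max_shift:max_shift) {
    idx1 <- max(1, 1 - s):min(m, m - s)   # grid positions t used for x1
    idx2 <- idx1 + s                      # grid positions t + h used for x2
    d <- sqrt(mean((x1[idx1] - x2[idx2])^2))
    best <- min(best, d)                  # keep the smallest distance so far
  }
  best
}

# Toy curves: x2 is x1 delayed by 5 grid steps.
tt <- seq(0, 1, length.out = 101)
x1 <- sin(2 * pi * tt)
x2 <- sin(2 * pi * (tt - 0.05))
d_noshift <- hshift_dist(x1, x2, max_shift = 0)
d_shift   <- hshift_dist(x1, x2, max_shift = 10)
```

Allowing a shift of up to 10 grid steps lets the sketch realign the two curves, so d_shift is much smaller than d_noshift. The loop evaluates one distance per candidate shift, which hints at why the shift-based semi-metric is comparatively slow on large datasets.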

### Value

Returns a proximity matrix between the two functional datasets.

### References

Ferraty, F. and Vieu, P. (2006). Nonparametric functional data analysis. Springer Series in Statistics, New York.

Ferraty, F. and Vieu, P. (2006). NPFDA in practice. Free access on line at https://www.math.univ-toulouse.fr/~ferraty/SOFTWARES/NPFDA/

### See Also

metric.lp and semimetric.basis

### Examples

## Not run:
#	INFERENCE PHONDAT
library(fda.usc)
data(phoneme)
ind=1:100 # 2 groups
mlearn<-phoneme$learn[ind,]
mtest<-phoneme$test[ind,]
n=nrow(mlearn[["data"]])
np=ncol(mlearn[["data"]])
mdist1=semimetric.pca(mlearn,mtest)
mdist2=semimetric.pca(mlearn,mtest,q=2)
mdist3=semimetric.deriv(mlearn,mtest,nderiv=0)
mdist4=semimetric.fourier(mlearn,mtest,nderiv=2,nbasis=21)
#uses hshift function
#mdist5=semimetric.hshift(mlearn,mtest) #takes a lot
glearn<-phoneme$classlearn[ind]
#uses mplsr function
mdist6=semimetric.mplsr(mlearn,mtest,5,glearn)
mdist0=metric.lp(mlearn,mtest)
b=as.dist(mdist6)
c2=hclust(b)
plot(c2)
memb <- cutree(c2, k = 2)
table(memb,phoneme$classlearn[ind])

## End(Not run)



fda.usc documentation built on Oct. 17, 2022, 9:06 a.m.