# fpcac: Functional Principal Components Analysis Clustering

In clustEff: Clusters of Effects Curves in Quantile Regression Models

 fpcac R Documentation

## Functional Principal Components Analysis Clustering

### Description

This function implements the FPCAC algorithm for curve clustering, a variant of the k-means algorithm based on the principal component rotation of the data.

### Usage

fpcac(X, K = 2, fd = NULL, nbasis = 5, norder = 3, nharmonics = 3,
alpha = 0, niter = 30, Ksteps = 25, conf.level = 0.9, seed, disp = FALSE)


### Arguments

| Argument | Description |
|---|---|
| X | Matrix of ‘curves’ of dimension n x q. |
| K | the number of clusters. |
| fd | If not NULL it overrides X and must be an object of class fd. |
| nbasis | an integer specifying the number of basis functions. The default value is 5. |
| norder | an integer specifying the order of the B-splines, which is one higher than their degree. The default value is 3. |
| nharmonics | the number of harmonics or principal components to use. The default value is 3. |
| alpha | trimming size, that is, the proportion of observations to be discarded. |
| niter | the number of random restarts (larger values provide more accurate solutions). |
| Ksteps | the number of k-means steps (not many steps are needed). |
| conf.level | the confidence level required. |
| seed | the seed used for reproducibility. |
| disp | if TRUE, prints some information while the algorithm runs. |

### Details

FPCAC is a functional PCA-based clustering approach. It is a variation of the curve clustering algorithm proposed by Garcia-Escudero and Gordaliza (2005).

The starting point of FPCAC is a linear approximation of each curve by a finite $p$-dimensional vector of coefficients, defined by the FPCA scores.
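To make the idea of a score vector concrete, the following is a minimal sketch in base R. It is not the package's implementation: it assumes the curves are sampled on a common grid and uses ordinary PCA (`prcomp`) on the discretized curves as a stand-in for FPCA on B-spline `fd` objects. The names `grid`, `curves`, `p` and `scores` are illustrative only.

```r
set.seed(1)
n_points <- 100
grid <- seq(0, 1, length.out = n_points)

# Six noisy curves stored as columns: three sine-like, three cosine-like.
curves <- cbind(
  sapply(1:3, function(i) sin(2 * pi * grid) + rnorm(n_points, sd = 0.1)),
  sapply(1:3, function(i) cos(2 * pi * grid) + rnorm(n_points, sd = 0.1))
)

# Assumption: PCA on the discretized curves approximates FPCA.
# prcomp() expects observations in rows, so the matrix is transposed.
p <- 2
pca <- prcomp(t(curves), center = TRUE, scale. = FALSE)

# Each curve is now summarized by a p-dimensional vector of scores.
scores <- pca$x[, seq_len(p), drop = FALSE]
dim(scores)
```

Each row of `scores` is the low-dimensional representation of one curve, and it is on rows like these that a k-means type algorithm would operate.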

The number of starting clusters k is obtained from the volume of the scores: events are assigned to the clusters defined by events whose distance is less than a fixed threshold (e.g., the 90th percentile) in the space of PCA scores. Once k is obtained, a modified version of the trimmed k-means algorithm is applied, which uses the matrix of FPCA scores instead of the coefficients of a linear fit to B-spline bases.

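The trimmed k-means clustering algorithm looks for the k centers $C_1, \ldots, C_k$ that solve the minimization problem:

$$O_k(\alpha) = \min_{Y} \min_{C_1, \ldots, C_k} \frac{1}{[n(1-\alpha)]} \sum_{X_i \in Y} \inf_{1 \le j \le k} \| X_i - C_j \|^2$$

where $Y$ ranges over the subsets of $\{X_1, \ldots, X_n\}$ containing $[n(1-\alpha)]$ observations and $[\cdot]$ denotes the integer part.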
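For fixed centers, the minimization over the retained subset is solved by keeping the $[n(1-\alpha)]$ observations that lie closest to their nearest center. The sketch below evaluates the objective in that way using base R only. The function `trimmed_objective` and its arguments are hypothetical names introduced for illustration and are not part of clustEff.

```r
# Hypothetical helper: trimmed k-means objective for fixed centers.
# X: n x p matrix of observations (e.g. FPCA scores), one per row.
# centers: k x p matrix of cluster centers.
# alpha: proportion of observations to trim.
trimmed_objective <- function(X, centers, alpha = 0) {
  n <- nrow(X)
  # Integer part of n(1 - alpha); the tiny offset guards against
  # floating-point results such as 3.9999999.
  keep <- floor(n * (1 - alpha) + 1e-8)

  # Squared Euclidean distance from every observation to every center.
  d2 <- sapply(seq_len(nrow(centers)), function(j) {
    rowSums(sweep(X, 2, centers[j, ])^2)
  })
  d2 <- matrix(d2, nrow = n)

  # Distance to the nearest center, then retain the `keep` smallest.
  nearest <- apply(d2, 1, min)
  retained <- order(nearest)[seq_len(keep)]

  list(value = mean(nearest[retained]),
       retained = retained,
       cluster = apply(d2, 1, which.min))
}

# Toy data: two tight groups plus one far outlier (row 5).
X <- rbind(c(0, 0), c(0, 1), c(10, 10), c(10, 11), c(100, 100))
centers <- rbind(c(0, 0.5), c(10, 10.5))
res <- trimmed_objective(X, centers, alpha = 0.2)
res$value
res$retained
```

With `alpha = 0.2`, one of the five observations is discarded, so the outlier does not contribute to the objective. A full algorithm would alternate this trimming step with an update of the centers, which is not shown here.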

The proposed approach has the advantage of applying PCA for functional data directly, avoiding some of the choices related to spline fitting that are required by RCC, the robust curve clustering procedure of Garcia-Escudero and Gordaliza (2005). Simulations and applications also suggest that the FPCAC algorithm behaves well, giving stable and easily interpretable results.

### Value

An object of class “fpcac”, a list containing the following items:

| Item | Description |
|---|---|
| call | the matched call. |
| obj.function | The percentiles used in the quantile regression coefficient modeling or objective function $O_k(\alpha)$. |
| centers | The curves matrix. |
| radius | The vector of clusters. |
| clusters | The mean curves matrix of dimension n x k. |
| Xorig | The matrix of ‘curves’ of dimension n x q. |
| fd | The object of class ‘fd’ obtained by the call of FPCA. |
| X | The matrix of ‘curves’ transformed through FPCA of dimension p x nharmonics. |
| X.mean | The mean curves matrix of dimension n x k. |
| diss.matrix | The Euclidean distance matrix of the transformed curves. |
| oggSilhouette | An object of class ‘silhouette’. |

### Author(s)

Gianluca Sottile gianluca.sottile@unipa.it

### References

Adelfio, G., Chiodi, M., D'Alessandro, A. and Luzio, D. (2011) FPCA algorithm for waveform clustering. Journal of Communication and Computer, 8(6), 494-502.

Adelfio, G., Chiodi, M., D'Alessandro, A., Luzio, D., D'Anna, G. and Mangano, G. (2012) Simultaneous seismic wave clustering and registration. Computers & Geosciences, 44, 60-69.

Garcia-Escudero, L. A. and Gordaliza, A. (2005) A proposal for robust curve clustering. Journal of Classification, 22, 185-201.

### See Also

opt.fpcac

### Examples

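library(clustEff)

set.seed(1234)
n <- 300  # number of evaluation points per curve
x <- 1:n/n

Y <- matrix(0, n, 30)  # 30 simulated curves, one per column

sigma2 <- 4*pmax(x-.2, 0) - 8*pmax(x-.5, 0) + 4*pmax(x-.8, 0)  # noise scale: zero outside (0.2, 0.8), peak at x = 0.5

mu <- sin(3*pi*x)  # group 1: curves 1 to 10
for(i in 1:10) Y[, i] <- mu + rnorm(length(x), 0, pmax(sigma2, 0))

mu <- cos(3*pi*x)  # group 2: curves 11 to 23
for(i in 11:23) Y[, i] <- mu + rnorm(length(x), 0, pmax(sigma2, 0))

mu <- sin(3*pi*x)*cos(pi*x)  # group 3: curves 24 to 28
for(i in 24:28) Y[, i] <- mu + rnorm(length(x), 0, pmax(sigma2, 0))

mu <- 0  # group 4: curves 29 and 30; alternative mean: sin(1/3*pi*x)*cos(2*pi*x)
for(i in 29:30) Y[, i] <- mu + rnorm(length(x), 0, pmax(sigma2, 0))

obj <- fpcac(Y, K = 4, disp = FALSE)  # cluster the 30 curves into 4 groups
obj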


clustEff documentation built on June 28, 2022, 5:06 p.m.