FSC.fun: Functional Spectral Clustering
In MaryamGlasgowUni/FSC: Functional Spectral Clustering Approach


View source: R/FSC.R

This function clusters functional data using an approach based on spectral clustering analysis.

FSC.fun(matrix.data, timeline = NULL, basis, nclusters = NULL, d, ...)

`matrix.data`	the data to be clustered, in matrix format (one column per curve in the example below).
`timeline`	the timeline over which the data are observed, for instance `1:365` in the example below.
`basis`	smooth basis functions appropriate for the data, for instance the `fdPar` object built in the example below.
`nclusters`	the number of clusters in the data.
`d`	the derivative order of the curves used for clustering: 0 (original curves), 1 (first derivatives) or 2 (second derivatives).
`...`	extra arguments passed to the function.

The clustering procedure is straightforward once the smoothing technique has been chosen with care. The smoothing stage plays an important role and can determine the success of the clustering results; see the fda package for creating basis functions. The number of clusters can be determined in different ways: for instance, NbClust can be used to estimate it from the original data, and in some cases it is known a priori. Finally, regarding the choice of d: clustering functional data is usually based on the original trajectories (d = 0). However, it has been shown that the first derivatives (rate of change, d = 1) can sometimes hold more information about the data and so detect similarities and dissimilarities better. The same idea applies to the second derivatives (acceleration, d = 2).
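As a rough illustration of the idea described above, the following base R sketch builds a distance matrix from curves (or from their derivatives) and then applies a generic spectral clustering step. It is not the FSC.fun source: `curve_dist`, `spectral_sketch`, the Gaussian kernel, the median heuristic for `sigma` and the toy curves are all assumptions made for illustration, and finite differences stand in for derivatives of a smoothed basis expansion.

```r
# Hypothetical helper: distances between curves (columns) or their derivatives.
# Assumption: finite differences on an equally spaced grid replace the
# derivatives of a smoothed basis expansion.
curve_dist <- function(curves, tt, d = 0) {
  X <- curves
  for (k in seq_len(d)) {
    X <- apply(X, 2, diff) / diff(tt)[1]
  }
  as.matrix(dist(t(X)))  # Euclidean distance between columns
}

# Hypothetical helper: generic normalised spectral clustering on a distance matrix.
spectral_sketch <- function(D, nclusters, sigma = NULL) {
  D <- as.matrix(D)
  # Assumed bandwidth heuristic: median pairwise distance
  if (is.null(sigma)) sigma <- median(D[upper.tri(D)])
  # Gaussian similarity: close curves get weights near 1
  W <- exp(-D^2 / (2 * sigma^2))
  diag(W) <- 0
  # Symmetric normalisation D^{-1/2} W D^{-1/2}
  deg <- rowSums(W)
  S <- W / sqrt(outer(deg, deg))
  # Leading eigenvectors (eigen() returns eigenvalues in decreasing order)
  U <- eigen(S, symmetric = TRUE)$vectors[, seq_len(nclusters), drop = FALSE]
  # Row-normalise the embedding, then cluster it with k-means
  U <- U / sqrt(rowSums(U^2))
  kmeans(U, centers = nclusters, nstart = 20)$cluster
}

# Toy curves (made up for illustration): two groups that differ in level
set.seed(1)
tt <- seq(0, 1, length.out = 50)
curves <- cbind(
  sapply(1:10, function(i) sin(2 * pi * tt) + rnorm(50, sd = 0.1)),
  sapply(1:10, function(i) sin(2 * pi * tt) + 3 + rnorm(50, sd = 0.1))
)

D0 <- curve_dist(curves, tt, d = 0)      # distances between the original curves
labels <- spectral_sketch(D0, nclusters = 2)
table(labels)
```

With `d = 1` the constant level shift between the two toy groups disappears after differencing, so which value of `d` separates the groups best depends on where the curves actually differ.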

FSC.fun returns a list with the following components:

`clusters`	the cluster label assigned to each curve.
`clusters.size`	the number of curves in each cluster.
`coefs.fd`	the coefficients of the smoothed curves.
`coefs.deriv1`	the coefficients of the first derivatives.
`coefs.deriv2`	the coefficients of the second derivatives.
`dist.matrix`	the distance matrix of the curves based on the choice of d.

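# Apply the FSC technique to the Canadian weather temperature data
library(fda)  # provides CanadianWeather and the smoothing functions used below
library(FSC)  # assumed package name (from MaryamGlasgowUni/FSC); provides FSC.fun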
y <- CanadianWeather$dailyAv[,,1]
t <- 1:365
# plot the original observations over the timeline
matplot(y, type = "l", main = "temperature observations for 35 Canadian cities", xlab = "days", ylab = "temp C")

# choose a smoothing technique for the data y
bbasis.y <- create.bspline.basis(rangeval = c(0, 365), nbasis = 367, norder = 4)
penfd.y <- fdPar(bbasis.y, Lfdobj = int2Lfd(2), lambda = 1e4)
smooth.y <- smooth.basis(1:365, y, penfd.y)
# plot the smoothed curves after applying the basis functions
plot(smooth.y, main = "smoothed daily temperature curves", xlab = "over a year", ylab = "temp C")

# find the first and the second derivatives of the smoothed curves:
deriv1.y <- deriv.fd(smooth.y$fd, 1)
deriv2.y <- deriv.fd(smooth.y$fd, 2)
# plot the first and the second derivatives to examine them
par(mfrow=c(1,2))
plot(deriv1.y, main="first derivatives")
plot(deriv2.y, main="second derivatives")

# Apply the clustering technique. Following Ramsay and Silverman (2005), we assume the number of clusters is 4.

# using original curves for clustering
clust.d0 <- FSC.fun(y, t, penfd.y, 4, 0)
# using first derivatives for clustering
clust.d1 <- FSC.fun(y, t, penfd.y, 4, 1)
# using second derivatives for clustering
clust.d2 <- FSC.fun(y, t, penfd.y, 4, 2)
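
Once the three fits are available, a natural next step is to check how much the partitions agree. The sketch below uses a hypothetical helper, `compare_partitions`, which is not part of FSC, and toy label vectors made up for illustration; with the objects above one would instead pass `clust.d0$clusters` and `clust.d1$clusters`.

```r
# Hypothetical helper (not part of FSC): cross-tabulate two label vectors
compare_partitions <- function(a, b) {
  stopifnot(length(a) == length(b))
  table(first = a, second = b)
}

# Toy labels for illustration only; in practice use, for example,
# compare_partitions(clust.d0$clusters, clust.d1$clusters)
a <- c(1, 1, 2, 2, 3, 3)
b <- c(2, 2, 1, 1, 3, 3)
tab <- compare_partitions(a, b)

# Two partitions agree up to relabelling when every row and every column
# of the cross-table has exactly one non-zero cell
agree <- all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
```

Cluster labels are arbitrary, so a cross-table is used here rather than a direct comparison of the label vectors.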