# qkspecc: qkernel Spectral Clustering

In qkerntool: Q-Kernel-Based and Conditionally Negative Definite Kernel-Based Machine Learning Tools

## Description

A qkernel spectral clustering algorithm. Clustering is performed by embedding the data into the subspace of the eigenvectors of a graph Laplacian matrix.

## Usage

```r
## S4 method for signature 'matrix'
qkspecc(x, kernel = "rbfbase", qpar = list(sigma = 2, q = 0.9),
        Nocent = NA, normalize = "symmetric", maxk = 20,
        iterations = 200, na.action = na.omit, ...)

## S4 method for signature 'cndkernmatrix'
qkspecc(x, Nocent = NA, normalize = "symmetric", maxk = 20,
        iterations = 200, ...)

## S4 method for signature 'qkernmatrix'
qkspecc(x, Nocent = NA, normalize = "symmetric", maxk = 20,
        iterations = 200, ...)
```

## Arguments

- `x`: the matrix of data to be clustered, or a kernel matrix of class `qkernmatrix` or `cndkernmatrix`.
- `kernel`: the kernel function used in computing the affinity matrix. This parameter can be set to any function of class kernel that computes a kernel function value between two vector arguments. qkerntool provides the most popular kernel functions, which can be selected by setting the kernel parameter to one of the following strings:
  - `rbfbase`: Radial Basis qkernel function "Gaussian"
  - `nonlbase`: Non Linear qkernel function
  - `laplbase`: Laplacian qkernel function
  - `ratibase`: Rational Quadratic qkernel function
  - `multbase`: Multiquadric qkernel function
  - `invbase`: Inverse Multiquadric qkernel function
  - `wavbase`: Wave qkernel function
  - `powbase`: d qkernel function
  - `logbase`: Log qkernel function
  - `caubase`: Cauchy qkernel function
  - `chibase`: Chi-Square qkernel function
  - `studbase`: Generalized T-Student qkernel function
  - `nonlcnd`: Non Linear cndkernel function
  - `polycnd`: Polynomial cndkernel function
  - `rbfcnd`: Radial Basis cndkernel function "Gaussian"
  - `laplcnd`: Laplacian cndkernel function
  - `anocnd`: ANOVA cndkernel function
  - `raticnd`: Rational Quadratic cndkernel function
  - `multcnd`: Multiquadric cndkernel function
  - `invcnd`: Inverse Multiquadric cndkernel function
  - `wavcnd`: Wave cndkernel function
  - `powcnd`: d cndkernel function
  - `logcnd`: Log cndkernel function
  - `caucnd`: Cauchy cndkernel function
  - `chicnd`: Chi-Square cndkernel function
  - `studcnd`: Generalized T-Student cndkernel function

  The kernel parameter can also be set to a user-defined function of class kernel by passing the function name as an argument.
- `qpar`: a character string or a list of hyper-parameters (kernel parameters). The default is `list(sigma = 2, q = 0.9)`. The character string `"local"` (local scaling) uses a heuristic that sets a width parameter for every point in the data set, which is particularly useful when the data incorporates multiple scales. A list can also be used containing the parameters to be used with the kernel function. Valid parameters for existing kernels are:
  - `sigma, q` for the Radial Basis qkernel function "rbfbase", the Laplacian qkernel function "laplbase" and the Cauchy qkernel function "caubase".
  - `alpha, q` for the Non Linear qkernel function "nonlbase".
  - `c, q` for the Rational Quadratic qkernel function "ratibase", the Multiquadric qkernel function "multbase" and the Inverse Multiquadric qkernel function "invbase".
  - `theta, q` for the Wave qkernel function "wavbase".
  - `d, q` for the d qkernel function "powbase", the Log qkernel function "logbase" and the Generalized T-Student qkernel function "studbase".
  - `alpha` for the Non Linear cndkernel function "nonlcnd".
  - `d, alpha, c` for the Polynomial cndkernel function "polycnd".
  - `gamma` for the Radial Basis cndkernel function "rbfcnd", the Laplacian cndkernel function "laplcnd" and the Cauchy cndkernel function "caucnd".
  - `d, sigma` for the ANOVA cndkernel function "anocnd".
  - `c` for the Rational Quadratic cndkernel function "raticnd", the Multiquadric cndkernel function "multcnd" and the Inverse Multiquadric cndkernel function "invcnd".
  - `theta` for the Wave cndkernel function "wavcnd".
  - `d` for the d cndkernel function "powcnd", the Log cndkernel function "logcnd" and the Generalized T-Student cndkernel function "studcnd".

  Hyper-parameters for user-defined kernels can be passed through the `qpar` parameter as well.
- `Nocent`: the number of clusters.
- `normalize`: normalisation of the Laplacian (`"none"`, `"symmetric"` or `"random-walk"`).
- `maxk`: if `Nocent` is `NA`, an upper bound for the automatic estimation of the number of clusters. Defaults to 20.
- `iterations`: the maximum number of iterations allowed.
- `na.action`: the action to perform on `NA`.
- `...`: additional parameters.
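
This page does not say how the number of clusters is estimated when `Nocent` is `NA`. One common heuristic in spectral clustering is the eigengap: pick the k at which the gap between consecutive small Laplacian eigenvalues is largest, searching only up to `maxk`. The base-R sketch below illustrates that idea under stated assumptions; the function `eigengap_k` and the plain Gaussian affinity are hypothetical and are not part of qkerntool, and `qkspecc` may use a different rule.

```r
# Hypothetical helper (not part of qkerntool): eigengap estimate of the
# number of clusters from a symmetric, non-negative affinity matrix W.
eigengap_k <- function(W, maxk = 20) {
  n <- nrow(W)
  d <- rowSums(W)
  # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
  L <- diag(n) - W / outer(sqrt(d), sqrt(d))
  # Eigenvalues in increasing order
  vals <- sort(eigen(L, symmetric = TRUE, only.values = TRUE)$values)
  maxk <- min(maxk, n - 1)
  # gaps[i] is the jump from the i-th to the (i+1)-th smallest eigenvalue
  gaps <- diff(vals[seq_len(maxk + 1)])
  which.max(gaps)
}

# Usage on three well-separated synthetic blobs
set.seed(42)
centers <- rbind(c(0, 0), c(4, 4), c(8, 0))
x <- do.call(rbind, lapply(1:3, function(i)
  cbind(rnorm(15, centers[i, 1], 0.3), rnorm(15, centers[i, 2], 0.3))))
W <- exp(-as.matrix(dist(x))^2 / 2)  # Gaussian affinity, width 1 (assumed)
diag(W) <- 0
k_hat <- eigengap_k(W, maxk = 10)
```

The cap `maxk` keeps the search away from the upper end of the spectrum, where gaps carry no information about cluster structure.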

## Details

qkernel spectral clustering works by embedding the data points of the partitioning problem into the subspace of the eigenvectors corresponding to the k smallest eigenvalues of the graph Laplacian matrix. Using a simple clustering method like `kmeans` on the embedded points usually leads to good performance. It can be shown that qkernel spectral clustering methods boil down to graph partitioning.
The data can be passed to the `qkspecc` function as a `matrix`. In addition, `qkspecc` supports input in the form of a kernel matrix of class `qkernmatrix` or `cndkernmatrix`.
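
The embedding step described above can be sketched in base R. This is a minimal illustration, assuming a plain Gaussian affinity in place of a qkernel and the textbook definitions of the three Laplacian normalisations; the function `spectral_sketch` and its arguments are hypothetical and do not reproduce the qkerntool implementation.

```r
# Illustrative only: textbook spectral clustering, not the qkerntool code.
spectral_sketch <- function(x, k, sigma = 1,
                            normalize = c("symmetric", "random-walk", "none"),
                            iterations = 200) {
  normalize <- match.arg(normalize)
  # Affinity matrix from a Gaussian kernel (stand-in for a qkernel)
  W <- exp(-as.matrix(dist(x))^2 / (2 * sigma^2))
  diag(W) <- 0
  n <- nrow(W)
  d <- rowSums(W)
  # Graph Laplacian: unnormalized D - W, or symmetric I - D^{-1/2} W D^{-1/2}
  L <- if (normalize == "none") diag(d) - W
       else diag(n) - W / outer(sqrt(d), sqrt(d))
  ev <- eigen(L, symmetric = TRUE)   # eigenvalues come back in decreasing order
  idx <- n:(n - k + 1)               # so the k smallest are the last k
  U <- ev$vectors[, idx, drop = FALSE]
  if (normalize == "symmetric") {
    U <- U / sqrt(rowSums(U^2))      # rescale each embedded point to unit length
  } else if (normalize == "random-walk") {
    U <- U / sqrt(d)                 # eigenvectors of I - D^{-1} W
  }
  km <- kmeans(U, centers = k, nstart = 10, iter.max = iterations)
  list(cluster = km$cluster, embedding = U, values = ev$values[idx])
}

# Usage on two well-separated synthetic blobs
set.seed(1)
x <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(40, mean = 4, sd = 0.3), ncol = 2))
res <- spectral_sketch(x, k = 2, sigma = 1)
```

For the random-walk case the sketch reuses the symmetric eigendecomposition and rescales the eigenvectors, which avoids calling `eigen` on a non-symmetric matrix.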

## Value

An S4 object of class `qkspecc`, which extends the class `vector` and contains integers indicating the cluster to which each point is allocated. The following slots contain useful information:

- `clust`: the cluster assignments
- `eVec`: the corresponding eigenvector
- `eVal`: the corresponding eigenvalues
- `ymatrix`: the eigenvectors corresponding to the k smallest eigenvalues of the graph Laplacian matrix

## Author(s)

Yusen Zhang
yusenzhang@126.com

## References

Andrew Y. Ng, Michael I. Jordan, Yair Weiss.
On Spectral Clustering: Analysis and an Algorithm.
Advances in Neural Information Processing Systems 14 (NIPS 2001).

## See Also

`qkernmatrix`, `cndkernmatrix`, `qkpca`

## Examples

```r
library(qkerntool)

data("iris")
x <- as.matrix(iris[, -5])

# Cluster the raw data matrix with the RBF qkernel
qspe <- qkspecc(x, kernel = "rbfbase", qpar = list(sigma = 10, q = 0.9),
                Nocent = 3, normalize = "symmetric", maxk = 15,
                iterations = 1200)
plot(x, col = clust(qspe))

# Cluster a precomputed qkernel matrix
qkfunc <- nonlbase(alpha = 1/15, q = 0.8)
Ktrain <- qkernmatrix(qkfunc, x)
qspe <- qkspecc(Ktrain, Nocent = 3, normalize = "symmetric", maxk = 20)
plot(x, col = clust(qspe))
```