CDpca performs a clustering and disjoint principal components analysis (CDPCA) on the given numeric data matrix and returns a list of results. Given an (I x J) real data matrix X = [xij], the CDPCA methodology clusters the I objects into P nonempty and nonoverlapping clusters Cp, p = 1,...,P, which are identified by their centroids, and simultaneously partitions the J attributes into Q disjoint components, PCq, q = 1,...,Q. The CDpca function fits this model to X, estimating its parameters with an Alternating Least Squares (ALS) procedure originally proposed by Vichi and Saporta (2009) and described in two steps by Macedo and Freitas (2015).
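The two partitions described above can be encoded as binary membership matrices, which is how the assignment matrices U and V listed below are usually read. The following base R sketch is only an illustration: the helper membership_matrix and the label vectors are hypothetical and are not part of the package.

```r
# Hypothetical helper (not part of the package): turn a vector of cluster
# labels in 1..k into a binary membership matrix with one 1 per row.
membership_matrix <- function(labels, k) {
  M <- matrix(0L, nrow = length(labels), ncol = k)
  M[cbind(seq_along(labels), labels)] <- 1L
  M
}

# Made-up partitions: 5 objects into P = 3 clusters, 4 attributes into Q = 2 components.
U <- membership_matrix(c(1, 1, 2, 3, 2), k = 3)  # I x P
V <- membership_matrix(c(1, 2, 2, 1), k = 2)     # J x Q

rowSums(U)  # each object belongs to exactly one cluster (nonoverlapping)
colSums(U)  # every cluster holds at least one object (nonempty)
```

The same two checks applied to V express that each attribute enters exactly one component and that no component is empty.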
data |
A numeric matrix or data frame which provides the data for the CDPCA |
class |
A numeric vector containing the real classification of the objects in the data, or NULL if the class of objects is unknown |
P |
An integer value indicating the number of clusters of objects |
Q |
An integer value indicating the number of clusters of variables |
SDPinitial |
A logical value indicating whether the initial assignment matrices U and V are randomly generated (FALSE, the default) or obtained from an algorithmic framework based on a semidefinite programming approach (TRUE) |
tol |
A small positive value used as the convergence threshold for the difference between two consecutive values of the objective function. The default tolerance is 10^(-5) |
maxit |
The maximum number of iterations of one run of the ALS algorithm |
r |
Number of runs of the ALS algorithm for the final solution |
cdpcaplot |
A logical value indicating whether an additional graphic is created (showing the data projected on the first two CDPCA principal components) |
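How tol, maxit and r interact can be seen on a deliberately simplified stand-in. The sketch below is not the CDPCA algorithm and not the package's code: it alternates only between centroids and object assignments (a k-means-style procedure) and maximizes the between-cluster deviance of the centered data. The function names between_dev and als_runs are hypothetical.

```r
# Hypothetical illustration only: a k-means-style alternating procedure,
# NOT the algorithm implemented by CDpca.
between_dev <- function(X, cl, P) {
  size <- tabulate(cl, nbins = P)
  cent <- rowsum(X, cl) / size                 # cluster centroids
  list(cent = cent, F = sum(size * rowSums(cent^2)))
}

als_runs <- function(X, P, tol = 1e-5, maxit = 100, r = 10) {
  X <- scale(as.matrix(X), center = TRUE, scale = FALSE)
  n <- nrow(X)
  best <- list(F = -Inf)
  for (loop in seq_len(r)) {                   # r independent runs
    # Random start in which every cluster receives at least one object.
    cl <- sample(c(seq_len(P), sample.int(P, n - P, replace = TRUE)))
    cur <- between_dev(X, cl, P)
    for (iter in seq_len(maxit)) {             # at most maxit iterations per run
      # Reassign each object to its nearest centroid.
      d <- sapply(seq_len(P), function(p) rowSums(sweep(X, 2, cur$cent[p, ])^2))
      new_cl <- max.col(-d, ties.method = "first")
      if (length(unique(new_cl)) < P) break    # keep all clusters nonempty
      new <- between_dev(X, new_cl, P)
      gain <- new$F - cur$F
      cl <- new_cl
      cur <- new
      if (gain < tol) break                    # objective has stopped improving
    }
    if (cur$F > best$F) {                      # keep the best run
      best <- list(F = cur$F, loop = loop, Iter = iter, cluster = cl)
    }
  }
  best
}

set.seed(1)
res <- als_runs(iris[, 1:4], P = 3, r = 5)
res$loop  # index of the run that gave the largest objective value
res$Iter  # iterations used in that run
```

Several runs are used because each random start can end in a different local optimum; only the run with the largest objective value is kept.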
CDpca returns a list of results containing the following components:
Iter |
The total number of iterations used in the best loop for computing the best solution |
loop |
The best loop number |
timebestloop |
The computation time on the best loop |
timeallloops |
The computation time for all loops |
Y |
The component score matrix |
Ybar |
The object centroids matrix in the reduced space |
A |
The component loading matrix |
U |
The partition of objects |
V |
The partition of variables |
F |
The value of the objective function to maximize |
bcdev |
The between cluster deviance |
bcdevTotal |
The between cluster deviance over the total variability |
tableclass |
The CDPCA classification |
pseudocm |
The pseudo-confusion matrix comparing the true classification (given by class) with the CDPCA classification |
Enorm |
The error norm for the obtained CDPCA model |
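Some of these components are related in simple ways, which the base R sketch below illustrates. It assumes that U is a binary membership matrix and that the centroids in the reduced space are the cluster means of the component scores. The values of U, Y and true_class are invented for the example and are not output of CDpca.

```r
# Hypothetical values standing in for a result; not real CDpca output.
U <- rbind(c(1, 0), c(1, 0), c(0, 1), c(0, 1), c(1, 0))   # 5 objects, 2 clusters
Y <- cbind(c(2.1, 1.9, -2.0, -2.2, 0.2),
           c(0.3, -0.1, 0.4, -0.5, -0.1))                 # component scores
true_class <- c(1, 1, 2, 2, 2)                            # known classification

# Cluster label of each object, read off the membership matrix.
cluster <- max.col(U, ties.method = "first")

# Cross-tabulation of the known classes against the cluster labels,
# in the spirit of a pseudo-confusion matrix.
confusion <- table(true = true_class, cdpca = cluster)

# Centroids in the reduced space as cluster means of the scores:
# solves (U'U) Ybar = U'Y.
Ybar <- solve(crossprod(U), crossprod(U, Y))
```

Cluster labels are arbitrary, so the rows and columns of such a table need not line up on the diagonal even when the two classifications agree.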
Eloisa Macedo macedo@ua.pt, Adelaide Freitas adelaide@ua.pt, Maurizio Vichi maurizio.vichi@uniroma1.it
Vichi, M. and Saporta, G. (2009). Clustering and disjoint principal component analysis. Computational Statistics and Data Analysis, 53, 3194-3208.
Macedo, E. and Freitas, A. (2015). The alternating least-squares algorithm for CDPCA. Communications in Computer and Information Science (CCIS), Springer Verlag, pp. 173-191.