getSepProj | R Documentation |
Optimal projection direction and corresponding separation index for pairs of clusters.
getSepProjTheory(
muMat,
SigmaArray,
iniProjDirMethod = c("SL", "naive"),
projDirMethod = c("newton", "fixedpoint"),
alpha = 0.05,
ITMAX = 20,
eps = 1.0e-10,
quiet = TRUE)
getSepProjData(
y,
cl,
iniProjDirMethod = c("SL", "naive"),
projDirMethod = c("newton", "fixedpoint"),
alpha = 0.05,
ITMAX = 20,
eps = 1.0e-10,
quiet = TRUE)
muMat |
Matrix of mean vectors. Rows correspond to mean vectors for clusters. |
SigmaArray |
Array of covariance matrices. |
y |
Data matrix. Rows correspond to observations. Columns correspond to variables. |
cl |
Cluster membership vector. |
iniProjDirMethod |
Method used to obtain the initial projection direction when calculating
the separation index between a pair of clusters (cf. Qiu and Joe,
2006a, 2006b). |
projDirMethod |
Method used to obtain the optimal projection direction when calculating
the separation index between a pair of clusters (cf. Qiu and Joe,
2006a, 2006b). |
alpha |
Tuning parameter reflecting the percentage in the two
tails of a projected cluster that might be outlying.
We set |
ITMAX |
Maximum number of iterations allowed when iteratively calculating the optimal projection direction. The actual number of iterations is usually much smaller than the default value of 20. |
eps |
Convergence threshold. A small positive number to check if a quantity
|
quiet |
A flag to switch on/off the outputs of intermediate results and/or possible warning messages. The default value is |
When calculating the optimal projection direction and the corresponding optimal
separation index for a pair of clusters, the ‘newton’ method cannot be used if
one or both cluster covariance matrices are singular.
In this case, the functions getSepProjTheory and getSepProjData
automatically use the ‘fixedpoint’ method to search for the optimal
projection direction, even if the user sets the argument projDirMethod
to ‘newton’. In addition, multiple initial projection
directions are evaluated.
Specifically, 2+2*p projection directions are evaluated, where p is the
number of variables. The first projection direction is the “naive” direction
$\boldsymbol{\mu}_2-\boldsymbol{\mu}_1$.
The second projection direction is the “SL” projection direction
$\left(\boldsymbol{\Sigma}_1+\boldsymbol{\Sigma}_2\right)^{-1}\left(\boldsymbol{\mu}_2-\boldsymbol{\mu}_1\right)$.
The next p projection directions are the p eigenvectors of the covariance
matrix of the first cluster. The remaining p projection directions are
the p eigenvectors of the covariance matrix of the second cluster.
Each of these 2+2*p projection directions is used in turn as the initial
projection direction for the ‘fixedpoint’ algorithm to obtain an
optimal projection direction and the corresponding optimal separation index.
We also obtain 2+2*p separation indices by projecting the two clusters along
each of these 2+2*p projection directions themselves.
Finally, among the resulting 2*(2+2*p) separation indices, the projection
direction with the largest separation index is chosen as the optimal
projection direction, and the corresponding separation index is chosen as the
optimal separation index.
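The candidate directions described above can be sketched in base R. This is only an illustration, not the package's internal code: candidateDirs and sepIndex are hypothetical helpers, and the separation index formula used here is an assumption, namely the normal-theory form (d - z*s) / (d + z*s), with d the absolute projected mean difference, s the sum of the two projected standard deviations, and z the upper alpha/2 standard normal quantile. The sketch evaluates only the 2+2*p directions themselves and does not run the ‘fixedpoint’ refinement.

```r
# Hypothetical helper (not part of the package): the 2 + 2p candidate
# initial directions for one pair of clusters, each scaled to unit length.
candidateDirs <- function(mu1, mu2, Sigma1, Sigma2) {
  delta <- mu2 - mu1
  dirs <- cbind(
    naive = delta,                              # mu2 - mu1
    SL = solve(Sigma1 + Sigma2, delta),         # (Sigma1 + Sigma2)^{-1} (mu2 - mu1)
    eigen(Sigma1, symmetric = TRUE)$vectors,    # p eigenvectors of Sigma1
    eigen(Sigma2, symmetric = TRUE)$vectors     # p eigenvectors of Sigma2
  )
  sweep(dirs, 2, sqrt(colSums(dirs^2)), "/")
}

# Hypothetical helper: separation index along direction a, using the
# ASSUMED normal-theory form (d - z * s) / (d + z * s).
sepIndex <- function(a, mu1, mu2, Sigma1, Sigma2, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  d <- abs(sum(a * (mu2 - mu1)))                # projected mean difference
  s <- sqrt(drop(t(a) %*% Sigma1 %*% a)) +      # projected standard deviations
       sqrt(drop(t(a) %*% Sigma2 %*% a))
  (d - z * s) / (d + z * s)
}

mu1 <- c(0, 0)
mu2 <- c(10, 0)
Sigma1 <- matrix(c(2, 1, 1, 5), 2, 2)
Sigma2 <- matrix(c(5, -1, -1, 2), 2, 2)

dirs <- candidateDirs(mu1, mu2, Sigma1, Sigma2)   # p x (2 + 2p) matrix
idx <- apply(dirs, 2, sepIndex, mu1 = mu1, mu2 = mu2,
             Sigma1 = Sigma1, Sigma2 = Sigma2)
best <- dirs[, which.max(idx)]                    # best candidate direction
```

Note that solve() fails when Sigma1 + Sigma2 is itself singular, so a real implementation would need to handle that case separately.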
sepValMat |
Separation index matrix |
projDirArray |
Array of projection directions for each pair of clusters |
Weiliang Qiu weiliang.qiu@gmail.com
Harry Joe harry@stat.ubc.ca
Qiu, W.-L. and Joe, H. (2006a) Generation of Random Clusters with Specified Degree of Separation. Journal of Classification, 23(2), 315–334.
Qiu, W.-L. and Joe, H. (2006b) Separation Index and Partial Membership for Clustering. Computational Statistics and Data Analysis, 50, 585–603.
Su, J. Q. and Liu, J. S. (1993) Linear Combinations of Multiple Diagnostic Markers. Journal of the American Statistical Association, 88, 1350–1355.
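# sizes, mean vectors, and covariance matrices of the two clusters
n1 <- 50
mu1 <- c(0, 0)
Sigma1 <- matrix(c(2, 1, 1, 5), 2, 2)
n2 <- 100
mu2 <- c(10, 0)
Sigma2 <- matrix(c(5, -1, -1, 2), 2, 2)
projDir <- c(1, 0)
# one mean vector per row; one covariance matrix per slice of the array
muMat <- rbind(mu1, mu2)
SigmaArray <- array(0, c(2, 2, 2))
SigmaArray[, , 1] <- Sigma1
SigmaArray[, , 2] <- Sigma2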
a <- getSepProjTheory(
muMat = muMat,
SigmaArray = SigmaArray,
iniProjDirMethod = "SL")
# separation index for cluster distributions 1 and 2
a$sepValMat[1, 2]
# projection direction for cluster distributions 1 and 2
a$projDirArray[1, 2, ]
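# simulate observations from the two cluster distributions
library(MASS)
y1 <- mvrnorm(n1, mu1, Sigma1)
y2 <- mvrnorm(n2, mu2, Sigma2)
# stack the observations and record the cluster membership of each row
y <- rbind(y1, y2)
cl <- rep(1:2, c(n1, n2))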
b <- getSepProjData(
y = y,
cl = cl,
iniProjDirMethod = "SL",
projDirMethod = "newton")
# separation index for clusters 1 and 2
b$sepValMat[1, 2]
# projection direction for clusters 1 and 2
b$projDirArray[1, 2, ]
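To relate the two interfaces, the sketch below builds a muMat and a SigmaArray from a data matrix and a membership vector using sample means and sample covariance matrices, and flags covariance matrices that are numerically singular. Whether getSepProjData computes its inputs in exactly this way is an assumption; the simulated data, the variable names, and the rcond-based singularity check are illustrative only.

```r
set.seed(1)
# illustrative data: 50 and 100 observations on 2 variables
y <- rbind(matrix(rnorm(100), 50, 2),
           matrix(rnorm(200, mean = 3), 100, 2))
cl <- rep(1:2, c(50, 100))

labels <- sort(unique(cl))
p <- ncol(y)

# one row of sample means per cluster
muMat <- t(sapply(labels, function(k) colMeans(y[cl == k, , drop = FALSE])))

# one p x p sample covariance matrix per cluster
SigmaArray <- array(0, c(p, p, length(labels)))
for (i in seq_along(labels)) {
  SigmaArray[, , i] <- cov(y[cl == labels[i], , drop = FALSE])
}

# flag covariance matrices that are numerically singular
# (the threshold is an illustrative choice)
isSingular <- apply(SigmaArray, 3, function(S) rcond(S) < .Machine$double.eps)
```

The resulting objects have the same shapes as the muMat and SigmaArray arguments used in the example above, so they could be passed to getSepProjTheory for comparison with the result of getSepProjData.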