getSepProj: OPTIMAL PROJECTION DIRECTION AND CORRESPONDING SEPARATION...
In clusterGeneration: Random Cluster Generation (with Specified Degree of Separation)

getSepProj

R Documentation

OPTIMAL PROJECTION DIRECTION AND CORRESPONDING SEPARATION INDEX FOR PAIRS OF CLUSTERS

Description

Optimal projection direction and corresponding separation index for pairs of clusters.

Usage

getSepProjTheory(
  muMat,
  SigmaArray,
  iniProjDirMethod = c("SL", "naive"),
  projDirMethod = c("newton", "fixedpoint"),
  alpha = 0.05,
  ITMAX = 20,
  eps = 1.0e-10,
  quiet = TRUE)

getSepProjData(
  y,
  cl,
  iniProjDirMethod = c("SL", "naive"),
  projDirMethod = c("newton", "fixedpoint"),
  alpha = 0.05,
  ITMAX = 20,
  eps = 1.0e-10,
  quiet = TRUE)

Arguments

`muMat`	Matrix of mean vectors. Rows correspond to mean vectors for clusters.
`SigmaArray`	Array of covariance matrices. `SigmaArray[,,i]` records the covariance matrix of the `i`-th cluster.
`y`	Data matrix. Rows correspond to observations. Columns correspond to variables.
`cl`	Cluster membership vector.
`iniProjDirMethod`	Method used to obtain the initial projection direction when calculating the separation index between a pair of clusters (cf. Qiu and Joe, 2006a, 2006b). `iniProjDirMethod`=“SL” sets the initial projection direction to the sample version of the SL projection direction (Su and Liu, 1993), `\left(\boldsymbol{\Sigma}_1+\boldsymbol{\Sigma}_2\right)^{-1}\left(\boldsymbol{\mu}_2-\boldsymbol{\mu}_1\right)`. `iniProjDirMethod`=“naive” sets the initial projection direction to `\boldsymbol{\mu}_2-\boldsymbol{\mu}_1`.
`projDirMethod`	Method used to find the optimal projection direction when calculating the separation index between a pair of clusters (cf. Qiu and Joe, 2006a, 2006b). `projDirMethod`=“newton” uses the Newton-Raphson method to search for the optimal projection direction (cf. Qiu and Joe, 2006a). This requires that the covariance matrices of both clusters be positive definite. If this assumption is violated, the “fixedpoint” method can be used instead. The “fixedpoint” method iteratively searches for the optimal projection direction using the first derivative of the separation index with respect to the projection direction (cf. Qiu and Joe, 2006b).
`alpha`	Tuning parameter reflecting the percentage in the two tails of a projected cluster that might be outlying. The default `alpha=0.05` is chosen by analogy with the conventional `0.05` significance level in hypothesis testing.
`ITMAX`	Maximum number of iterations allowed when iteratively calculating the optimal projection direction. The actual number of iterations is usually much smaller than the default value of 20.
`eps`	Convergence threshold. A small positive number used to check whether a quantity `q` is equal to zero: if `|q|<eps`, then `q` is regarded as equal to zero. `eps` is used to check whether an algorithm has converged. The default value is `1.0e-10`.
`quiet`	A flag to switch on/off the outputs of intermediate results and/or possible warning messages. The default value is `TRUE`.
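
The two initial directions and the role of `alpha` can be illustrated in base R. The sketch below is not the package's implementation: `sep_index` is a hypothetical helper, and the closed form it uses, `J(a) = (d - z s) / (d + z s)` with `d = |a'(mu2 - mu1)|`, `s = sqrt(a' Sigma1 a) + sqrt(a' Sigma2 a)` and `z` the upper `alpha/2` standard normal quantile, is an assumption based on the separation index of Qiu and Joe (2006b) for normally distributed clusters.

```r
# Population parameters of two clusters (same values as in the Examples section)
mu1 <- c(0, 0)
Sigma1 <- matrix(c(2, 1, 1, 5), 2, 2)
mu2 <- c(10, 0)
Sigma2 <- matrix(c(5, -1, -1, 2), 2, 2)

# "naive" initial direction: difference of the mean vectors
dir_naive <- mu2 - mu1

# "SL" initial direction: (Sigma1 + Sigma2)^{-1} (mu2 - mu1)
dir_SL <- solve(Sigma1 + Sigma2, mu2 - mu1)

# Hypothetical helper (an assumption, not a package function): separation
# index of two normal clusters projected onto direction a.
sep_index <- function(a, mu1, Sigma1, mu2, Sigma2, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)               # upper alpha/2 normal quantile
  d <- abs(sum(a * (mu2 - mu1)))          # distance between projected means
  s <- sqrt(drop(t(a) %*% Sigma1 %*% a)) +
    sqrt(drop(t(a) %*% Sigma2 %*% a))     # sum of projected standard deviations
  (d - z * s) / (d + z * s)
}

J_naive <- sep_index(dir_naive, mu1, Sigma1, mu2, Sigma2)
J_SL <- sep_index(dir_SL, mu1, Sigma1, mu2, Sigma2)

# A smaller alpha treats less of each tail as outlying, which lowers the index
J_strict <- sep_index(dir_naive, mu1, Sigma1, mu2, Sigma2, alpha = 0.01)
```

For these particular parameters `Sigma1 + Sigma2` equals 7 times the identity matrix, so the “SL” and “naive” directions are parallel and give the same index. In general the two starting points differ.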

Details

When calculating the optimal projection direction and the corresponding optimal separation index for a pair of clusters, the ‘newton’ method cannot be used if one or both cluster covariance matrices are singular. In this case, the functions getSepProjTheory and getSepProjData automatically use the ‘fixedpoint’ method to search for the optimal projection direction, even if the user sets the argument projDirMethod to ‘newton’. In addition, multiple initial projection directions are evaluated.
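
The fallback rule can be sketched as follows. Both `is_pos_def` and `choose_method` are hypothetical helpers written for illustration; the eigenvalue test and its tolerance are assumptions and may differ from the check the package performs internally.

```r
# Hypothetical helper: TRUE if the symmetric matrix S is numerically
# positive definite (all eigenvalues above a small relative tolerance)
is_pos_def <- function(S, tol = 1e-10) {
  ev <- eigen(S, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol * max(abs(ev), 1))
}

# Hypothetical helper: fall back to "fixedpoint" when "newton" is requested
# but at least one covariance matrix is singular
choose_method <- function(Sigma1, Sigma2,
                          requested = c("newton", "fixedpoint")) {
  requested <- match.arg(requested)
  if (requested == "newton" &&
      !(is_pos_def(Sigma1) && is_pos_def(Sigma2))) {
    return("fixedpoint")
  }
  requested
}

Sigma_full <- matrix(c(2, 1, 1, 5), 2, 2)  # positive definite
Sigma_sing <- matrix(c(1, 1, 1, 1), 2, 2)  # rank 1, hence singular

m_ok <- choose_method(Sigma_full, Sigma_full, "newton")    # "newton"
m_back <- choose_method(Sigma_full, Sigma_sing, "newton")  # "fixedpoint"
```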

Specifically, `2+2p` projection directions are evaluated, where `p` is the number of variables. The first projection direction is the “naive” direction `\boldsymbol{\mu}_2-\boldsymbol{\mu}_1`. The second projection direction is the “SL” projection direction `\left(\boldsymbol{\Sigma}_1+\boldsymbol{\Sigma}_2\right)^{-1}\left(\boldsymbol{\mu}_2-\boldsymbol{\mu}_1\right)`. The next `p` projection directions are the `p` eigenvectors of the covariance matrix of the first cluster. The remaining `p` projection directions are the `p` eigenvectors of the covariance matrix of the second cluster.
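
A minimal base R sketch of how such a set of candidate directions could be assembled is given below. `candidate_dirs` is a hypothetical helper, not a package function, and it assumes that the sum of the two covariance matrices is invertible even when one of them is singular.

```r
# Hypothetical helper: stack the 2 + 2p candidate directions as rows
candidate_dirs <- function(mu1, Sigma1, mu2, Sigma2) {
  delta <- mu2 - mu1
  dirs <- rbind(
    naive = delta,                               # mu2 - mu1
    SL = solve(Sigma1 + Sigma2, delta),          # assumes the sum is invertible
    t(eigen(Sigma1, symmetric = TRUE)$vectors),  # p eigenvectors of Sigma1
    t(eigen(Sigma2, symmetric = TRUE)$vectors)   # p eigenvectors of Sigma2
  )
  # Scale every row to unit length
  dirs / sqrt(rowSums(dirs^2))
}

# Example with p = 3 in which the first covariance matrix is singular
mu1 <- c(0, 0, 0)
mu2 <- c(4, 0, 3)
Sigma1 <- diag(c(1, 1, 0))
Sigma2 <- diag(3)

dirs <- candidate_dirs(mu1, Sigma1, mu2, Sigma2)
dim(dirs)  # 2 + 2 * 3 = 8 rows, 3 columns
```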

Each of these `2+2p` projection directions is in turn used as the initial projection direction for the ‘fixedpoint’ algorithm to obtain an optimal projection direction and the corresponding optimal separation index. We also obtain `2+2p` separation indices by projecting the two clusters directly along each of these `2+2p` projection directions.

Finally, among the resulting `2(2+2p)` separation indices, the projection direction with the largest separation index is chosen as the optimal projection direction. The corresponding separation index is chosen as the optimal separation index.
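
The selection step can be sketched for the directly projected half of the candidates. The fixed-point refinement is omitted here, so this is only a partial illustration. `sep_index_rows` is a hypothetical helper, and the closed form of the index under normality is an assumption based on Qiu and Joe (2006b), not code taken from the package.

```r
mu1 <- c(0, 0)
Sigma1 <- matrix(c(2, 1, 1, 5), 2, 2)
mu2 <- c(10, 0)
Sigma2 <- matrix(c(5, -1, -1, 2), 2, 2)
delta <- mu2 - mu1

# The 2 + 2p candidate directions, one per row (p = 2 here)
dirs <- rbind(
  naive = delta,
  SL = solve(Sigma1 + Sigma2, delta),
  t(eigen(Sigma1, symmetric = TRUE)$vectors),
  t(eigen(Sigma2, symmetric = TRUE)$vectors)
)

# Hypothetical helper: assumed separation index for every row of dirs.
# abs() orients each direction so the projected mean difference is >= 0.
sep_index_rows <- function(dirs, mu1, Sigma1, mu2, Sigma2, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  d <- abs(drop(dirs %*% (mu2 - mu1)))
  s <- sqrt(rowSums((dirs %*% Sigma1) * dirs)) +
    sqrt(rowSums((dirs %*% Sigma2) * dirs))
  (d - z * s) / (d + z * s)
}

vals <- sep_index_rows(dirs, mu1, Sigma1, mu2, Sigma2)

# Keep the direction with the largest separation index
best_idx <- unname(which.max(vals))
best_val <- unname(vals[best_idx])
best_dir <- dirs[best_idx, ] / sqrt(sum(dirs[best_idx, ]^2))
```

In the full procedure described above, the same comparison would also include the indices reached after running the ‘fixedpoint’ iteration from each candidate.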

Value

`sepValMat`	Separation index matrix
`projDirArray`	Array of projection directions for each pair of clusters
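
The shape of the returned components can be inspected with more than two clusters. The sketch below assumes the clusterGeneration package is installed and guards the call accordingly. The three-cluster parameters are made up for illustration, and the dimensions noted in the comments are inferred from the indexing used in the Examples section, not stated by the original documentation.

```r
if (requireNamespace("clusterGeneration", quietly = TRUE)) {
  # Three illustrative clusters in two dimensions (made-up parameters)
  muMat <- rbind(c(0, 0), c(10, 0), c(0, 10))
  SigmaArray <- array(0, c(2, 2, 3))
  for (k in 1:3) SigmaArray[, , k] <- diag(2)

  res <- clusterGeneration::getSepProjTheory(
    muMat = muMat,
    SigmaArray = SigmaArray,
    iniProjDirMethod = "SL",
    projDirMethod = "newton")

  # Presumably one row and one column per cluster
  print(dim(res$sepValMat))
  # Presumably clusters x clusters x variables
  print(dim(res$projDirArray))
  # Index and direction for the pair of clusters (1, 3)
  print(res$sepValMat[1, 3])
  print(res$projDirArray[1, 3, ])
}
```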

Author(s)

Weiliang Qiu weiliang.qiu@gmail.com
Harry Joe harry@stat.ubc.ca

References

Qiu, W.-L. and Joe, H. (2006a) Generation of Random Clusters with Specified Degree of Separation. Journal of Classification, 23(2), 315–334.

Qiu, W.-L. and Joe, H. (2006b) Separation Index and Partial Membership for Clustering. Computational Statistics and Data Analysis, 50, 585–603.

Su, J. Q. and Liu, J. S. (1993) Linear Combinations of Multiple Diagnostic Markers. Journal of the American Statistical Association, 88, 1350–1355.

Examples

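library(clusterGeneration)

# cluster sizes, mean vectors and covariance matrices of the two clusters
n1 <- 50
mu1 <- c(0, 0)
Sigma1 <- matrix(c(2, 1, 1, 5), 2, 2)
n2 <- 100
mu2 <- c(10, 0)
Sigma2 <- matrix(c(5, -1, -1, 2), 2, 2)
projDir <- c(1, 0)

# one mean vector per row; one covariance matrix per slice
muMat <- rbind(mu1, mu2)
SigmaArray <- array(0, c(2, 2, 2))
SigmaArray[, , 1] <- Sigma1
SigmaArray[, , 2] <- Sigma2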

a <- getSepProjTheory(
  muMat = muMat,
  SigmaArray = SigmaArray,
  iniProjDirMethod = "SL")
# separation index for cluster distributions 1 and 2
a$sepValMat[1, 2]
# projection direction for cluster distributions 1 and 2
a$projDirArray[1, 2, ]

library(MASS)
y1 <- mvrnorm(n1, mu1, Sigma1)
y2 <- mvrnorm(n2, mu2, Sigma2)
y <- rbind(y1, y2)
cl <- rep(1:2, c(n1, n2))

b <- getSepProjData(
  y = y,
  cl = cl,
  iniProjDirMethod = "SL",
  projDirMethod = "newton")
# separation index for clusters 1 and 2
b$sepValMat[1, 2]
# projection direction for clusters 1 and 2
b$projDirArray[1, 2, ]

clusterGeneration documentation built on Aug. 16, 2023, 9:07 a.m.
