
Function to perform local classification where the subclasses are concentrated in different subspaces of the data.


| Argument | Description |
| --- | --- |
| `x` | A matrix or data frame containing the explanatory variables. The method is restricted to numerical data. |
| `grouping` | A factor specifying the class for each observation. |
| `formula` | A formula of the form |
| `data` | Data frame from which the variables specified in `formula` are to be taken. |
| `k` | Prespecifies the final number of clusters. |
| `l` | Prespecifies the dimension of the final cluster-specific subspaces (equal for all clusters). |
| `k0` | Initial number of clusters (that are computed in the entire data space). Must be greater than |
| `a` | Prespecified factor for the cluster number reduction in each iteration step of the algorithm. |
| `prior` | Argument for optional specification of class prior probabilities if different from the relative class frequencies. |
| `inner.loops` | Number of repeated iterations (i.e. recomputation of clustering and cluster-specific subspaces) while the number of clusters and the subspace dimension are kept constant. |
| `predict.train` | Character specifying whether prediction of training data should be pursued. If |
| `verbose` | Logical indicating whether the iteration process should be displayed. |
| `...` | Currently not used. |
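
The arguments `k0`, `a` and `k` together control how the number of clusters shrinks over the iterations. The following is a minimal sketch of such a reduction schedule, not the package's own code. It assumes that the current cluster count is multiplied by `a` and rounded down in each step until `k` is reached, and that `k0` exceeds `k`. The helper name `cluster_schedule` and the rounding rule are illustrative assumptions.

```r
# Hypothetical helper (not part of the package): sequence of cluster
# counts from k0 down to k when the count is multiplied by a per step.
# Assumes floor() rounding; the actual implementation may round differently.
cluster_schedule <- function(k0, k, a) {
  stopifnot(k >= 1, k0 > k, a > 0, a < 1)
  counts <- k0
  current <- k0
  while (current > k) {
    # shrink by factor a, but never go below the final number k
    current <- max(k, floor(a * current))
    counts <- c(counts, current)
  }
  counts
}

cluster_schedule(k0 = 15, k = 3, a = 0.75)
# 15 11 8 6 4 3
```

Under these assumptions, a smaller `a` reaches `k` in fewer steps, at the cost of a coarser reduction.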

For each cluster the class distribution is computed.
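
As an illustration of a clusterwise class distribution, the sketch below cross-tabulates made-up cluster assignments against made-up class labels and normalises each row. It uses base R only and does not reproduce the package internals; in particular, it ignores any weighting by class priors.

```r
# Toy data (made up): cluster assignment and class label per observation.
cluster <- c(1, 1, 1, 2, 2, 3, 3, 3, 3)
class   <- factor(c("a", "a", "b", "b", "b", "a", "b", "b", "b"))

# Cross-tabulate clusters (rows) against classes (columns) ...
counts <- table(cluster = cluster, class = class)
# ... and normalise each row so that it sums to one.
class_dist <- prop.table(counts, margin = 1)
class_dist
```

Each row of `class_dist` is then the relative class frequency within one cluster.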

Returns an object of class `orclass`.

| Component | Description |
| --- | --- |
| `orclus.res` | Object of class |
| `cluster.posteriors` | Matrix of clusterwise class posterior probabilities where clusters are rows and classes are columns. |
| `cluster.priors` | Vector of relative cluster frequencies weighted by class priors. |
| `purity` | Statistics indicating the discriminability of the identified clusters. |
| `prior` | Vector of class prior probabilities. |
| `predict.train` | Prediction of training data if specified. |
| `orclass.call` | (Matched) function call. |
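
To make the relation between class priors, prior-weighted cluster frequencies and clusterwise class posteriors concrete, here is a sketch of one plausible reading using Bayes' rule. The counts and priors are made up, and the computation is an assumption for illustration rather than a transcription of the package source.

```r
# Made-up counts: rows are clusters, columns are classes.
counts <- matrix(c(30, 10,
                    0, 40,
                   20,  0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(cluster = 1:3, class = c("a", "b")))
prior <- c(a = 0.5, b = 0.5)   # assumed class priors

# p(cluster | class): normalise each column by its class total.
p_cluster_given_class <- sweep(counts, 2, colSums(counts), "/")
# Joint p(cluster, class) = p(cluster | class) * p(class).
joint <- sweep(p_cluster_given_class, 2, prior, "*")

# Cluster frequencies weighted by class priors (sum to one).
cluster_priors <- rowSums(joint)
# Class posteriors per cluster: clusters in rows, classes in columns.
cluster_posteriors <- joint / cluster_priors
```

With equal priors and equal class sizes, as in this toy example, the result coincides with plain row-normalised counts; the two differ once the priors deviate from the relative class frequencies.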

Gero Szepannek

Aggarwal, C. and Yu, P. (2000): *Finding generalized projected clusters in high dimensional spaces*,
Proceedings of ACM SIGMOD International Conference on Management of Data, pp. 70-81.

`predict.orclass`, `orclus`, `predict.orclus`

```
# definition of a function for parameterized data simulation
sim.orclus <- function(k = 3, nk = 100, d = 10, l = 4,
                       sd.cl = 0.05, sd.rest = 1, locshift = 1){
  ### input parameters for data generation
  # k        number of clusters
  # nk       observations per cluster
  # d        original dimension of the data
  # l        subspace dimension where the clusters are concentrated
  # sd.cl    (within cluster subspace) standard deviations for data generation
  # sd.rest  standard deviations in the remaining space
  # locshift parameter of a uniform distribution to sample different cluster means
  x <- NULL
  for(i in 1:k){
    # cluster centers
    apts <- locshift*matrix(runif(l*k), ncol = l)
    # sample points in original space
    xi.original <- cbind(matrix(rnorm(nk * l, sd = sd.cl), ncol=l) +
                           matrix(rep(apts[i,], nk), ncol = l, byrow = TRUE),
                         matrix(rnorm(nk * (d-l), sd = sd.rest), ncol = (d-l)))
    # subspace generation
    sym.mat <- matrix(nrow=d, ncol=d)
    for(m in 1:d){
      for(n in 1:m){
        sym.mat[m,n] <- sym.mat[n,m] <- runif(1)
      }
    }
    subspace <- eigen(sym.mat)$vectors
    # transformation
    xi.transformed <- xi.original %*% subspace
    x <- rbind(x, xi.transformed)
  }
  clids <- rep(1:k, each = nk)
  result <- list(x = x, cluster = clids)
  return(result)
}

# simulate data of 2 classes where class 1 consists of 2 subclasses
simdata <- sim.orclus(k = 3, nk = 200, d = 15, l = 4,
                      sd.cl = 0.05, sd.rest = 1, locshift = 1)
x <- simdata$x
y <- c(rep(1,400), rep(2,200))
res <- orclass(x, y, k = 3, l = 4, k0 = 15, a = 0.75)
res

# compare results
table(res$predict.train$class, y)
```
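
The final comparison table can be condensed into a single accuracy figure. The sketch below shows this with made-up label vectors standing in for `res$predict.train$class` and `y`, so that it runs without the package. It assumes that predicted and true labels share the same set of values, so that the diagonal of the table holds the correct predictions.

```r
# Made-up predicted and true labels (stand-ins for
# res$predict.train$class and y from the example above).
pred <- c(1, 1, 2, 2, 1, 2, 2, 1)
y    <- c(1, 1, 2, 1, 1, 2, 2, 2)

# Confusion table: predicted classes in rows, true classes in columns.
confusion <- table(predicted = pred, true = y)

# Share of observations on the diagonal, i.e. correctly classified.
accuracy <- sum(diag(confusion)) / sum(confusion)
error_rate <- 1 - accuracy
```

Because this is computed on the training data, it tends to be an optimistic estimate of performance on new observations.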
