View source: R/clusteringbase.R
Performs a cluster analysis from summarized data.
geva.cluster(
  sv,
  cluster.method = options.cluster.method,
  cl.score.method = options.cl.score.method,
  resolution = 0.3,
  distance.method = options.distance,
  ...,
  grouped.return = FALSE
)

options.cluster.method
# c("hierarchical", "density", "quantiles")

options.cl.score.method
# c("auto", "hclust.height", "density", "centroid")

options.distance
# c("euclidean", "manhattan")
sv | a
cluster.method |
cl.score.method |
resolution |
distance.method |
... | further arguments passed to
grouped.return |
The cluster.method argument determines which grouping subroutine is used to classify the summarized data points based on distance and partitioning. Each option has an equivalent function that can be called directly: "density" uses geva.dcluster(); "hierarchical" uses geva.hcluster(); and "quantiles" calls geva.quantiles(). However, this wrapper function can also be used to join GEVASummary and GEVAGroupSet objects into a single GEVAGroupedSummary object by setting grouped.return to TRUE.
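As a rough illustration of the paragraph above, the following sketch calls the wrapper and one of the direct functions on the same summary, then requests a grouped return. It uses only functions named on this page; the variable names are illustrative, the `requireNamespace()` guard is an addition so the snippet is skipped when the geva package is not installed, and no claim is made that the wrapper and direct results are identical.

```r
# Sketch only: the guard below is an assumption added for portability,
# so that nothing runs when the geva package is unavailable.
if (requireNamespace("geva", quietly = TRUE)) {
  library(geva)

  # Same preparation steps as in the Examples section
  ginput   <- geva.ideal.example()
  gsummary <- geva.summarize(ginput)

  # Density clustering through the wrapper and through the direct function
  by_wrapper <- geva.cluster(gsummary, cluster.method = "density")
  by_direct  <- geva.dcluster(gsummary)

  # Joining the summary and the grouping results into a single object
  ggrouped <- geva.cluster(gsummary,
                           cluster.method = "quantiles",
                           grouped.return = TRUE)

  # Inspect what each call returned
  print(class(by_wrapper))
  print(class(by_direct))
  print(class(ggrouped))
}
```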
The cl.score.method argument defines how scores are calculated for each SV point (row in sv) that was assigned to a cluster (i.e., excluding non-clustered points). If specified as "auto", the method is selected based on the cluster.method: "density" (rate of neighbor points) for the density method, and "hclust.height" (local hierarchy height) for the hierarchical method. The "centroid" method calculates the scores based on the proportional distance from each point to its cluster's centroid. Note that the cl.score.method argument is ignored if cluster.method is "quantiles", since quantile scores are always based on their local centroid distances.
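To make the idea of a centroid-based score concrete, here is a minimal base-R sketch. It is not geva's implementation: the helper `centroid_scores()` is hypothetical, and both the scaling (distance divided by the largest distance within the cluster) and the direction (scores closer to 1 for points nearer the centroid) are assumptions made for illustration. Non-clustered points are marked as NA and receive no score.

```r
# Hypothetical helper, not part of geva: scores each clustered point by its
# distance to the cluster centroid, relative to the farthest cluster member.
# Assumption: 1 means "at the centroid", 0 means "farthest point in the cluster".
centroid_scores <- function(xy, cluster) {
  scores <- rep(NA_real_, nrow(xy))
  for (cl in unique(cluster[!is.na(cluster)])) {
    idx <- which(cluster == cl)
    members <- xy[idx, , drop = FALSE]
    centroid <- colMeans(members)
    # Euclidean distance from each member to the centroid
    d <- sqrt(rowSums(sweep(members, 2, centroid)^2))
    scores[idx] <- if (max(d) > 0) 1 - d / max(d) else 1
  }
  scores
}

# Toy SV-like points: two clusters and one non-clustered point (NA)
xy <- cbind(S = c(0, 1, 2, 10, 11, 12, 30),
            V = c(0, 0, 0, 5, 5, 5, 9))
cluster <- c(1, 1, 1, 2, 2, 2, NA)

scores <- centroid_scores(xy, cluster)
print(scores)
```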
The resolution value is a more accessible way to define the cluster separation threshold used in the density and hierarchical clustering methods. Density clustering uses an epsilon value that represents the minimum distance of separation, whereas hierarchical clusters are defined by cutting the hierarchy tree wherever there is a minimum distance between two hierarchies. In this sense, resolution translates a value between 0 and 1 to a proportional value for epsilon or hierarchical height (depending on the cluster.method) that would result in the fewest possible clusters for 0 and the most for 1. Nevertheless, if epsilon is specified as eps in the optional arguments, its value is used and resolution is ignored.
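The effect of resolution on hierarchical clustering can be sketched with base R alone. The helper `resolution_to_height()` below is hypothetical and assumes a simple linear mapping between the largest and smallest merge heights of the tree; the mapping actually used by geva may differ. The point is only to show that 0 yields the fewest clusters and 1 the most.

```r
# Hypothetical helper, not part of geva: maps a resolution in [0, 1] to a
# tree-cut height. Assumption: linear interpolation between the largest
# merge height (resolution 0) and the smallest one (resolution 1).
resolution_to_height <- function(hc, resolution) {
  stopifnot(resolution >= 0, resolution <= 1)
  h_range <- range(hc$height)
  h_range[2] - resolution * (h_range[2] - h_range[1])
}

# Two well-separated groups of random 2D points
set.seed(1)
pts <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 5), ncol = 2))
hc <- hclust(dist(pts, method = "euclidean"), method = "average")

# Number of clusters obtained at increasing resolutions
resolutions <- c(0, 0.3, 1)
n_clusters <- sapply(resolutions, function(r) {
  length(unique(cutree(hc, h = resolution_to_height(hc, r))))
})
print(data.frame(resolution = resolutions, clusters = n_clusters))
```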
This function produces a GEVAGroupSet-derived object, particularly a GEVACluster for the "hierarchical" and "density" cluster methods or a GEVAQuantiles for the "quantiles" method.
However, if grouped.return is TRUE and sv is a GEVASummary object, the produced GEVAGroupSet data will be concatenated to the input and returned as a GEVAGroupedSummary object.
Other geva.cluster: geva.dcluster(), geva.hcluster(), geva.quantiles()
## Cluster analysis from a randomly generated input
library(geva)

# Preparing the data
ginput <- geva.ideal.example()      # Generates a random input example
gsummary <- geva.summarize(ginput)  # Summarizes with the default parameters

# Hierarchical clustering
gclust <- geva.cluster(gsummary, cluster.method="hierarchical")
plot(gclust)

# Density clustering
gclust <- geva.cluster(gsummary, cluster.method="density")
plot(gclust)

# Density clustering with slightly more resolution
gclust <- geva.cluster(gsummary,
                       cluster.method="density",
                       resolution=0.35)
plot(gclust)