geva.dcluster: GEVA Density Clustering
In sbcblab/geva: Gene Expression Variation Analysis (GEVA)


Performs a density cluster analysis from summarized data.

geva.dcluster(
  sv,
  resolution = 0.3,
  dcluster.method = options.dcluster.method,
  cl.score.method = options.cl.score.method,
  minpts = 2,
  ...,
  eps = NA_real_,
  include.raw.results = FALSE
)

options.dcluster.method
# c("dbscan", "optics")

`sv`	a `numeric` `SVTable` object (usually `GEVASummary`)
`resolution`	`numeric` (`0` to `1`), used as a "zoom" parameter for cluster detection. A value of `0` returns the minimum number of clusters that can be detected, while `1` returns the maximum number of detectable clusters. Ignored if `eps` is specified
`dcluster.method`	`character`, density-based method for cluster separation
`cl.score.method`	`character`, method used to calculate the cluster scores for each point. If `"auto"`, the `"density"` method is selected
`minpts`	`integer`, minimum number of points required to form a cluster
`...`	additional arguments. Accepts `verbose` (`logical`, default is `TRUE`) to enable or disable printing the current progress
`eps`	`numeric`, maximum neighborhood distance between points to be clustered
`include.raw.results`	`logical`, whether to attach intermediate results to the returned object

This function performs a density cluster analysis with the aid of methods implemented in the dbscan package. The available methods for the dcluster.method argument are "dbscan" and "optics", which internally call dbscan::dbscan() and dbscan::optics(), respectively.
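As a point of reference, the two underlying dbscan calls can be tried directly on a small matrix. This is only a sketch of the dbscan package API on made-up points (the `pts` matrix and the `eps` values are assumptions chosen for illustration); it does not reproduce how geva.dcluster() prepares its input or picks its parameters.

```r
library(dbscan)

# Two well-separated synthetic blobs of 20 points each (illustrative data only)
set.seed(1)
pts <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.1), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.1), ncol = 2)
)

# DBSCAN: cluster 0 in the result marks noise (unclustered) points
db <- dbscan(pts, eps = 0.5, minPts = 2)

# OPTICS: compute the ordering once, then cut it at a chosen eps
op <- optics(pts, eps = 10, minPts = 2)
op_cut <- extractDBSCAN(op, eps_cl = 0.5)

table(db$cluster)
table(op_cut$cluster)
```

With OPTICS, the ordering is computed once and can be cut at several eps values without re-running the neighbor search.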

The resolution value is an accessible way to define the cluster separation threshold used in density clustering. The DBSCAN algorithm uses an epsilon value that represents the minimum distance of separation, and resolution translates a value between 0 and 1 to a proportional value within the acceptable range of epsilon values. A resolution of 0 therefore yields the fewest possible clusters, and a resolution of 1 yields the most. However, if epsilon is specified as eps in the optional arguments, its value is used and resolution is ignored.
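The idea of translating resolution into epsilon can be pictured with a simple linear interpolation. The helper below is hypothetical and is not the geva implementation: the function name, the `eps_min` and `eps_max` bounds, and the linear form are all assumptions. It also assumes that a larger epsilon merges more points and so gives fewer clusters, which is why resolution 0 maps to the upper bound.

```r
# Hypothetical helper, NOT part of geva: maps a resolution in [0, 1]
# onto an assumed range of usable epsilon values by linear interpolation.
resolution_to_eps <- function(resolution, eps_min, eps_max) {
  stopifnot(
    is.numeric(resolution), resolution >= 0, resolution <= 1,
    eps_min > 0, eps_max >= eps_min
  )
  # resolution = 0 -> eps_max (widest neighborhood, fewest clusters)
  # resolution = 1 -> eps_min (narrowest neighborhood, most clusters)
  eps_max - resolution * (eps_max - eps_min)
}

eps_low_res  <- resolution_to_eps(0,   eps_min = 0.1, eps_max = 2)
eps_default  <- resolution_to_eps(0.3, eps_min = 0.1, eps_max = 2)
eps_high_res <- resolution_to_eps(1,   eps_min = 0.1, eps_max = 2)
```

Under these assumed bounds, the default resolution of 0.3 lands closer to the wide end of the range, which favors fewer, larger clusters.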

The cl.score.method argument defines how scores are calculated for each SV point (row in sv) that was assigned to a cluster (i.e., excluding non-clustered points). If specified as "auto", the method based on the rate of neighbor points ("density") is selected.

If include.raw.results is TRUE, some additional data will be attached to the info slot of the returned GEVACluster object, including the kNN tree generated during the intermediate steps.

A GEVACluster object

In density clustering, only the most dense points are clustered. For the unclustered points, the grouping value is set to NA.
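Because unclustered points carry NA, downstream code usually needs to count or filter them explicitly. The sketch below uses a hand-written factor as a stand-in for the grouping values; the `grouping` vector is hypothetical and is not extracted from a real GEVACluster object.

```r
# Hypothetical grouping values: NA marks points left out of every cluster
grouping <- factor(c("1", "1", NA, "2", "2", "2", NA, "1"))

# Count cluster sizes, including the unclustered points
table(grouping, useNA = "ifany")

# Logical mask of the points that were assigned to a cluster
clustered <- !is.na(grouping)
n_clustered <- sum(clustered)

# split() drops NA groups, so only clustered indices are returned
members <- split(seq_along(grouping), grouping)
```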

Other geva.cluster: geva.cluster(), geva.hcluster(), geva.quantiles()

## Density clustering from a randomly generated input 

# Preparing the data
ginput <- geva.ideal.example()      # Generates a random input example
gsummary <- geva.summarize(ginput)  # Summarizes with the default parameters

# Density clustering
gclust <- geva.dcluster(gsummary)
plot(gclust)

# Density clustering with slightly more resolution
gclust <- geva.dcluster(gsummary, resolution=0.35)
plot(gclust)