After a pre-merging step that ensures every cluster reaches a minimum size, a semiparametric Bayesian density is estimated using a Dirichlet process mixture of normals. This estimate is used both to compute Bayesian posterior probabilities of mis-classification (correct classification rates) and to estimate probability contours, which can be visualized on the MDS map.
The contour2dDP and plotContour functions can be used to compute Bayesian density estimates for a given set of elements (points) from a pre-generated 2D MDS object. These functions are used internally by clusGPS to draw cluster contours, but they are also useful for visualizing other types of contours over the map (e.g. genes from a given Gene Ontology term, genes having a specific epigenetic mark of interest, etc.).
The S4 accessors clusNames, tabClusters and clusterID retrieve information stored within a clusGPS object.
clusGPS(d, m, h, sel=NULL, id=NULL, grid, ngrid=1000, densgrid=FALSE,
  preMerge=TRUE, type = "hclust", method = "average", samplesize = 1,
  p.adjust = TRUE, k, mc.cores = 1, set.seed = 149, verbose=TRUE,
  minpoints=70, ...)

contour2dDP(x, ngrid, grid = NULL, probContour = 0.5, xlim, ylim,
  labels = "", labcex = 0.01, col = colors()[393], lwd = 4,
  lty = 1, contour.type = "single", contour.fill = FALSE,
  minpoints=100, ...)

clusNames(clus)
tabClusters(clus,name)
clusterID(clus,name)
d | Object of class
m | (Optional). Object of class
h | (Optional). Object of class
sel | (Optional). Logical vector indicating which elements from
id | (Optional). Label of the cluster which we want to further subdivide, ignoring points from all other clusters. Deprecated, use parameter
grid | Matrix of dimension ngrid*nvar giving the diagonal points of the grid where the density estimate is evaluated. The default value is NULL: grid dimensions are chosen according to the range of the data, and granularity is determined automatically from the data density, in order to provide a more accurate estimate in high density areas, where more resolution is needed.
ngrid | Number of grid points where the density estimate is evaluated. This argument is ignored if a grid is specified. The default value is 1000. Higher values are recommended if the data contain very high density areas.
densgrid | Set to TRUE to generate grid points from the quantile distribution of the data, using the grid size defined by ngrid. This is useful if the data contain areas of very different density, ranging from very sparse to extremely dense: grid granularity is increased where necessary, which improves the resolution of the density estimate and reduces computation time.
preMerge | If TRUE, a first pre-merging step is performed so that any cluster smaller than
type | Type of clustering to be performed. Currently only "hclust" (Agglomerative Nesting) is supported, but any other clustering type can be used by providing a pre-calculated object
method | Clustering method. See
samplesize | Proportion of elements to sample for computing clustering and density estimation. This is useful to generate density contours from a subset of the data, speeding up computation.
p.adjust | Set to TRUE to adjust the Bayesian posterior probabilities of mis-classification.
k | Integer vector indicating the number of clusters on which density estimation will be computed for mis-classification or contour calculation.
mc.cores | Number of cores to be used for parallel computation with the
set.seed | If samplesize<1, random seed to be used to perform random sampling of the data.
verbose | Set to TRUE to output clustering process information.
minpoints | If preMerge is FALSE, then the algorithm will ignore clusters with fewer than
x | Numeric matrix indicating coordinates of the points for which a probability contour is calculated in contour2dDP.
probContour | For contour2dDP, probability enclosed by the contour to be drawn (see contour.type). The default value is 0.5.
contour.type | For contour2dDP, type of contour, either 'single' (surrounding the points within the given probContour probability) or 'multiple' to generate terrain-like density contour lines.
contour.fill | Deprecated.
xlim,ylim,labels,labcex,col,lwd,lty | Graphical parameters given to contour2dDP.
clus | A valid
name | Character indicating a valid name within a
... | Additional parameters.
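The difference between an evenly spaced grid and the quantile-based grid described for densgrid can be illustrated with base R alone. The following is a minimal sketch, not the code used inside clusGPS: the simulated points, the helper in_core and the variable names grid_q and grid_u are all hypothetical, and the exact way the package builds its grid may differ.

```r
# Hypothetical stand-in for 2D MDS coordinates (not chroGPS data):
# a very dense core plus a sparse background.
set.seed(149)
pts <- rbind(
  matrix(rnorm(1800, mean = 0, sd = 0.1), ncol = 2),  # 900 dense points
  matrix(runif(200, min = -3, max = 3), ncol = 2)     # 100 sparse points
)

ngrid <- 50
probs <- seq(0, 1, length.out = ngrid)

# Quantile-based grid (assumed analogue of densgrid=TRUE): an ngrid x 2
# matrix whose rows crowd together where the data are dense.
grid_q <- cbind(unname(quantile(pts[, 1], probs)),
                unname(quantile(pts[, 2], probs)))

# Evenly spaced grid over the range of the data, for comparison.
grid_u <- cbind(seq(min(pts[, 1]), max(pts[, 1]), length.out = ngrid),
                seq(min(pts[, 2]), max(pts[, 2]), length.out = ngrid))

# Count grid points along the first dimension that fall in the dense core.
in_core <- function(g) sum(abs(g[, 1]) < 0.3)
c(quantile_grid = in_core(grid_q), uniform_grid = in_core(grid_u))
```

With the same ngrid, the quantile-based grid places far more of its points inside the dense core, which is the resolution gain the densgrid description refers to.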
The function clusGPS returns an object of class clusGPS. See help for clusGPS-methods for details. contour2dDP returns a DPdensity object with density contour information, which can be plotted as 2D contours with our plotContour function, as well as with the plot function from the DPpackage package.
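To make the idea of a contour that encloses a given probability more concrete, here is a rough sketch of how such a level can be derived from any 2D density evaluated on a regular grid. It deliberately swaps in a kernel density estimate (kde2d from the MASS package, which ships with standard R installations) for the Dirichlet process mixture, so it does not reproduce contour2dDP. The simulated points and the helper hdr_level are hypothetical.

```r
library(MASS)  # provides kde2d

# Hypothetical stand-in for the 2D coordinates of one set of points
set.seed(149)
pts <- cbind(rnorm(500), rnorm(500))

# Kernel density estimate on a regular 100 x 100 grid (used here in place
# of the Dirichlet process mixture of normals)
dens <- kde2d(pts[, 1], pts[, 2], n = 100)

# Density level whose upper region holds `prob` of the estimated mass.
# All grid cells have the same area, so cell densities can be summed directly.
hdr_level <- function(z, prob) {
  z_sorted <- sort(as.vector(z), decreasing = TRUE)
  cum_mass <- cumsum(z_sorted) / sum(z_sorted)
  z_sorted[which(cum_mass >= prob)[1]]
}

level <- hdr_level(dens$z, prob = 0.5)

# Draw the points and the single contour enclosing about half of the mass
plot(pts, pch = 20, col = "grey60", xlab = "Dim 1", ylab = "Dim 2")
contour(dens$x, dens$y, dens$z, levels = level, drawlabels = FALSE,
        lwd = 4, add = TRUE)
```

Passing several levels to contour instead of one gives terrain-like lines, similar in spirit to the 'multiple' contour type.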
Hierarchical clustering is performed for the elements whose pairwise distances are given in d. For each cluster partition given in k, cluster identity for each element is returned, and semiparametric Bayesian density estimation is computed using the point density information from m.
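The clustering step described above can be sketched with base R on simulated data. This is only an illustration under stated assumptions: it uses stats::hclust and cutree on a plain dist object rather than on chroGPS objects, and the point cloud and the names membership and sizes are hypothetical. The density estimation step is not covered.

```r
# Hypothetical 2D points forming three well separated groups
set.seed(149)
pts <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 6, sd = 0.3), ncol = 2)
)

# Average-linkage hierarchical clustering on pairwise distances
h <- hclust(dist(pts), method = "average")

# One partition per requested number of clusters; cutree returns a matrix
# with one column per value of k, named after that value
k <- c(2, 3, 5)
membership <- cutree(h, k = k)

# Cluster sizes for each partition
sizes <- lapply(k, function(kk) table(membership[, as.character(kk)]))
names(sizes) <- k
sizes
```

Inspecting the sizes per partition is a quick way to spot clusters that would be too small for density estimation, which is the situation the preMerge and minpoints arguments address.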
signature(m = "clusGPS"): S4 plot method for clusGPS objects.
signature(m = "clusGPS"): Retrieves the names of the clustering configurations stored in clusGPS objects, one for each distance threshold indicated in k; the configurations are named automatically.
signature(m = "clusGPS"): Returns a table with the number of elements in each of the clusters found for an existing clustering configuration with name name within the clusGPS object.
signature(m = "clusGPS"): Returns a vector of cluster assignments for all the elements in an existing clustering configuration name within the clusGPS object.
Oscar Reina
# Not run
# data(s2)
# # Computing distances
# d <- distGPS(s2.tab,metric='tanimoto',uniqueRows=TRUE)
# # Creating MDS object
# mds1 <- mds(d,type='isoMDS')
# mds1
# plot(mds1)
# # Precomputing clustering
# h <- hclust(as.dist(d@d),method='average')
# # Calculating densities (contours and probabilities), takes a while
# clus <- clusGPS(d,mds1,preMerge=TRUE,k=max(cutree(h,h=0.5)))
# # clus contains information for contours and probabilities
# plot(clus,type='contours',k=125,lwd=3,probContour=.75)
# plot(clus,type='stats',k=125,ylim=c(0,1))
# plot(clus,type='avgstat')
# plot(clus,type='density',k=3,ask=TRUE,xlim=range(mds1@points),ylim=range(mds1@points))