View source: R/cluster_spectra.R
cluster_spectra | R Documentation |
Cluster peaks by spectral similarity.
cluster_spectra(
  peak_table,
  peak_no = NULL,
  alpha = 0.05,
  min_size = 5,
  max_size = NULL,
  nboot = 1000,
  plot_dend = TRUE,
  plot_spectra = TRUE,
  verbose = getOption("verbose"),
  save = FALSE,
  parallel = TRUE,
  max.only = FALSE,
  output = c("pvclust", "clusters"),
  ...
)
peak_table | Peak table from |
peak_no | Minimum and maximum thresholds for the number of peaks a cluster may have. This argument is deprecated in favor of |
alpha | Confidence threshold for inclusion of cluster. |
min_size | Minimum number of peaks a cluster may have. |
max_size | Maximum number of peaks a cluster may have. |
nboot | Number of bootstrap replicates for |
plot_dend | Logical. If TRUE, plots dendrogram with bootstrap values. |
plot_spectra | Logical. If TRUE, plots overlapping spectra for each cluster. |
verbose | Logical. If TRUE, prints progress report to console. |
save | Logical. If TRUE, saves pvclust object to current directory. |
parallel | Logical. If TRUE, use parallel processing for |
max.only | Logical. If TRUE, returns only highest level for nested dendrograms. |
output | What to return. Either |
... | Additional arguments to |
Function to cluster peaks by spectral similarity. Before using this function, reference spectra must be attached to the peak_table using the attach_ref_spectra function. These reference spectra are then used to construct a distance matrix based on spectral similarity (Pearson correlation) between peaks. Hierarchical clustering with bootstrap resampling is performed on the resulting correlation matrix to classify peaks by spectral similarity, as implemented in pvclust. Finally, bootstrap values can be used to select clusters that exceed a certain confidence threshold as defined by alpha.
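The idea of a correlation-based distance followed by hierarchical clustering can be sketched in base R. This is only an illustration on synthetic spectra, not the package's implementation: the spectra, the peak names, and the choice of average linkage are all assumptions, and the bootstrap resampling that pvclust adds is omitted.

```r
# Synthetic "reference spectra": rows are wavelengths, columns are peaks.
# All names and values below are invented for illustration.
set.seed(1)
wavelengths <- seq(200, 400, by = 2)
gauss <- function(center, width) {
  exp(-((wavelengths - center)^2) / (2 * width^2))
}
noise <- function() rnorm(length(wavelengths), sd = 0.01)

spectra <- cbind(
  A1 = gauss(260, 15) + noise(),
  A2 = gauss(262, 15) + noise(),
  B1 = gauss(330, 20) + noise(),
  B2 = gauss(328, 20) + noise()
)

# Pearson correlation between peaks (columns), converted to a distance.
cor_mat <- cor(spectra, method = "pearson")
d <- as.dist(1 - cor_mat)

# Plain hierarchical clustering (average linkage is an assumption here);
# pvclust would additionally bootstrap this step to attach p-values.
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)
print(groups)
```

Peaks with near-identical spectra end up in the same group, which is the behavior the function relies on when classifying peaks.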
Clusters can be filtered by the minimum and maximum size of the cluster using the min_size and max_size arguments respectively. If max.only is TRUE, only the largest cluster in a nested tree of clusters meeting the specified confidence threshold will be returned.
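The filtering described above can be sketched on plain character vectors. This is a guess at the logic based on the description, not the package's code: the cluster contents, the max_size value, and the strict-subset test used to mimic max.only = TRUE are all assumptions.

```r
# Hypothetical clusters, each a character vector of peak names.
clusters <- list(
  c1 = c("p1", "p2", "p3", "p4", "p5", "p6"),
  c2 = c("p1", "p2", "p3", "p4", "p5"),      # nested inside c1
  c3 = c("p7", "p8"),                        # below min_size
  c4 = c("p9", "p10", "p11", "p12", "p13")
)

min_size <- 5
max_size <- 10

# Keep clusters whose size falls within [min_size, max_size].
sizes <- lengths(clusters)
kept <- clusters[sizes >= min_size & sizes <= max_size]

# Mimic max.only = TRUE: drop any cluster that is a strict subset of
# another kept cluster, so only the largest of each nested set remains.
is_nested <- vapply(seq_along(kept), function(i) {
  any(vapply(seq_along(kept)[-i], function(j) {
    all(kept[[i]] %in% kept[[j]]) && length(kept[[i]]) < length(kept[[j]])
  }, logical(1)))
}, logical(1))
top_level <- kept[!is_nested]

print(names(kept))
print(names(top_level))
```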
Returns clusters and/or pvclust object according to the value of the output argument.
If output = clusters, returns a list of S4 cluster objects.
If output = pvclust, returns a pvclust object.
If output = both, returns a nested list containing [[1]] the pvclust object, and [[2]] the list of S4 cluster objects.
The cluster objects consist of the following components:
peaks: a character vector containing the names of all peaks contained in the given cluster.
pval: a numeric vector of length 1 containing the bootstrap p-value (au) for the given cluster.
Users should be aware that the clustering algorithm will often return nested clusters. Thus, an individual peak could appear in more than one cluster.
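One way to check for peaks that belong to more than one cluster is to tabulate membership across the returned list. The sketch below uses a stand-in S4 class, since it does not call the package: the class name demo_cluster is hypothetical, and it assumes the peaks and pval components are stored as slots accessible with @.

```r
library(methods)

# Stand-in S4 class with the two documented components.
# The class name "demo_cluster" is hypothetical.
setClass("demo_cluster",
         representation(peaks = "character", pval = "numeric"))

demo_clusters <- list(
  new("demo_cluster", peaks = c("p1", "p2", "p3"), pval = 0.99),
  new("demo_cluster", peaks = c("p1", "p2"), pval = 0.97),  # nested in the first
  new("demo_cluster", peaks = c("p4", "p5"), pval = 0.98)
)

# Count how many clusters each peak belongs to.
membership <- table(unlist(lapply(demo_clusters, function(x) x@peaks)))
shared <- names(membership)[membership > 1]
print(shared)

# Collect the bootstrap p-value of each cluster.
pvals <- vapply(demo_clusters, function(x) x@pval, numeric(1))
print(pvals)
```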
Use more than 100 bootstrap replicates when running the clustering algorithm on real data. The example uses nboot = 100 only to reduce runtime. The authors of pvclust suggest nboot = 10000.
Ethan Bass
R. Suzuki & H. Shimodaira. 2006. Pvclust: an R package for assessing the uncertainty in hierarchical clustering. Bioinformatics, 22(12):1540-1542. doi:10.1093/bioinformatics/btl117.
data(pk_tab)
data(Sa_warp)
pk_tab <- attach_ref_spectra(pk_tab, Sa_warp, ref = "max.int")
cl <- cluster_spectra(pk_tab, nboot = 100, max.only = FALSE,
                      save = FALSE, alpha = 0.03)