cluster_spectra: Cluster peaks by spectral similarity.
In chromatographR: Chromatographic Data Analysis Toolset

cluster_spectra

R Documentation

Cluster peaks by spectral similarity.

Description

Function to cluster peaks by spectral similarity. A representative spectrum is selected for each peak in the provided peak table and used to construct a distance matrix based on spectral similarity (Pearson correlation) between peaks. Hierarchical clustering with bootstrap resampling is performed on the resulting correlation matrix to classify peaks by their spectral similarity.

Usage

cluster_spectra(
  peak_table,
  chrom_list,
  peak_no = c(5, 100),
  alpha = 0.95,
  nboot = 1000,
  plot_dend = TRUE,
  plot_spectra = TRUE,
  verbose = TRUE,
  save = TRUE,
  parallel = TRUE,
  max.only = FALSE,
  output = c("clusters", "pvclust", "both"),
  ...
)

Arguments

`peak_table`	Peak table from `get_peaktable`.
`chrom_list`	A list of chromatograms in matrix form (timepoints x wavelengths).
`peak_no`	Minimum and maximum thresholds for the number of peaks a cluster may have.
`alpha`	Confidence threshold for inclusion of cluster.
`nboot`	Number of bootstrap replicates for `pvclust`.
`plot_dend`	Logical. If TRUE, plots dendrogram with bootstrap values.
`plot_spectra`	Logical. If TRUE, plots overlapping spectra for each cluster.
`verbose`	Logical. If TRUE, prints progress report to console.
`save`	Logical. If TRUE, saves pvclust object to current directory.
`parallel`	Logical. If TRUE, use parallel processing for `pvclust`.
`max.only`	Logical. If TRUE, returns only highest level for nested dendrograms.
`output`	What to return. Either `clusters` to return list of clusters, `pvclust` to return pvclust object, or `both` to return both items.
`...`	Additional arguments to `pvclust`.

Details

A representative spectrum is selected for each peak in the provided peak table and used to construct a distance matrix based on spectral similarity (Pearson correlation) between peaks. It is suggested to attach representative spectra to the peak table using attach_ref_spectra. Otherwise, the representative spectrum for each peak is taken from the chromatogram with the highest absorbance at lambda max.
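The idea of a correlation-based distance between spectra can be illustrated in base R. This is a minimal sketch on made-up data, not the package's internal code: the `spectra` matrix, its values, and the `1 - r` conversion to a dissimilarity are assumptions chosen for illustration.

```r
# Toy spectra (hypothetical values): rows are wavelengths, columns are peaks.
wavelengths <- seq(200, 400, by = 50)
spectra <- cbind(
  peak_A = c(1.0, 0.8, 0.4, 0.2, 0.1),
  peak_B = c(2.0, 1.6, 0.8, 0.4, 0.2),  # same shape as peak_A, twice the intensity
  peak_C = c(0.1, 0.2, 0.4, 0.8, 1.0)   # peak_A reversed along the wavelength axis
)
rownames(spectra) <- wavelengths

# Pearson correlation between every pair of columns (peaks).
similarity <- cor(spectra, method = "pearson")

# One common way to turn a correlation into a dissimilarity: 1 - r.
spectral_dist <- as.dist(1 - similarity)

print(round(similarity, 3))
print(round(spectral_dist, 3))
```

Because Pearson correlation ignores overall scale, peak_A and peak_B are treated as identical in shape even though their intensities differ, while peak_C is strongly dissimilar to both.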

Hierarchical clustering with bootstrap resampling is performed on the resulting correlation matrix, as implemented in pvclust. Finally, bootstrap values can be used to select clusters that exceed a certain confidence threshold as defined by alpha. Clusters can also be filtered by the minimum and maximum size of the cluster using the argument peak_no. If max.only is TRUE, only the largest cluster in a nested dendrogram of clusters meeting the confidence threshold will be returned.
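The selection logic described here (confidence threshold, size limits, and dropping nested clusters) can be sketched in base R on a hand-written list of candidate clusters. Everything below is hypothetical: the `candidates` list, the helper names `filter_clusters` and `drop_nested`, the inclusive comparisons, and the order in which the filters are applied are assumptions for illustration, not the package's implementation.

```r
# Hypothetical candidate clusters: peak names plus a bootstrap confidence value.
candidates <- list(
  list(peaks = c("V1", "V2"), pval = 0.99),
  list(peaks = c("V1", "V2", "V3", "V4", "V5", "V6"), pval = 0.97),
  list(peaks = c("V7", "V8", "V9", "V10", "V11"), pval = 0.80),
  list(peaks = c("V12", "V13", "V14", "V15", "V16"), pval = 0.96)
)

alpha <- 0.95          # confidence threshold
peak_no <- c(5, 100)   # minimum and maximum cluster size

# Keep clusters that meet the confidence threshold and the size limits
# (inclusive bounds are an assumption of this sketch).
filter_clusters <- function(clusters, alpha, peak_no) {
  keep <- vapply(clusters, function(cl) {
    n <- length(cl$peaks)
    cl$pval >= alpha && n >= peak_no[1] && n <= peak_no[2]
  }, logical(1))
  clusters[keep]
}

# Drop any cluster whose peaks are a strict subset of another cluster,
# mimicking the idea behind max.only = TRUE.
drop_nested <- function(clusters) {
  nested <- vapply(seq_along(clusters), function(i) {
    any(vapply(seq_along(clusters), function(j) {
      i != j &&
        length(clusters[[i]]$peaks) < length(clusters[[j]]$peaks) &&
        all(clusters[[i]]$peaks %in% clusters[[j]]$peaks)
    }, logical(1)))
  }, logical(1))
  clusters[!nested]
}

selected <- filter_clusters(candidates, alpha, peak_no)
confident <- Filter(function(cl) cl$pval >= alpha, candidates)
outermost <- drop_nested(confident)

print(lengths(lapply(selected, `[[`, "peaks")))
print(lengths(lapply(outermost, `[[`, "peaks")))
```

In this toy input, the two-peak cluster is removed by the size filter, the cluster with a value of 0.80 is removed by the confidence threshold, and the two-peak cluster is also the one dropped as nested because its peaks all appear in the six-peak cluster.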

Value

Returns clusters and/or pvclust object according to the value of the output argument.

If output = "clusters", returns a list of S4 cluster objects.
If output = "pvclust", returns a pvclust object.
If output = "both", returns a nested list containing [[1]] the pvclust object, and [[2]] the list of S4 cluster objects.

The cluster objects consist of the following components:

peaks: a character vector containing the names of all peaks contained in the given cluster.
pval: a numeric vector of length 1 containing the bootstrap p-value (au) for the given cluster.
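Since the cluster objects are S4, their components are read with `@` rather than `$`. The sketch below uses a stand-in class, `toy_cluster`, that is defined here only to carry the two documented components; the class name and the peak names are hypothetical, and the class actually returned by cluster_spectra may be named differently.

```r
library(methods)

# Hypothetical stand-in class with the two documented slots.
setClass("toy_cluster", representation(peaks = "character", pval = "numeric"))

clusters <- list(
  new("toy_cluster", peaks = c("V10", "V11", "V12"), pval = 0.98),
  new("toy_cluster", peaks = c("V20", "V21"), pval = 0.96)
)

# S4 slots are accessed with `@` (or with slot()).
first_peaks <- clusters[[1]]@peaks
pvals <- vapply(clusters, function(cl) cl@pval, numeric(1))
sizes <- vapply(clusters, function(cl) length(cl@peaks), integer(1))

# Summarize the list of clusters as a small data frame.
summary_df <- data.frame(
  cluster = seq_along(clusters),
  n_peaks = sizes,
  pval = pvals
)
print(summary_df)
```

The same access pattern should apply to a list of cluster objects returned with output = "clusters", assuming the slots are named as documented above.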

Note

Users should be aware that the clustering algorithm will often return nested clusters. Thus, an individual peak could appear in more than one cluster.
The example below uses nboot = 100 only to reduce runtime. When running the clustering algorithm on real data, it is highly suggested to use more than 100 bootstraps. The authors of pvclust suggest nboot = 10000.
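To check whether nesting affects a given result, one option is to count how many clusters each peak belongs to. This is a minimal base R sketch on hypothetical peak names; with real output, the character vectors would first be extracted from the peaks component of each cluster object.

```r
# Hypothetical peak memberships; the first cluster is nested in the second.
cluster_peaks <- list(
  c("V1", "V2", "V3"),
  c("V1", "V2", "V3", "V4", "V5"),
  c("V8", "V9")
)

# Count how many clusters each peak appears in.
membership <- table(unlist(cluster_peaks))

# Peaks that appear in more than one cluster.
shared <- names(membership)[membership > 1]
print(shared)
```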

Author(s)

Ethan Bass

References

R. Suzuki & H. Shimodaira. 2006. Pvclust: an R package for assessing the uncertainty in hierarchical clustering. Bioinformatics, 22(12):1540-1542. doi: 10.1093/bioinformatics/btl117.

Examples


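# Load the example peak table and chromatograms included with chromatographR.
data(pk_tab)
data(Sa_warp)
# nboot = 100 keeps the example fast; see Note before using it on real data.
cl <- cluster_spectra(pk_tab, nboot = 100, max.only = FALSE, save = FALSE,
                      alpha = 0.97)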

chromatographR documentation built on Aug. 24, 2022, 9:06 a.m.
