knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(chromatographR) data("Sa_warp")
pks <- get_peaks(Sa_warp,lambdas=210) pktab <- get_peaktable(pks)
pktab <- attach_ref_spectra(pktab, ref = "max.cor")
After assigning reference spectra to our peak table, we can use pearson correlations to easily locate peaks falling within a certain similarity threshold to a reference spectrum. To do this, we must first construct a correlation matrix using the cor
function.
cor_mat <- cor(pktab$ref_spectra)
We can then plot spectra falling within a desired similarity threshold (t
) to a reference peak (in this case 'V21').
par(mfrow=c(2,2)) idx_90 <- which(cor_mat[,"V21"] > .9) matplot(pktab$ref_spectra[,idx_90],type='l', ylab="Abs") legend("top", "t = 0.90", bty = "n") idx_95 <- which(cor_mat[,"V21"] > .95) matplot(pktab$ref_spectra[,idx_95],type='l', ylab="Abs") legend("top", "t = 0.95", bty = "n") idx_97 <- which(cor_mat[,"V21"] > .97) matplot(pktab$ref_spectra[,idx_97],type='l', ylab="Abs") legend("top", "t = 0.97", bty = "n") idx_99 <- which(cor_mat[,"V21"] > .99) matplot(pktab$ref_spectra[,idx_99],type='l', ylab="Abs") legend("top", "t = 0.99", bty = "n")
We can also extract retention times for spectra above the desired spectral similarity threshold with the reference spectra.
rts_99 = pktab$pk_meta["rt", idx_99] rts_99
Below we plot traces of our four chromatograms at 210 nm. The r length(idx_99)
peaks that match the reference spectrum at the 99% level are marked with solid black lines and the r length(idx_97)
peaks that match the reference spectrum at the 97% level are marked with dotted gray lines.
par(mfrow = c(1, 1)) plot_chroms(Sa_warp, lambdas=210) abline(v = rts_99) abline(v = pktab$pk_meta["rt", idx_97], lty=2, col = "darkgray")
If we are interested in the combined area of these peaks, we could also use these indices to sum them together.
apply(pktab$tab[,idx_99], 1, sum)
chromatographR
also includes a function (cluster_spectra
) to group spectra using hierarchical clustering. We can set the seed with the iseed
argument to ensure repeatability among instantiations of this vignette. For simplicity, we use the argument max.only = TRUE
to return only the largest clusters.
par(mfrow=c(2,2)) cl <- cluster_spectra(pktab, alpha = 0.01, iseed = 1, max.only = TRUE)
The output is a dendrogram of peaks annotated with bootstrap values showing the support for each node and spectra of the clusters with p-values below the significance level specified by alpha
. In this case, we recovered two clusters, representing two different classes of compounds distinguished by their chromophores. The second cluster closely matches the spectrum represented by peak 'V21'
. We can access the peaks in cluster two as follows:
cl2 <- cl$clusters$c2@peaks cl2
In this case, the peaks in cluster two happen to overlap completely with the peaks recovered through pearson correlation at the 90% threshold.
all.equal(cl2, names(idx_90))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.