match_spec: Identify and filter spectra
In OpenSpecy: Analyze, Process, Identify, and Share Raman and (FT)IR Spectra

Description

match_spec() joins two OpenSpecy objects and their metadata based on similarity.
cor_spec() correlates two OpenSpecy objects, typically one with knowns and one with unknowns.
ident_spec() retrieves the top match values from a correlation matrix and formats them with metadata.
get_metadata() retrieves metadata from OpenSpecy objects.
max_cor_named() formats the top correlation values from a correlation matrix as a named vector.
filter_spec() filters an OpenSpecy object.
fill_spec() adds filler values to an OpenSpecy object where it has no intensities.
os_similarity() (EXPERIMENTAL) returns a single similarity metric between two OpenSpecy objects, based on the method used.

Usage

cor_spec(x, ...)

## Default S3 method:
cor_spec(x, ...)

## S3 method for class 'OpenSpecy'
cor_spec(x, library, na.rm = T, conform = F, type = "roll", ...)

match_spec(x, ...)

## Default S3 method:
match_spec(x, ...)

## S3 method for class 'OpenSpecy'
match_spec(
  x,
  library,
  na.rm = T,
  conform = F,
  type = "roll",
  top_n = NULL,
  order = NULL,
  add_library_metadata = NULL,
  add_object_metadata = NULL,
  fill = NULL,
  ...
)

ident_spec(
  cor_matrix,
  x,
  library,
  top_n = NULL,
  add_library_metadata = NULL,
  add_object_metadata = NULL,
  ...
)

get_metadata(x, ...)

## Default S3 method:
get_metadata(x, ...)

## S3 method for class 'OpenSpecy'
get_metadata(x, logic, rm_empty = TRUE, ...)

max_cor_named(cor_matrix, na.rm = T)

filter_spec(x, ...)

## Default S3 method:
filter_spec(x, ...)

## S3 method for class 'OpenSpecy'
filter_spec(x, logic, ...)

ai_classify(x, ...)

## Default S3 method:
ai_classify(x, ...)

## S3 method for class 'OpenSpecy'
ai_classify(x, library, fill = NULL, ...)

fill_spec(x, ...)

## Default S3 method:
fill_spec(x, ...)

## S3 method for class 'OpenSpecy'
fill_spec(x, fill, ...)

os_similarity(x, ...)

## Default S3 method:
os_similarity(x, ...)

## S3 method for class 'OpenSpecy'
os_similarity(x, y, method = "hamming", na.rm = T, ...)

Arguments

`x`	an `OpenSpecy` object, typically with unknowns.
`library`	an `OpenSpecy` or `glmnet` object representing the reference library of spectra or model to use in identification.
`na.rm`	logical; indicating whether missing values should be removed when calculating correlations. Default is `TRUE`.
`conform`	logical; whether to conform the spectra to the library wavenumbers.
`type`	the type of conformation to apply, as used by `conform_spec()`.
`top_n`	integer; specifying the number of top matches to return. If `NULL` (default), all matches will be returned.
`order`	an `OpenSpecy` used for sorting, ideally the unprocessed one; `NULL` skips sorting.
`add_library_metadata`	name of a column in the library metadata to be joined; `NULL` if you don't want to join.
`add_object_metadata`	name of a column in the object metadata to be joined; `NULL` if you don't want to join.
`fill`	an `OpenSpecy` object with a single spectrum to be used to fill missing values for alignment with the AI classification.
`cor_matrix`	a correlation matrix for object and library, as returned by `cor_spec()`.
`logic`	a logical or numeric vector describing which spectra to keep.
`rm_empty`	logical; whether to remove empty columns in the metadata.
`y`	an `OpenSpecy` object to perform the similarity search against `x`.
`method`	the type of similarity metric to return.
`...`	additional arguments passed to `cor()`.

Value

match_spec() and ident_spec() return a data.table containing correlations between spectra and the library. The table has three columns: object_id, library_id, and match_val. Each row represents a unique pairwise correlation between a spectrum in the object and a spectrum in the library. If top_n is specified, only the top_n best matches for each object spectrum are returned. If add_library_metadata is a character string, the library metadata is added to the output. If add_object_metadata is a character string, the object metadata is added to the output.

filter_spec() returns an OpenSpecy object. fill_spec() returns an OpenSpecy object. cor_spec() returns a correlation matrix. get_metadata() returns a data.table with the metadata for columns which have information.

os_similarity() returns a single numeric value representing the type of similarity metric requested. 'wavenumber' similarity is based on the proportion of wavenumber values that overlap between the two objects. 'metadata' is the proportion of metadata column names that overlap. 'hamming' is similar to the Hamming distance: all spectra in the OpenSpecy object are discretized by wavenumber intensity values, and the resulting intensity value distributions are compared by their mean difference in min-max normalized space. 'pca' measures the distance between the OpenSpecy objects in PCA space, using the first 4 components and calculating the max-range normalized distance between the mean components. The 'wavenumber' and 'metadata' metrics are straightforward and ready for use; the 'hamming' and 'pca' metrics are experimental but work under the current test cases.
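
The shape of the returned match table can be illustrated with a small base R sketch. It is not the OpenSpecy implementation: the toy spectra, the matrix orientation (library spectra in rows, unknowns in columns), and the use of a plain data.frame instead of a data.table are all assumptions made for illustration. It only shows how a correlation matrix flattens into object_id, library_id, and match_val, and what top_n = 1 keeps.

```r
# Conceptual sketch in base R only; NOT OpenSpecy's implementation.
# Toy data: columns are spectra, rows are shared wavenumbers (assumed values).
unknowns <- cbind(u1 = c(1, 2, 3, 4, 5),
                  u2 = c(5, 3, 4, 1, 2))
library_spectra <- cbind(ref_a = c(2, 4, 6, 8, 10),
                         ref_b = c(5, 4, 3, 2, 1),
                         ref_c = c(1, 3, 2, 5, 4))

# Pearson correlation of every library spectrum (rows) with every
# unknown (columns); missing values are dropped pairwise.
cor_matrix <- cor(library_spectra, unknowns, use = "pairwise.complete.obs")

# Flatten the matrix into one row per (object, library) pair.
# as.vector() reads the matrix column by column, so each unknown's
# correlations stay together.
matches <- data.frame(
  object_id  = rep(colnames(cor_matrix), each = nrow(cor_matrix)),
  library_id = rep(rownames(cor_matrix), times = ncol(cor_matrix)),
  match_val  = as.vector(cor_matrix)
)

# Keep the best top_n matches per unknown (here the analogue of top_n = 1).
top_n <- 1
matches <- matches[order(matches$object_id, -matches$match_val), ]
top_matches <- do.call(rbind,
                       lapply(split(matches, matches$object_id),
                              head, n = top_n))
top_matches
```

With these toy values, u1 is an exact multiple of ref_a, so ref_a is its top match; u2 matches ref_b best.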

Author(s)

Win Cowger, Zacharias Steinmetz

Examples

data("test_lib")

unknown <- read_extdata("ftir_ldpe_soil.asp") |>
  read_any() |>
  conform_spec(range = test_lib$wavenumber,
               res = spec_res(test_lib)) |>
  process_spec()
cor_spec(unknown, test_lib)

match_spec(unknown, test_lib, add_library_metadata = "sample_name",
           top_n = 1)
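
The remaining helpers can be chained from a single correlation matrix. The following is a sketch based only on the signatures listed under Usage; the choices top_n = 5 and logic = 1 are illustrative, and the output is not shown because it depends on the installed library version.

```r
library(OpenSpecy)

data("test_lib")

unknown <- read_extdata("ftir_ldpe_soil.asp") |>
  read_any() |>
  conform_spec(range = test_lib$wavenumber,
               res = spec_res(test_lib)) |>
  process_spec()

# Correlate once, then reuse the matrix for several summaries.
cor_matrix <- cor_spec(unknown, test_lib)

# Top five matches per unknown, with library sample names attached.
top_matches <- ident_spec(cor_matrix, unknown, library = test_lib,
                          top_n = 5,
                          add_library_metadata = "sample_name")
top_matches

# Highest correlation per unknown as a named vector.
max_cor_named(cor_matrix)

# Metadata for the first spectrum of the unknown object
# (logic = 1 is an illustrative choice).
get_metadata(unknown, logic = 1)
```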

OpenSpecy documentation built on June 8, 2025, 10:11 a.m.