In sds270-s22/ftirsr: Easily Load FTIR Spectroscopy Data for a Partial Least Squares Regression Model

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(ftirsr)
library(dplyr)
library(ggplot2)
library(tidyr)
library(pls)

The ftisr package provides easy access and user-friendly methods for researchers analyzing Biogenic Silica (BSi). (BSi) is used as a proxy for past temperatures in High Arctic settings. Typically, greater amounts of BSi in sediment cores indicate warmer temperatures. Recently paleoclimatologists have begun to use Fourier Transform Infrared (FTIR) spectroscopy to collect information on BSi. However, the offloaded relative absorbency data makes comparison with other proxies difficult. This project is intended to develop an R package to facilitate analysis of multivariate-length samples using a model and observations in collaboration with the Smith College PLSR SDS Capstone group.

Motivation

We draw motivation from this previous project. Although packages that contain methods for PLSR already exist in the R universe, none provide the context and specific methods necessary for the implementation of such methods on a calibrated model specifically based on predicting BSi percentages.

Installation

Get the development version from GitHub:

# If you haven't installed the remotes package yet, do so:
# install.packages("remotes")
remotes::install_github("sds270-s22/ftirsr")

# Load package
library(ftirsr)

Necessary Packages:

In order to install the ftirsr package, you will need the following R packages:

dplyr
magrittr
readr
purrr
janitor
fs
tidyr
tibble
pls
knitr
ggplot2

Data

Included in this package are two datasets. Both were sourced from the referenced Capstone Group.

alaska.csv: A dataset containing absorbance spectra for 103 Alaska lake sediment core samples across 1882 wavenumbers.
greenland.csv: A dataset containing absorbance spectra for 28 Greenland lake sediment core samples across 1882 wavenumbers.

The end-user format is tidy (long), with 4 variables:

sample_id: Core Sample ID referring to location and content of sample
bsi: Percentage of Biogenic Silica in Sample, determined by a wet chemical process
wavenumber: Wavenumber value (units = reciprocal centimeters)
absorbance: Absorbance levels

Although these datasets are presented in tidy-format, we leave the option available for a non-tidy format (wide) data set. Both of these datasets may be referred to as Wet Chemistry data.

Who should use this package?

Anyone who is interested in exploring geology via spectroscopy and BSi. The methods documented within this package are intended to ease the technical aspects of this field.

Objects:

ftirs: The ftirs class attribute allows users to designate objects within their code as ftirs objects. An ftirs object is a modified dataframe that is properly formatted such that the dataframe may be used in a PLSR model.

Methods:

read_ftirs: A method that generates a tidy data frame binding multiple FTIRS samples together. This function takes in a directory file path, reads in each file within the directory, and returns an ftirs object.
read_wet_chem: A method that reads and attaches Wet Chemistry data to the FTIRS object. This function is called in read_ftirs() via the optional wet_chem_path argument. This method takes in an optional filepath to singular Wet Chemistry Data file to be included in the FTIRS dataframe and the corresponding FTIRS dataframe to have the Wet Chemistry Data attached to. It outputs the modified dataframe.
pivot_longer.ftirs: A method that pivots a wide, non-tidy FTIRS dataframe to a long, tidy format. This method takes in an ftirs object that is a wide, non-tidy FTIRS dataframe.
pivot_wider.ftirs: A method that pivots the FTIRS dataframe to wide, non-tidy format, necessary for input into a PLSR model. This method takes a long-format ftirs object and returns a wide, non-tidy formatted ftirs object.
is.ftirs: A method that checks if an object is an ftirs object.
as.ftirs: A method that will attach the ftirs attribute to an object.
predict.ftirs: A method that will predict BSi content based on the contained ftirs model with inputted data.
interpolate_ftirs: A method to interpolate an fitrs dataset. This method takes in a series of vectors, wavenumber, absorbance, and interpolate_vec and returns a bound dataframe of these interpolated vectors.
arctic_mod: A function that returns the PLSR model used by predict.ftirs() to predict BSi percentages from testing data.

Functions:

read_ftirs_file: A function that generates a tibble from a single FTIRS sample. This function takes in a single file path as an input and returns an ftirs object. This is a function meant to primarily be used only by developers.
read_wet_chem: A function that reads and attaches Wet Chemistry data to the FTIRS object.

Examples:

How can we predict the amount of Biogenic Silica within organic samples given a series of data obtained from FTIR spectroscopy?

Step 1: Read the data

In this example, we are interpolating the samples onto the vector of rounded wavenumbers used in our model.

my_data <- read_ftirs(dir_path = "samples")

head(my_data)

But, we don't have to interpolate! The default is interpolate = TRUE, but we can set it to interpolate = FALSE.

# This shows how to read a directory without interpolation
# Note the difference in wavenumbers
my_data_no_interp <- read_ftirs(dir_path = "samples",
                      interpolate = FALSE)

head(my_data_no_interp)

We can include Wet Chemistry data in our ftirs dataframe. This is necessary to use this data to train a PLS model. (Note: to predict, we don't want to include BSi in our ftirs dataframe).

# All we have to do is add a path to the file containing the Wet Chemistry data
my_data_wet_chem <- read_ftirs(dir_path = "samples", 
                               wet_chem_path = "wet-chem-data.csv")

It is also possible to read a single sample file using read_ftirs_file().

one_sample <- read_ftirs_file(single_filepath = "samples/FISK-10.0.csv")

head(one_sample)

Step 2: Predict data using trained model

# Data must be in the ftirs wide format in order to use predict method
my_data_wide <- my_data %>% 
  pivot_wider()

head(my_data_wide[1:8])

# We could get this same result by calling format = "wide" while reading in the samples  
my_data_wide <- read_ftirs(dir_path = "samples",
                      format = "wide") 

head(my_data_wide[1:8])

# Data must have the ftirs class to call the correct predict.ftirs() method
is.ftirs(my_data_wide)

# Call predict
preds <- predict(my_data_wide)

Step 3: Obtain results

preds

# We can specify the number of components we want to see, as this is inherited from the predict.mvr method

# Here we are predicting the same values as above, but choosing only 4 components
preds <- predict(my_data_wide, ncomp = 4)
preds

# If we want to see the details of the training model, we can call arctic_mod() 
mod <- arctic_mod()
summary(mod)

ggplot(preds[1,], aes(x = FISK.10, y = bsi))

The functionality demonstrated above are just a few common examples to help a user get started using our package.

sds270-s22/ftirsr documentation built on June 24, 2022, 12:56 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

sds270-s22/ftirsr
Easily Load FTIR Spectroscopy Data for a Partial Least Squares Regression Model

In sds270-s22/ftirsr: Easily Load FTIR Spectroscopy Data for a Partial Least Squares Regression Model

Motivation

Installation

Get the development version from GitHub:

Necessary Packages:

Data

Who should use this package?

Objects:

Methods:

Functions:

Examples:

Step 1: Read the data

Step 2: Predict data using trained model

Step 3: Obtain results

R Package Documentation

Browse R Packages

We want your feedback!

sds270-s22/ftirsr Easily Load FTIR Spectroscopy Data for a Partial Least Squares Regression Model

In sds270-s22/ftirsr: Easily Load FTIR Spectroscopy Data for a Partial Least Squares Regression Model

Motivation

Installation

Get the development version from GitHub:

Necessary Packages:

Data

Who should use this package?

Objects:

Methods:

Functions:

Examples:

Step 1: Read the data

Step 2: Predict data using trained model

Step 3: Obtain results

R Package Documentation

Browse R Packages

We want your feedback!

sds270-s22/ftirsr
Easily Load FTIR Spectroscopy Data for a Partial Least Squares Regression Model