knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(ftirsr)
library(dplyr)
library(ggplot2)
library(tidyr)
library(pls)

The ftirsr package provides easy access and user-friendly methods for researchers analyzing Biogenic Silica (BSi). BSi is used as a proxy for past temperatures in High Arctic settings; typically, greater amounts of BSi in sediment cores indicate warmer temperatures. Recently, paleoclimatologists have begun to use Fourier Transform Infrared (FTIR) spectroscopy to collect information on BSi. However, the offloaded relative absorbance data makes comparison with other proxies difficult. This project develops an R package that facilitates the analysis of multivariate-length samples using a Partial Least Squares Regression (PLSR) model and observations, in collaboration with the Smith College PLSR SDS Capstone group.

Motivation

We draw motivation from this previous project. Although R packages that implement PLSR already exist, none provide the context and specific methods needed to apply a calibrated model built specifically to predict BSi percentages.

Installation

Get the development version from GitHub:

# If you haven't installed the remotes package yet, do so:
# install.packages("remotes")
remotes::install_github("sds270-s22/ftirsr")
# Load package
library(ftirsr)

Necessary Packages:

In order to install the ftirsr package, you will need the following R packages:

Data

Included in this package are two datasets. Both were sourced from the referenced Capstone Group.

The end-user format is tidy (long), with 4 variables:

Although these datasets are presented in tidy format, a non-tidy (wide) format is also available. Both datasets may be referred to as Wet Chemistry data.

Who should use this package?

Anyone who is interested in exploring geology via spectroscopy and BSi. The methods documented within this package are intended to ease the technical aspects of this field.

Objects:

Methods:

Functions:

Examples:

How can we predict the amount of Biogenic Silica within organic samples given a series of data obtained from FTIR spectroscopy?

Step 1: Read the data

In this example, we are interpolating the samples onto the vector of rounded wavenumbers used in our model.

my_data <- read_ftirs(dir_path = "samples")

head(my_data)

But, we don't have to interpolate! The default is interpolate = TRUE, but we can set it to interpolate = FALSE.

# This shows how to read a directory without interpolation
# Note the difference in wavenumbers
my_data_no_interp <- read_ftirs(dir_path = "samples",
                      interpolate = FALSE)

head(my_data_no_interp)
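
To make the interpolation idea concrete, here is a minimal sketch using base R's approx(). It is an illustration only and not the package's internal code: the column names, the four raw readings, and the target grid are all made up, and read_ftirs() may interpolate differently.

```r
# Hypothetical raw spectrum whose wavenumbers do not fall on whole numbers
raw <- data.frame(
  wavenumber = c(3996.3, 3994.4, 3992.5, 3990.6),
  absorbance = c(0.10, 0.12, 0.15, 0.11)
)

# Hypothetical target grid of rounded wavenumbers (stand-in for the model's grid)
target <- c(3996, 3994, 3992)

# Linear interpolation of absorbance onto the target wavenumbers
interp <- approx(x = raw$wavenumber, y = raw$absorbance, xout = target)

result <- data.frame(wavenumber = interp$x, absorbance = interp$y)
print(result)
```

Putting every sample on the same wavenumber grid is what lets spectra from different files line up column by column later on.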

We can include Wet Chemistry data in our ftirs data frame, which is necessary if we want to use the data to train a PLS model. (Note: when predicting, we do not want to include BSi in our ftirs data frame.)

# All we have to do is add a path to the file containing the Wet Chemistry data
my_data_wet_chem <- read_ftirs(dir_path = "samples", 
                               wet_chem_path = "wet-chem-data.csv")
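
Conceptually, adding Wet Chemistry data means attaching a measured BSi percentage to each sample's spectrum. The sketch below illustrates that idea with base R's merge(). The column names sample_id and bsi and all of the values are hypothetical, and this is not necessarily how read_ftirs() combines the two sources.

```r
# Hypothetical long-format spectra: two samples, two wavenumbers each
spectra <- data.frame(
  sample_id  = rep(c("A-1", "A-2"), each = 2),
  wavenumber = rep(c(3996, 3994), times = 2),
  absorbance = c(0.10, 0.12, 0.20, 0.22)
)

# Hypothetical wet chemistry table: one measured BSi percentage per sample
wet_chem <- data.frame(
  sample_id = c("A-1", "A-2"),
  bsi       = c(5.1, 12.3)
)

# Attach the BSi value to every row of the matching sample
combined <- merge(spectra, wet_chem, by = "sample_id")
print(combined)
```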

It is also possible to read a single sample file using read_ftirs_file().

one_sample <- read_ftirs_file(single_filepath = "samples/FISK-10.0.csv")

head(one_sample)

Step 2: Predict data using trained model

# Data must be in the ftirs wide format in order to use predict method
my_data_wide <- my_data %>% 
  pivot_wider()

head(my_data_wide[1:8])
# We could get this same result by calling format = "wide" while reading in the samples  
my_data_wide <- read_ftirs(dir_path = "samples",
                      format = "wide") 

head(my_data_wide[1:8])
# Data must have the ftirs class to call the correct predict.ftirs() method
is.ftirs(my_data_wide)
# Call predict
preds <- predict(my_data_wide)

Step 3: Obtain results

preds
# The ncomp argument is inherited from the predict.mvr method, so we can
# specify the number of components we want to use.
# Here we are predicting the same values as above, but choosing only 4 components
preds <- predict(my_data_wide, ncomp = 4)
preds
# If we want to see the details of the training model, we can call arctic_mod() 
mod <- arctic_mod()
summary(mod)
# A geom layer is required for ggplot to draw anything; geom_point() is one option
ggplot(preds[1,], aes(x = FISK.10, y = bsi)) +
  geom_point()
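
To see what the ncomp argument used in Step 3 does, here is a small sketch that fits a PLS model with the pls package on synthetic data and predicts with different numbers of components. The data, the variable names (toy, spectra, bsi), and the model are invented for illustration; this is not the package's trained model.

```r
library(pls)

set.seed(1)
n <- 20

# Synthetic "spectra": 20 samples by 5 predictors, plus a noisy response
X <- matrix(rnorm(n * 5), nrow = n)
y <- as.numeric(X %*% c(1, 0.5, 0, 0, -1) + rnorm(n, sd = 0.1))

# Store the predictor matrix as a single column, as is common with pls
toy <- data.frame(bsi = y, spectra = I(X))

# Fit a PLS regression with up to 3 components
toy_mod <- plsr(bsi ~ spectra, ncomp = 3, data = toy)

# Predict the same samples using 1 component and then 3 components
p1 <- drop(predict(toy_mod, newdata = toy, ncomp = 1))
p3 <- drop(predict(toy_mod, newdata = toy, ncomp = 3))

# Compare the training residual sums of squares
print(c(rss_1 = sum((y - p1)^2), rss_3 = sum((y - p3)^2)))
```

More components usually fit the training data more closely, but they can also overfit, so the number of components is typically chosen with validation rather than by training error alone.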

The examples above cover only a few common uses, intended to help users get started with the package.



sds270-s22/ftirsr documentation built on June 24, 2022, 12:56 p.m.