knitr::opts_chunk$set( echo = TRUE, fig.width = 6, fig.height = 4, warning = FALSE, message = FALSE )
The ftmsRanalysis
package allows users to read in data from CSV files containing high resolution mass spectrometry data after initial processing and formula assignment.
The function read_CoreMS_data
takes one or more CSV files output by the software CoreMS and converts them into a single data.frame
that is compatible with ftmsRanalysis
functions. The function will automatically insert a column containing the file names and output a data.frame
that has been assigned the class CoreMSrbind
. Files are read in by providing a list of file paths, either manually or by pattern matching using a function like list.files
, as in the example below.
library(ftmsRanalysis) library(dplyr) ex_files <- list.files(pattern = "ex_data") dat <- read_CoreMS_data(ex_files)
At this point, it is good practice to check the dataset to verify whether there are isotopic columns present, and if so, which ones. This information will be used in the next step.
dat %>% head() %>% DT::datatable(options = list(dom = 't', scrollX = TRUE), rownames = FALSE)
The next step is to convert the CoreMSrbind
object into a CoreMSData
object using the function as.CoreMSData
. The arguments for this function are the data.frame
output by the function read_CoreMS_data
and, optionally, the names of the columns corresponding to mass-to-error ratio (m/z), peak height, mass error, confidence score, molecular formula, and other peak metadata. The default values for these parameters are set to the column names in typical CoreMS output. The column names corresponding to isotope columns C13, S34, O18, and N15 should be included if and only if present in the dataset. In this case, we will include names for the C13 and O18 columns.
cmsDat <- as.CoreMSData(dat)
The CoreMSData
object consists of a list of two data.frames
, monoiso_data
and iso_data
, containing the monoisotopic peaks and isotopic peaks, respectively. At this time, the isotopic peaks are not used for downstream analyses.
Calling the plot method on a CoreMSData object produces a barplot showing the number of peaks, both monoisotopic and isotopic, with unique masses per file/sample. In a case with many samples or very long sample names, the parameter rotate_x_labs
can be set to TRUE
. This will display the x-axis tick labels vertically for improved readability.
plot(cmsDat)
Confidence filtering is performed using the functions conf_filter
and apply.Filt
. We first construct a confFilter
object using the function conf_filter
. Before applying a filter, the confFilter
object can be plotted to visualize the numbers of monoisotopic peaks retained in the dataset under different confidence filtering thresholds.
cmsDat_filt_obj <- conf_filter(cmsDat) plot(cmsDat_filt_obj)
The mass_error_plot
function provides a quick summary of the dataset as well as another method for previewing the impact of a confidence filter on the dataset before application, using the optional parameter, min_conf
. These plots show only monoisotopic peaks.
plotly::subplot(mass_error_plot(cmsDat), mass_error_plot(cmsDat, min_conf = 0.25), mass_error_plot(cmsDat, min_conf = 0.50), mass_error_plot(cmsDat, min_conf = 0.75), nrows = 2, shareX = TRUE, titleX = TRUE, titleY = TRUE, margin = 0.05)
When plotting a dataset with 1000 or more peaks, the mass_error_plot
function will return a hex plot in which the color indicates the number of points in a region, while a dataset with fewer than 1000 peaks will return a scatter plot with information about individual peaks available by hovering over points.
mass_error_plot(cmsDat, min_conf = 0.95)
Once an appropriate confidence threshold has been selected, apply the confidence filter using the function applyFilt
, specifying the confFilt
object, the CoreMSData
object, and the desired minimum confidence threshold, min_conf
.
cmsDat_filtered <- applyFilt(filter_object = cmsDat_filt_obj, msObj = cmsDat, min_conf = 0.5)
The filtered dataset object contains attributes indicating that a filter has been applied, including the confidence threshold and a list of the mass identifiers of the removed peaks. These values can be accessed using the attr
function.
attr(cmsDat_filtered, "filters")$confFilt$report_text
If an attempt is made to apply a confidence filter to a dataset that has already been filtered by confidence score, an error will be thrown to indicate this.
applyFilt(filter_object = cmsDat_filt_obj, msObj = cmsDat_filtered, min_conf = 0.8)
ftmsRanalysis
functions require that each sample in a dataset contains only unique m/z values and unique molecular formulas. Since CoreMS
outputs several candidate formula assignments for each peak, we must apply the function unique_mf_assignment
to reduce the dataset to unique values. This step can be performed using one of three criteria: maximum confidence score (method = "confidence"
), maximum peak height (method = peak_intensity"
), or prevalence across samples (method = "prevalence"
).
If this function is called on a dataset that has not been adequately confidence filtered, it is likely that ties will be present in the dataset. In this case, an error message will provide a minimum confidence score threshold that must be applied to the dataset before assigning unique molecular formulas.
cmsDat_unique_mf <- unique_mf_assignment(cmsDat, method = "confidence")
cmsDat_unique_mf <- cmsDat %>% applyFilt(filter_object = cmsDat_filt_obj, msObj = ., min_conf = 0.16) %>% unique_mf_assignment(method = "confidence")
Once unique molecular formulas have been assigned to peaks within each sample, we can convert the CoreMSData
object into an ftmsData
object using the function coreMSDataToFtmsData
. This transforms the data table containing monoisotopic peaks into a list of three data tables:
e_data
: Expression dataRemaining columns correspond to samples, values are peak intensities
f_data
: Sample data
Rows correspond to samples, additional columns may be added containing sample metadata
e_meta
: Molecular identification data
ftmsObj <- coreMSDataToFtmsData(cmsDat_unique_mf)
ftmsObj$e_data %>% DT::datatable(rownames = FALSE) ftmsObj$f_data %>% DT::datatable() ftmsObj$e_meta %>% DT::datatable(rownames = FALSE)
For further steps and information, see this introduction to ftmsRanalysis
.x
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.