knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

MetaboSet utility vignette

This vignette provides an introduction to the MetaboSet class, along with a summary of many of the functions for accessing elements of MetaboSet objects and other utility functions.

Introduction to MetaboSets

MetaboSet objects are the primary data structure of this package. MetaboSet is built upon the ExpressionSet class from the Bioconductor package Biobase. ExpressionSet is used to record gene expression data, but the structure of the class is easily adaptable to LC-MS data. For more information, read the ExpressionSet documentation. MetaboSet objects consist of four main parts, each a matrix or a data frame: sample information (pData), feature data, abundances (exprs) and results.

In addition to these, a MetaboSet can record the names of special columns in pData that hold group labels, time points or subject identifiers. These columns are used as defaults in many of the functions of the package.

Let's look at the four main parts in more detail:

Pheno data / Sample information

The sample information data frame, or pData, has several special columns that are created when data is read from the Excel spreadsheet.

In addition to these three columns, pheno data often holds at least one of the group, time and subject ID columns that are defined separately. They are used as defaults by many functions for visualization and quality control.

Feature data

The feature data part usually has many columns that are created by the peak picking software, but for the sake of this package, the most important are:

Abundances

Naturally, the abundance part, exprs, is used by almost all the functions as it actually holds the data. Not much more to say here.

Results

Many functions use the results data frame to record information such as quality metrics and results from statistical tests. One column of results is especially important: the Flag column is used to flag features that are deemed low-quality for some reason (see ?flag_detection and ?flag_quality). Many functions have an all_features argument that controls whether all features or only the good-quality features should be used. By default, all_features is FALSE, which means that all flagged features (features with a non-NA value in the Flag column) are ignored.
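The flagging logic described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper keep_features and the flag label "Low_detection" are hypothetical and only mimic the documented behaviour of all_features on a toy results data frame.

```r
# Toy results data frame: one row per feature, NA in Flag means "not flagged".
# The flag label "Low_detection" is a made-up example value.
results <- data.frame(
  Feature_ID = c("F1", "F2", "F3"),
  Flag = c(NA, "Low_detection", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of the package): returns the IDs of the
# features that a function would use for a given all_features setting.
keep_features <- function(results, all_features = FALSE) {
  if (all_features) {
    return(results$Feature_ID)
  }
  # Default: drop every feature with a non-NA value in the Flag column
  results$Feature_ID[is.na(results$Flag)]
}

good_only <- keep_features(results)
everything <- keep_features(results, all_features = TRUE)
```

With the default setting only the unflagged features F1 and F3 remain, while all_features = TRUE keeps all three.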

How to make MetaboSets?

Reading data from Excel spreadsheets
knitr::include_graphics("Data_input.png")

To construct a MetaboSet object, you first need to read the data into R. This can be achieved with the read_from_excel function, which reads Excel spreadsheets in the format shown in the figure above. The first parameters are the file name, the sheet number, and the coordinates of the corner cell ("Ion Mode" in the above example) where the three parts of the dataset come together. The row must be given as a number, but the column can be given either as a number or as a letter (or a combination of two letters), as that is how it is displayed in Excel.
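To make the column coordinate concrete, here is a small sketch of how an Excel column label maps to a column number. The helper excel_col_to_number is hypothetical and written only for illustration; how read_from_excel performs this conversion internally is not shown in this vignette.

```r
# Hypothetical helper (not part of the package): convert an Excel column
# given as a number or as letters ("D", "AB") to a column number.
excel_col_to_number <- function(col) {
  if (is.numeric(col)) {
    return(col)
  }
  # Split the label into single upper-case letters
  chars <- strsplit(toupper(col), "")[[1]]
  digits <- match(chars, LETTERS)
  if (anyNA(digits)) {
    stop("Column must be a number or consist of letters A-Z")
  }
  # Letters behave like base-26 digits where A = 1, ..., Z = 26
  sum(digits * 26^(rev(seq_along(digits)) - 1))
}

col_d <- excel_col_to_number("D")
col_ab <- excel_col_to_number("AB")
```

Here "D" corresponds to column 4 and "AB" to column 28 (1 * 26 + 2).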

Some fields in sample information and feature data have special purposes.

There are a few obligatory fields:

Additionally, there are a few special cases:

The function returns a list holding the three parts of the data:

Construction of MetaboSet objects

MetaboSet objects are constructed with the construct_MetaboSet function. The function's parameters include all the main parts of a MetaboSet object except results, since a fresh object is initialized with a results data frame containing only a Feature_ID column and an NA flag for each feature. The special column names can also be set in this function. Note that the function returns a named list of MetaboSet objects, where the feature data and abundances are split by the Split column in feature data (most commonly this means that the four analytical modes are returned separately). The sample information and special column names are identical for each object.
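The splitting step can be sketched in base R as follows. This only illustrates the idea of splitting feature data and abundances by the Split column and is not the code used by construct_MetaboSet; the mode names and feature IDs are made up.

```r
# Toy feature data with a Split column (mode names are hypothetical)
feature_data <- data.frame(
  Feature_ID = c("F1", "F2", "F3"),
  Split = c("HILIC_pos", "HILIC_pos", "RP_neg"),
  stringsAsFactors = FALSE
)

# Toy abundance matrix: features in rows, samples in columns
abundances <- matrix(
  1:6,
  nrow = 3,
  dimnames = list(feature_data$Feature_ID, c("S1", "S2"))
)

# Split the feature data by mode, then pick the matching abundance rows
feature_parts <- split(feature_data, feature_data$Split)
abundance_parts <- lapply(feature_parts, function(fd) {
  abundances[fd$Feature_ID, , drop = FALSE]
})

mode_names <- names(abundance_parts)
```

Each element of the resulting named lists holds the features of one mode, while the sample columns stay the same across modes.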

Inherited from ExpressionSet

These functions from Biobase might be useful:

In addition, MetaboSets can be subset using the syntax for ExpressionSets, namely:
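As a rough illustration of the inherited behaviour, the sketch below builds a plain ExpressionSet (not a MetaboSet) from toy data and uses standard Biobase accessors and subsetting. It assumes that Biobase is installed; the feature names, sample names and the Group column are made up for the example.

```r
library(Biobase)

# Toy abundance matrix: 2 features (rows) x 3 samples (columns)
abundances <- matrix(
  c(10, 20, 30, 40, 50, 60),
  nrow = 2,
  dimnames = list(c("F1", "F2"), c("S1", "S2", "S3"))
)

# Toy sample information; row names must match the sample names
sample_info <- data.frame(
  Group = c("A", "B", "A"),
  row.names = colnames(abundances)
)

eset <- ExpressionSet(
  assayData = abundances,
  phenoData = AnnotatedDataFrame(sample_info)
)

# Accessors inherited from ExpressionSet
abundance_matrix <- exprs(eset)
sample_table <- pData(eset)

# Subsetting: rows are features, columns are samples
first_feature <- eset[1, ]
group_a <- eset[, eset$Group == "A"]
```

The same object[features, samples] indexing should carry over to MetaboSet objects, since the class is built on ExpressionSet.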

MetaboSet-specific

Utility functions for MetaboSet in particular:

In addition, there are many functions that modify MetaboSets:



antonvsdata/amp documentation built on Jan. 8, 2020, 3:15 a.m.