generateLOBdbase: Conduct _in silico_ simulation and generate lipid-oxylipin...

View source: R/generateLOBdbase.R

generateLOBdbase    R Documentation

Conduct in silico simulation and generate lipid-oxylipin database

Description

Applies an in silico simulation to generate data by ionization mode (polarity) for a wide range of lipids, oxidized lipids, and oxylipins. User-supplied structural criteria and empirically-determined adduct ion abundance rankings for the major lipid classes are used to create entries for a range of lipid moieties. The database(s) can then be used in doLOBscreen to assign compound identities to grouped peakdata.

Usage

generateLOBdbase(polarity = c("positive","negative"), gen.csv = FALSE,
                 component.defs = NULL, AIH.defs = NULL, acyl.ranges = NULL,
                 oxy.ranges = NULL)

Arguments

polarity

Ionization mode for which database is to be generated.

gen.csv

Should results also be written to a .csv file?

component.defs

File path to a .csv file containing elemental composition definitions for the various chemical components needed by generateLOBdbase. If nothing is specified, generateLOBdbase will use the default composition table (default.componentCompTable). The default table includes definitions for the masses of a wide range of adducts, photosynthetic pigments, and structural backbones of some major lipid classes.

A Microsoft Excel spreadsheet template included with the package at Resources/library/LOBSTAHS/doc/xlsx/LOBSTAHS_componentCompTable.xlsx can be used to generate a custom .csv file with elemental composition definitions in a format appropriate for generateLOBdbase. Alternatively, the spreadsheet may be downloaded from the package GitHub repository. Brief instructions for customization of the table are given in this help document; full instructions, including details on specification of the necessary base fragment, are contained in the package vignette.

For each lipid class or compound specified in the component definitions table, the field DB_gen_compound_type must contain one of five values: "DB_acyl_iteration", "DB_unique_species", "basic_component", "adduct_pos", or "adduct_neg". The last three compound types are reserved for definition of basic components such as acetonitrile or acetate and for definition of adduct ion types; new entries of these types should only be created in the compound table when a new adduct or basic component must be specified. The first two compound types determine which of two methods generateLOBdbase uses to create database entries.

For compounds of DB_gen_compound_type = "DB_unique_species" (the simpler of the two cases), database entries will be created only for adduct ions of the single compound specified. This type should be used for pigments and other lipids that do not have acyl groups, or when the user does not wish to consider any possible variation in acyl properties. In this case, the exact mass of the complete (neutral) molecule should be specified in the component definitions (i.e., component composition) table.

Alternatively, for compounds of DB_gen_compound_type = "DB_acyl_iteration", generateLOBdbase will create database entries for adduct ions of multiple molecular species within the lipid class based on the ranges of acyl properties and oxidation states given for the class in acyl.ranges and oxy.ranges (see below). In this case, the compound table should be used to define the exact mass of a "base fragment" for the lipid class. Using this "base fragment" as a starting point, generateLOBdbase creates multiple entries for molecules in the lipid class by iterative addition of various combinations of fatty acids. In the case of IP-DAG and IP-MAG, the base fragment includes the entire polar headgroup, the glycerol backbone, and both carboxylic oxygen atoms in the fatty acid(s). In the case of TAG, the base fragment is defined as the glycerol backbone plus the carboxylic oxygen atoms on each of the three fatty acids. The base fragments for any new lipid classes for which the user desires evaluation of a range of acyl properties should be similarly defined.

Note that regardless of the DB_gen_compound_type, an adduct hierarchy must be specified in the adduct ion hierarchy matrix (see below) for each compound or compound class specified in the Adduct_hierarchy_lookup_class field of the component definitions table.
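The "base fragment" iteration described above for "DB_acyl_iteration" compounds can be illustrated with a minimal sketch. It is not the package's implementation: the helper names (addAcylChains, exactMass) are hypothetical, and the TAG base fragment composition (C3H5O6) is an assumption inferred from the description above, not a value read from default.componentCompTable.

```r
# Monoisotopic masses (in u) of the elements used in this sketch
mono.mass <- c(C = 12, H = 1.00782503207, N = 14.0030740048,
               O = 15.99491461956)
electron.mass <- 0.00054857990946

# ASSUMED TAG base fragment: glycerol backbone (C3H5) plus the two
# carboxylic oxygen atoms of each of the three fatty acids (O6)
tag.base <- c(C = 3, H = 5, N = 0, O = 6)

# Hypothetical helper: add acyl chains to a base fragment.
# Each saturated chain of n carbons contributes C(n)H(2n-1); every
# double bond removes two hydrogens; extra.O adds oxygen atoms.
addAcylChains <- function(base, acyl.C, acyl.DB = 0, extra.O = 0,
                          n.chains = 3) {
  out <- base
  out["C"] <- out["C"] + acyl.C
  out["H"] <- out["H"] + 2 * acyl.C - n.chains - 2 * acyl.DB
  out["O"] <- out["O"] + extra.O
  out
}

# Hypothetical helper: monoisotopic mass of an elemental composition
exactMass <- function(composition) {
  sum(composition * mono.mass[names(composition)])
}

# TAG 48:0 (tripalmitin, C51H98O6) and its [M+NH4]+ adduct
tripalmitin <- addAcylChains(tag.base, acyl.C = 48)
tripalmitin.mass <- exactMass(tripalmitin)
tripalmitin.NH4.mz <- tripalmitin.mass + mono.mass[["N"]] +
  4 * mono.mass[["H"]] - electron.mass

# TAG 52:3 carrying one additional oxygen atom (C55H100O7)
oxidized <- addAcylChains(tag.base, acyl.C = 52, acyl.DB = 3, extra.O = 1)
oxidized.mass <- exactMass(oxidized)
```

Under these assumptions, tripalmitin.mass evaluates to approximately 806.736, which matches the monoisotopic mass of C51H98O6. For a two-chain class such as IP-DAG, the same helper would be called with n.chains = 2 and a base fragment that also contains the polar headgroup.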

AIH.defs

File path to a .csv file containing empirical adduct ion hierarchy (AIH) data for various pigments, lipids, and lipid classes. If nothing is specified, generateLOBdbase will use the default AIH data (default.adductHierarchies). Each compound or compound class for which there is an entry in the AIH definitions table should have at least one corresponding entry in the Adduct_hierarchy_lookup_class field of the component definitions table (default, default.componentCompTable).

A Microsoft Excel spreadsheet template included with the package at Resources/library/LOBSTAHS/doc/xlsx/LOBSTAHS_adductHierarchies.xlsx can be used to generate a .csv file with additional (or alternative) adduct hierarchy data in a format appropriate for generateLOBdbase. Alternatively, the spreadsheet may be downloaded from the package GitHub repository.

acyl.ranges

File path to a .csv file containing ranges of values for the total number of acyl (i.e., fatty acid) carbon atoms to be considered during the in silico simulation of any lipid classes for which DB_gen_compound_type is specified as "DB_acyl_iteration" in the component definitions table, above. These include intact polar diacylglycerols (IP-DAG), triacylglycerols (TAG), polyunsaturated aldehydes (PUAs), and free fatty acids (FFA). If nothing is specified, generateLOBdbase will use the default acyl carbon atom range data (default.acylRanges).

A Microsoft Excel spreadsheet template included with the package at Resources/library/LOBSTAHS/doc/xlsx/LOBSTAHS_acylRanges.xlsx can be used to generate a .csv file with custom acyl carbon range data in a format appropriate for generateLOBdbase. Alternatively, the spreadsheet may be downloaded from the package GitHub repository.

oxy.ranges

File path to a .csv file containing ranges of values for the number of additional oxygen atoms to be considered during the in silico simulation of any lipid classes for which DB_gen_compound_type is specified as "DB_acyl_iteration" in the component definitions table, above. If nothing is specified, generateLOBdbase will use the default oxidation state ranges (default.oxyRanges).

A Microsoft Excel spreadsheet template included with the package at Resources/library/LOBSTAHS/doc/xlsx/LOBSTAHS_oxyRanges.xlsx can be used to generate a .csv file with custom oxidation state ranges in a format appropriate for generateLOBdbase. Alternatively, the spreadsheet may be downloaded from the package GitHub repository. By default, generateLOBdbase considers 0-4 additional oxygen atoms on each chemically possible IP-DAG, TAG, PUA, and FFA.
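The four spreadsheet templates mentioned above can be located from within R with system.file. This is a sketch that assumes the templates are installed under doc/xlsx inside the package directory, as the paths above suggest. system.file returns an empty string for any file it cannot find, including when LOBSTAHS is not installed.

```r
# Template file names as given in the argument descriptions above
template.files <- c(
  component.defs = "LOBSTAHS_componentCompTable.xlsx",
  AIH.defs       = "LOBSTAHS_adductHierarchies.xlsx",
  acyl.ranges    = "LOBSTAHS_acylRanges.xlsx",
  oxy.ranges     = "LOBSTAHS_oxyRanges.xlsx"
)

# Look up each template in the installed package (assumed location:
# <package directory>/doc/xlsx); "" means the file was not found
template.paths <- vapply(
  template.files,
  function(f) system.file("doc", "xlsx", f, package = "LOBSTAHS"),
  character(1)
)

# Keep only the templates that were actually found
found.templates <- template.paths[nzchar(template.paths)]
print(found.templates)
```

Each template, once edited and exported to .csv, would be passed as a file path to the argument of the same name.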

Details

Using the default structural property inputs described here, generateLOBdbase can produce databases with entries for a wide range of intact polar diacylglycerols (IP-DAG), triacylglycerols (TAG), polyunsaturated aldehydes (PUAs), free fatty acids (FFA), and common photosynthetic pigments. The default databases (as of January 2017) contain data on 18,067 and 15,404 unique compounds that can be identified in positive and negative ion mode spectra, respectively.
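To give a sense of how the number of candidate compounds grows with the input ranges, the following sketch enumerates every combination of acyl carbon number, double bond number, and additional oxygen atoms for a single lipid class. The carbon and double bond ranges are hypothetical values chosen for illustration; only the 0-4 oxygen range comes from the defaults described above. The sketch does not reproduce whatever filtering generateLOBdbase applies to exclude chemically impossible combinations.

```r
# Hypothetical ranges for one lipid class (illustration only)
acyl.C.range  <- 32:36   # total acyl carbon atoms
acyl.DB.range <- 0:2     # total acyl C=C double bonds
extra.O.range <- 0:4     # additional oxygen atoms (documented default)

# One row per candidate parent compound
candidates <- expand.grid(acyl.C  = acyl.C.range,
                          acyl.DB = acyl.DB.range,
                          extra.O = extra.O.range)

n.candidates <- nrow(candidates)   # 5 * 3 * 5 combinations
head(candidates)
```

Each candidate would then be expanded further into one database entry per adduct ion defined for its class, which is why the number of entries exceeds the number of parent compounds.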

Note that the default databases have been pre-generated (see default.LOBdbase) and it is therefore unnecessary to call generateLOBdbase with the default parameters.

Value

A "LOBdbase-class" object with the structure:

frag_ID:

Object of class "integer", a unique identifier for this molecular species

mz:

Object of class "numeric", the calculated m/z of this species

exact_parent_neutral_mass:

Object of class "numeric", the calculated (monoisotopic) exact mass of the parent compound of this species

lipid_class:

Object of class "factor", the parent lipid class of this species

species:

Object of class "character", the lipid subclass

adduct:

Object of class "factor", the adduct ion represented by this entry

adduct_rank:

Object of class "integer", the abundance ranking of this adduct relative to the other adducts of the same parent compound

FA_total_no_C:

Object of class "integer", total number of acyl (fatty acid) carbon atoms in the parent compound; NA if lipid_class is not TAG, IP-DAG, PUA, or FFA

FA_total_no_DB:

Object of class "integer", total number of acyl (fatty acid) carbon-carbon double bonds in the parent compound; NA if lipid_class is not TAG, IP-DAG, PUA, or FFA

degree_oxidation:

Object of class "integer", number of additional oxygen atoms present

parent_elem_formula:

Object of class "character", elemental formula of the parent compound

parent_compound_name:

Object of class "character", name of the parent compound; see the reference for this entry for the naming convention applied to compounds other than pigments

polarity:

Object of class "factor", ionization mode of data in the database

num_entries:

Object of class "integer", number of total entries (adducts) in the database

num_compounds:

Object of class "integer", number of parent compounds represented in the database (should be < num_entries)
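The relationship between num_entries, num_compounds, and adduct_rank can be illustrated with a toy table. This sketch uses an ordinary data.frame with invented compound names and ranks; it is not a "LOBdbase-class" object and does not use the package's accessors.

```r
# Toy entries: one row per adduct ion of each parent compound
# (compound names and ranks are invented for illustration)
toy.entries <- data.frame(
  parent_compound_name = c("compound_A", "compound_A", "compound_A",
                           "compound_B", "compound_B"),
  adduct               = c("[M+NH4]+", "[M+Na]+", "[M+H]+",
                           "[M+H]+", "[M+Na]+"),
  adduct_rank          = c(1L, 2L, 3L, 1L, 2L),
  stringsAsFactors     = FALSE
)

# Total entries (adducts) versus distinct parent compounds
num.entries   <- nrow(toy.entries)
num.compounds <- length(unique(toy.entries$parent_compound_name))

# The expected most abundant adduct of each parent compound
top.adducts <- toy.entries[toy.entries$adduct_rank == 1L, ]
```

Here num.entries is 5 and num.compounds is 2, consistent with the expectation that the number of parent compounds is smaller than the number of entries.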

Author(s)

James Collins, james.r.collins@aya.yale.edu

References

The LOBSTAHS package is presented in:

Collins, J.R., B.R. Edwards, H.F. Fredricks, and B.A.S. Van Mooy. 2016. LOBSTAHS: An adduct-based lipidomics strategy for discovery and identification of oxidative stress biomarkers. Analytical Chemistry 88:7154-7162, doi:10.1021/acs.analchem.6b01260.

Data for lipid classes BLL, PDPT, vGSL, sGSL, hGSL, hapGSL, and hapCER are as described in:

Hunter, J. E., M. J. Frada, H. F. Fredricks, A. Vardi, and B. A. S. Van Mooy. 2015. Targeted and untargeted lipidomics of Emiliania huxleyi viral infection and life cycle phases highlights molecular biomarkers of infection, susceptibility, and ploidy. Frontiers in Marine Science 2:81, doi:10.3389/fmars.2015.00081.

Fulton, J. M., H. F. Fredricks, K. D. Bidle, A. Vardi, B. J. Kendrick, G. R. DiTullio, and B. A. S. Van Mooy. 2014. Novel molecular determinants of viral susceptibility and resistance in the lipidome of Emiliania huxleyi. Environmental Microbiology 16(4):1137-1149, doi:10.1111/1462-2920.12358.

See Also

LOBdbase-class, LOBdbase, loadLOBdbase, doLOBscreen, default.LOBdbase, default.componentCompTable, default.adductHierarchies, default.acylRanges, default.oxyRanges

Examples

## generate the default positive ionization mode database

library(LOBSTAHS)

LOBdbase.pos <- generateLOBdbase(polarity = "positive", gen.csv = FALSE,
                                 component.defs = NULL, AIH.defs = NULL,
                                 acyl.ranges = NULL, oxy.ranges = NULL)


vanmooylipidomics/LOBSTAHS documentation built on Oct. 30, 2022, 7:13 p.m.