loadInputData: Load the input data

View source: R/MultiLayerFunctions.R

loadInputData    R Documentation

Load the input data

Description

The function loadInputData loads the input data, i.e., the files needed to build the networks.

Usage

loadInputData(
  peakListF,
  intCol = 23,
  transF = NULL,
  spectraF = NULL,
  gsmnF,
  spectraSS = NULL,
  resPath,
  met2NetDir,
  configF,
  idenMetF,
  metF,
  cleanMetF = TRUE
)

Arguments

peakListF

path, path to the TSV file that contains the peak list (one peak per row) in a MetaboLights-like format. The first row is the header (i.e., the list of column names). The file must contain at least four columns: "database_identifier" (i.e., ChEBI ID), "metabolite_identification" (i.e., metabolite name, if identified), "mass_to_charge", and "retention_time". These column names are fixed. The TSV file can also contain intensity (or abundance) values. The names of the intensity columns are free (e.g., they can be named after the samples), but they must start with a letter and must not contain blank spaces. In addition, all the abundance columns must be placed at the end of the table, i.e., in the last columns. NOTE: Special characters, such as single quotes (') and hash signs (#), are forbidden anywhere in the peak list file.
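The format rules above can be checked before calling loadInputData. The following is a minimal sketch, not part of the package: checkPeakList is a hypothetical helper, the toy data frame stands in for a real TSV file, and it assumes that the intensity columns run from intCol to the last column.

```r
# Toy peak list standing in for a MetaboLights-like TSV (illustrative values)
peaks <- data.frame(
  database_identifier = c("CHEBI:15377", NA),
  metabolite_identification = c("water", NA),
  mass_to_charge = c(19.0178, 180.0634),
  retention_time = c(1.2, 3.4),
  sample_A = c(1000, 2500),
  sample_B = c(1100, 2400),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns a character vector of problems (empty if none)
checkPeakList <- function(df, intCol) {
  problems <- character(0)

  # The four fixed column names must be present
  required <- c("database_identifier", "metabolite_identification",
                "mass_to_charge", "retention_time")
  missingCols <- setdiff(required, colnames(df))
  if (length(missingCols) > 0) {
    problems <- c(problems, paste("missing column:", missingCols))
  }

  # Assumption: intensity columns run from intCol to the last column.
  # Their names must start with a letter and contain no blank spaces.
  intNames <- colnames(df)[intCol:ncol(df)]
  badNames <- intNames[!grepl("^[A-Za-z][^[:space:]]*$", intNames)]
  if (length(badNames) > 0) {
    problems <- c(problems, paste("invalid intensity column name:", badNames))
  }

  # Single quotes and hash signs are forbidden in the header and in the cells
  allText <- c(colnames(df), unlist(lapply(df, as.character)))
  if (any(grepl("['#]", allText))) {
    problems <- c(problems, "forbidden character (' or #) found")
  }

  problems
}

checkPeakList(peaks, intCol = 5)  # character(0): no problems found
```

With a real file, the data frame would typically come from something like read.delim(peakListF, check.names = FALSE), so that the original header is checked rather than the names R would sanitize.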

intCol

numeric, index of the first column containing the intensities, or indices of the subset of columns to take the intensity values from. The default value is 23, as in all datasets available in MetaboLights (https://www.ebi.ac.uk/metabolights).
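One possible reading of the two forms of intCol is sketched below. selectIntensities is a hypothetical helper, and the rule that a single number means "from this column to the end" is an assumption based on the description above, not a statement about how the package implements it.

```r
# Hypothetical helper: a single index is read as "first intensity column",
# a longer vector as an explicit subset of columns (assumed interpretation)
selectIntensities <- function(df, intCol) {
  cols <- if (length(intCol) == 1) seq(intCol, ncol(df)) else intCol
  df[, cols, drop = FALSE]
}

# Toy table: two annotation columns followed by three sample columns
toy <- data.frame(
  mass_to_charge = c(100.1, 200.2),
  retention_time = c(1, 2),
  s1 = c(10, 20),
  s2 = c(30, 40),
  s3 = c(50, 60)
)

colnames(selectIntensities(toy, 3))        # "s1" "s2" "s3"
colnames(selectIntensities(toy, c(3, 5)))  # "s1" "s3"
```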

transF

path, (optional) path to the CSV file containing the transformation list. It must contain at least the columns: "name", "formula", and "mass". This file is needed if the mass difference network is to be built
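As an illustration of the expected layout, the sketch below reads a small transformation list and checks the three required columns. The two rows are illustrative entries with their monoisotopic masses; the exact formula notation that the package expects is an assumption.

```r
# Illustrative transformation list; a real one would be read from transF
csvText <- "name,formula,mass
methylation,CH2,14.01565
hydroxylation,O,15.99491"

trans <- read.csv(text = csvText, stringsAsFactors = FALSE)

# The three required columns must be present, and mass must be numeric
required <- c("name", "formula", "mass")
stopifnot(all(required %in% colnames(trans)))
stopifnot(is.numeric(trans$mass))

trans
```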

spectraF

path, (optional) path to the MS/MS data in MGF format, if available

gsmnF

path, path to the compound graph in GML format, as generated by Met4J (https://forgemia.inra.fr/metexplore/met4j). The compound graphs of some organisms are available in the data folder

spectraSS

numeric, (optional) sample size used to subsample the spectra dataset. Subsampling speeds up processing and is useful when testing your data and/or this package
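The effect of spectraSS can be pictured with base R as follows. This is only a sketch of random subsampling on a toy list; how the package actually draws the sample (for example, whether it uses a seed) is not stated here and is assumed.

```r
set.seed(42)  # only to make this toy example reproducible

# Toy stand-in for a spectra dataset: 100 spectra with three peaks each
spectra <- lapply(1:100, function(i) {
  list(id = i, mz = sort(runif(3, min = 50, max = 500)))
})

spectraSS <- 10

# Draw spectraSS spectra without replacement (never more than are available)
idx <- sample(seq_along(spectra), size = min(spectraSS, length(spectra)))
spectraSubset <- spectra[idx]

length(spectraSubset)  # 10
```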

resPath

path, path to the folder where the results will be stored

met2NetDir

path, path to the directory where the Python package Metabolomics2Network is stored

configF

path, path to the file that contains the column names (i.e., aliases) of the idenMetF file and the information they contain, such as name, chebi, formula, etc. See an example in extdata/MTBLS1586/Metabolomics2NetworksData/JsonConf.txt. NOTE: This information is used to map the metabolites to the GSMN (using Metabolomics2Network)

idenMetF

path, path to the TSV file that contains the list of experimental features that were identified and that have an associated ChEBI ID. The header of this file must match the aliases from the configuration file (configF)

metF

path, path to the TSV file containing the list of metabolites in the GSMN of the organism of interest. It must contain at least two columns: ID and Chebi. It is important that every metabolite has a ChEBI ID, that there is a single ChEBI ID per metabolite, and that it corresponds to the main ChEBI ID. If you are not sure that your list of ChEBI IDs meets these requirements, set the cleanMetF parameter to TRUE. An example of a correct list of metabolites can be found in extdata/MTBLS1586/Metabolomics2NetworksData/WormJamMetWithMasses.tsv
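The first two requirements can be checked locally, as in the sketch below. The toy table and the set of separators treated as "several IDs in one cell" are assumptions. Whether an ID is the main ChEBI ID cannot be checked this way, since that needs a lookup against ChEBI itself.

```r
# Toy metabolite list with the two required columns (illustrative values)
mets <- data.frame(
  ID = c("M_a", "M_b", "M_c", "M_d"),
  Chebi = c("CHEBI:15377", "", "CHEBI:16236;CHEBI:42377", NA),
  stringsAsFactors = FALSE
)

# Metabolites without any ChEBI ID (empty cell or NA)
missingId <- is.na(mets$Chebi) | trimws(mets$Chebi) == ""

# Metabolites with several IDs in one cell; the separators ; , | are assumed
multipleIds <- !missingId & grepl("[;,|]", mets$Chebi)

# IDs of the rows that would need cleaning
mets$ID[missingId | multipleIds]  # "M_b" "M_c" "M_d"
```

If this returns any rows, leaving cleanMetF at TRUE is the safer choice.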

cleanMetF

boolean, set it to TRUE if you want to clean your metabolite file, as described under metF. The default value is TRUE

Value

Named list containing all the data (peakList, spectra, transformations, and gsmn)

Author(s)

Elva Novoa, elva-maria.novoa-del-toro@inrae.fr

Examples

# See the MultiLayerNetwork vignette
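# A call might look like the sketch below. Every path is a placeholder to be
# replaced with your own files, and the guard simply skips the call when the
# package or the peak list is not available. It assumes that loadInputData is
# exported by the MetClassNetR package.

```r
# Placeholder paths (hypothetical); optional arguments are left at NULL
args <- list(
  peakListF  = "path/to/peakList.tsv",
  intCol     = 23,
  transF     = NULL,
  spectraF   = NULL,
  gsmnF      = "path/to/compoundGraph.gml",
  spectraSS  = NULL,
  resPath    = "path/to/results",
  met2NetDir = "path/to/Metabolomics2Network",
  configF    = "path/to/JsonConf.txt",
  idenMetF   = "path/to/identifiedMetabolites.tsv",
  metF       = "path/to/gsmnMetabolites.tsv",
  cleanMetF  = TRUE
)

# Run only if the package is installed and the peak list exists
if (requireNamespace("MetClassNetR", quietly = TRUE) &&
    file.exists(args$peakListF)) {
  inputData <- do.call(MetClassNetR::loadInputData, args)
  names(inputData)  # names of the loaded data sets
}
```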


MetClassNet/MetClassNetR documentation built on June 30, 2023, 2:12 p.m.