run_cfm: Perform identification using CFM-ID

View source: R/run_cfm.R

run_cfm R Documentation

Perform identification using CFM-ID

Description

This function allows you to perform identification using CFM-ID. Please note that you need to have CFM-ID installed, e.g. using install_tools (Linux and macOS only). For Windows, see the Details section.

Usage

run_cfm(
  parameter_zip_files = NA,
  local_database = NA,
  ncores = 1,
  progress_bar = T,
  verbose = F,
  conda = "auto",
  env = "metaboraid_package_cfm",
  results_folder = NA,
  max_number_of_compound = 100,
  chech_file_interval = 2,
  total_check_time = 20,
  timeout = 600,
  cfm_bin = NA
)

Arguments

parameter_zip_files

A zip file containing MS2 parameters generated by map_adducts

local_database

Absolute path to the local database (CSV). See Details

ncores

Number of cores to use for parallel processing. Default 1

progress_bar

Whether to show a progress bar. Default TRUE

verbose

Show information about the different stages of the process. Default FALSE

conda

Conda binary to use. Default auto

env

Conda environment used to run the process. Default metaboraid_package_cfm

results_folder

A path to a folder where the results for EACH ion will be written

max_number_of_compound

Maximum number of compounds to report for each ion. Default 100

chech_file_interval

Not used

total_check_time

Not used

timeout

The maximum number of seconds to wait for a single ion to produce a result. Default 600

cfm_bin

An absolute path to the CFM-ID binary. Used on Windows; on other platforms use conda

database

Name of the database to use. Must be one of KEGG, PubChem, ChemSpider, or LocalCSV

Details

We install CFM-ID in the metaboraid_package_cfm conda environment.

The local CSV file must contain the metabolites you wish to perform identification on. This file must contain the following columns: "Identifier", "MonoisotopicMass", "MolecularFormula", "SMILES", "InChI", "InChIKey1", "InChIKey2", "InChIKey3", "Name", "InChIKey". An example of such a dataset for HMDB can be found here: https://github.com/metaboraid/test-datasets/blob/master/hmdb_2017-07-23.csv
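As a sketch, a local database with the required column names can be assembled in base R; the single entry below uses illustrative placeholder values (loosely based on glucose), not a real database record:

```r
# Assemble a minimal local database with the column names run_cfm expects.
# All values below are illustrative placeholders, not verified database entries.
local_db <- data.frame(
  Identifier       = "DB000001",
  MonoisotopicMass = 180.06339,
  MolecularFormula = "C6H12O6",
  SMILES           = "OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O",
  InChI            = "InChI=1S/C6H12O6/c7-1-2-3(8)4(9)5(10)6(11)12-2/h2-11H,1H2",
  InChIKey1        = "WQZGKKKJIJFFOK",
  InChIKey2        = "GASJEMHNSA",
  InChIKey3        = "N",
  Name             = "D-Glucose",
  InChIKey         = "WQZGKKKJIJFFOK-GASJEMHNSA-N",
  stringsAsFactors = FALSE
)

# Write it out; pass the absolute path of this file as local_database.
write.csv(local_db, "local_database.csv", row.names = FALSE)
```

Additional rows are appended as further data frame rows before writing; the column names and their order are what run_cfm relies on.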

If you are running on Windows, you need to install CFM-ID yourself (see https://cfmid.wishartlab.com/). After installation, find the absolute path to the cfm-id binary (cfm-id.exe) and set cfm_bin to that path.
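On Windows it can help to verify the binary path up front rather than letting run_cfm fail mid-run; a minimal sketch, where the path below is a hypothetical install location:

```r
# Hypothetical Windows install location -- replace with the path from your
# own CFM-ID installation.
cfm_bin <- "C:/Program Files/CFM-ID/cfm-id.exe"

# Check the binary exists before passing it to run_cfm.
has_cfm <- file.exists(cfm_bin)
if (!has_cfm) {
  message("cfm-id.exe not found at ", cfm_bin,
          "; install CFM-ID from https://cfmid.wishartlab.com/",
          " and update cfm_bin")
}
```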

Value

A dataframe containing the identified ions. The dataframe contains search-engine- and database-specific information, but also three important columns: parentMZ, parentRT, and fileName, which are used by downstream processes to trace the ions.

Examples


library(CAMERA)
library(metaboraid)
# Read MS1 and MS2 files
ms1_files<-system.file("ms1data",c("X1_Rep1.mzML","X2_Rep1.mzML"),package = "metaboraid")
ms2_files<-system.file("ms2data",c("sample1.mzML","sample2.mzML"),package = "metaboraid")

# mass trace detection
xs <- xcmsSet(ms1_files,method="centWave",ppm=30,peakwidth=c(5,10))

# mass trace matching
xsg <- group(xs)

# convert to CAMERA
xsa <- xsAnnotate(xsg)

# Group mass traces
anF <- groupFWHM(xsa, perfwhm = 0.6)

# Detect isotopes
anI <- findIsotopes(anF, mzabs = 0.01)

# Group using correlation
anIC <- groupCorr(anI, cor_eic_th = 0.75)

# Find adducts
anFA <- findAdducts(anIC, polarity="positive")

# map features and MS2s
mapped_features<-map_features(inputMS2s = ms2_files,input_camera = anFA,ppm = 10,rt = 10)

# Map adducts
mapped_adducts<-map_adducts(inputMS2List=mapped_features,input_camera=anFA,
                            precursorppm=10,
                            fragmentppm=20,fragmentabs=0.01,minPrecursorMass=NA,maxPrecursorMass=NA,
                            minPeaks=10,maxSpectra=10,mode="pos",adductRules="primary",
                            outputDir="general_parameters_4",searchMultipleChargeAdducts=T,
                            includeMapped=T,includeUnmapped=F,verbose=T)

# Run the search

run_cfm("parameter_files.zip", database = "KEGG", ncores = 2,
        progress_bar = F, verbose = T, results_folder = "pp",
        chech_file_interval = 2, timeout = 600, conda = "auto",
        max_number_of_compound = 10)


metaboraid/metaboraid documentation built on Jan. 29, 2023, 1:41 a.m.