View source: R/library_generator.r
library_generator | R Documentation |
The function offers three data processing algorithms to extract MS1/MS2 scans from DDA or targeted-mode LC-MS/MS data, merge them into a spectral library, and build a spectral similarity-based molecular network.
library_generator(
input_library = NULL,
lcms_files = NULL,
metadata_file = NULL,
polarity = c("Positive", "Negative")[1],
mslevel = c(1, 2),
add.adduct = TRUE,
adductType = NULL,
processing.algorithm = c("Default", "compMS2Miner", "RMassBank")[1],
params.search = list(mz_search = 0.01, ppm_search = 10, rt_search = 15, rt_gap = 30),
params.ms.preprocessing = list(normalized = TRUE, baseline = 1000, relative = 0.1,
max_peaks = 200, recalibration = 0),
params.consensus = list(consensus = FALSE, consensus_method = c("consensus",
"consensus2", "common_peaks", "most_recent")[1], consensus_window = 0.02),
params.network = list(network = FALSE, similarity_method = "Cosine", min_frag_match =
6, min_score = 0.6, topK = 10, max_comp_size = 100, reaction_type = "Metabolic",
use_reaction = FALSE),
params.user = list(sample_type = "", user_name = "", comments = "")
)
input_library |
A character string or a list object. If a character string, the name of the existing library to which new scans are added; the file extension must be mgf, msp or RData. Set to NULL if the new library does not depend on a previous one. |
lcms_files |
A character vector of LC-MS/MS file names from which scans are extracted. All files must be in centroid mode with the mzML, mzXML or cdf extension. |
metadata_file |
A single character string, a data frame, or NULL. If a character string, it is the name of the metadata file, which must be a tab-, comma- or semicolon-separated file in txt, dat or csv format. For all algorithms, the metadata must contain the column ID (a unique structure identifier). The column PEPMASS (targeted precursor mass) is required for Default and compMS2Miner. The column RT (targeted retention time in min) is required for compMS2Miner and optional for Default and RMassBank. Please include the column SMILES (structure identifier) for the RMassBank algorithm. The column FILENAME (chromatogram file with mzML, mzXML or cdf extension) is required for RMassBank, where it tells the algorithm in which file each compound can be found; it is optional for Default and compMS2Miner. The column ADDUCT is optional for all algorithms; if it is not provided, all input is treated as M+H or M-H depending on polarity. Please specify the adduct type if the metadata contains both positive and negative ions. If metadata_file is NULL and the LC-MS/MS files were acquired in DDA mode, an automated feature screening is performed for fragmented masses, and the masses and retention times of these features are used for spectral library generation and molecular networking. |
polarity |
A single character. Either "Positive" or "Negative". Ion mode of LC-MS/MS files. |
mslevel |
A numeric vector. 1 or 2 or c(1,2). 2 if MS2 scans are extracted, 1 if isotopic pattern of the precursor mass in the MS1 scan is extracted. c(1,2) if both MS1 and MS2 scans are extracted. Note: High-quality isotopic patterns in MS1 scans are useful for determining precursor formula! |
add.adduct |
Logical. If TRUE, additional adduct types are calculated from the precursor masses of the "M+H" and "M-H" adducts in the input metadata: "M+2H", "M+Na", "M+K", "M+NH4" and "M+" are searched in positive ion mode; "M+COO-", "M+Cl" and "M+CH3COO-" in negative ion mode. If FALSE, no additional adduct types are searched. |
processing.algorithm |
A single character. "Default", "compMS2Miner" or "RMassBank". |
params.search |
A list of parameters for searching and collecting ions from chromatogram files. These parameters define the tolerance window used when the input metadata is searched. The list must contain the following elements: mz_search, ppm_search, rt_search and rt_gap (defaults are shown in the usage above).
|
params.ms.preprocessing |
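A list of parameters for pre-processing the scans found in chromatogram files. It must contain: normalized, baseline, relative, max_peaks and recalibration (defaults are shown in the usage above).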
|
params.consensus |
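A list of parameters for generating consensus scans that combine spectra of the same compound ID. Its elements are consensus, consensus_method and consensus_window (defaults are shown in the usage above).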
|
params.network |
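A list of parameters for turning the consensus spectral library into a molecular network. Its elements are network, similarity_method, min_frag_match, min_score, topK, max_comp_size, reaction_type and use_reaction (defaults are shown in the usage above).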
|
params.user |
A list of additional parameters: sample_type, user_name and comments.
|
adductType |
User-specified adduct type; the default is NULL. Set 'add.adduct' to TRUE and specify 'adductType' to filter the records down to 'adductType' before the additional adduct types are appended. |
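Because metadata_file also accepts a data frame, the column requirements described for that argument can be checked before library_generator is called. The following is a minimal sketch under stated assumptions: the helper check_metadata is hypothetical (it is not part of MergeION), the compound IDs, masses and retention times are invented for illustration, and SMILES is treated as required for RMassBank although the description only asks to include it.

```r
# Illustrative metadata (invented values), one row per targeted compound
metadata <- data.frame(
  ID      = c("CPD001", "CPD002"),
  PEPMASS = c(195.0877, 181.0720),  # targeted precursor m/z
  RT      = c(3.2, 2.8),            # targeted retention time in min
  ADDUCT  = c("M+H", "M+H"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of MergeION): returns the columns that the
# chosen algorithm expects, according to the argument description, but that
# are absent from the metadata.
check_metadata <- function(metadata,
                           algorithm = c("Default", "compMS2Miner", "RMassBank")) {
  algorithm <- match.arg(algorithm)
  required <- switch(algorithm,
    Default      = c("ID", "PEPMASS"),
    compMS2Miner = c("ID", "PEPMASS", "RT"),
    RMassBank    = c("ID", "SMILES", "FILENAME")
  )
  setdiff(required, colnames(metadata))
}

missing_compms2 <- check_metadata(metadata, "compMS2Miner")
missing_rmb     <- check_metadata(metadata, "RMassBank")
```

An empty result means the data frame carries every column expected for that algorithm; a non-empty result names the columns to add before passing the data frame as metadata_file.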
complete: Entire spectral library (historical + newly added records), a list object with two elements: "library$sp", a list of all extracted spectra, each a data matrix with two columns (m/z and intensity); and "library$metadata", a data frame containing metadata of the extracted scans. PEPMASS and RT are updated based on the scans detected in the chromatogram files. The following metadata columns are updated or added: FILENAME (the raw data file from which the scan was isolated), MSLEVEL (1 or 2), TIC, PEPMASS_DEV (ppm error of the detected precursor mass) and SCANNUMBER (scan number in the raw chromatogram). The last three columns are PARAM_ALGORITHM (processing algorithm), PARAM_CREATION_TIME (date and time when the MS record was added) and SCANS (unique identifier of each record).
consensus: Consensus spectral library by merging MS/MS spectra with the same ID.
network: Consensus spectral library transformed into a molecular network based on MS/MS spectral similarity.
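The layout of the complete element can be explored as sketched below. The example works on a hand-built mock object and calls no MergeION function; the spectra and metadata values are invented, and it assumes that the i-th spectrum in sp corresponds to the i-th row of metadata, which the description suggests but does not state explicitly.

```r
# Mock object mimicking the documented layout of the "complete" element
library_complete <- list(
  sp = list(
    # two-column matrices: m/z in column 1, intensity in column 2
    matrix(c(110.07, 138.07, 195.09, 200, 1500, 10000), ncol = 2),
    matrix(c(124.05, 181.07, 800, 10000), ncol = 2)
  ),
  metadata = data.frame(
    ID      = c("CPD001", "CPD002"),
    MSLEVEL = c(2, 2),
    PEPMASS = c(195.0877, 181.0720),
    stringsAsFactors = FALSE
  )
)

# Select the MS2 records of one compound (assumes sp and metadata are parallel)
md  <- library_complete$metadata
idx <- which(md$ID == "CPD001" & md$MSLEVEL == 2)

spectrum <- library_complete$sp[[idx[1]]]
colnames(spectrum) <- c("mz", "intensity")

# m/z of the most intense peak in the selected spectrum
base_peak_mz <- unname(spectrum[which.max(spectrum[, "intensity"]), "mz"])
```

Subsetting metadata and sp with the same index keeps the two elements aligned, which matters when a filtered library is saved or passed back as input_library.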
Youzhong Liu, YLiu186@ITS.JNJ.com
## Not run:
library(RMassBankData)
input_library = NULL # There is no historical spectral library; a brand new one is created here
lcms_files <- list.files(system.file("spectra", package="RMassBankData"), ".mzML", full.names = TRUE)
metadata_file <- list.files(system.file(package = "MergeION"),".csv", full.names = TRUE)
polarity = "Positive"
mslevel= 2 # Only MS2 scans are extracted!
add.adduct = FALSE # No additional adducts are searched besides M+H
params.search = list(mz_search = 0.005, ppm_search = 10, rt_search = 15, rt_gap = 30)
params.ms.preprocessing = list(normalized = TRUE, baseline = 1000, relative = 0.01, max_peaks = 200, recalibration = 0)
# Building a spectral library with default (SmartION) algorithm by simply gathering scans that matched with metadata:
params.user = list(sample_type = "RMassBank data", user_name = "daniel", comments = "default algorithm, without building a consensus library")
processing.algorithm = "Default"
lib = library_generator(input_library, lcms_files, metadata_file,
polarity = "Positive", mslevel = mslevel, add.adduct = add.adduct,
processing.algorithm = processing.algorithm, params.search = params.search,
params.ms.preprocessing = params.ms.preprocessing, params.user = params.user)
lib1 = lib$complete
save(lib1, file = "test_default_complete.RData") # Save the library as RData
# Building a spectral library with compMS2Miner algorithm and generating a consensus spectral library
processing.algorithm = "compMS2Miner"
params.consensus = list(consensus = TRUE, consensus_method = "consensus", consensus_window = 0.02)
params.user = list(sample_type = "RMassBank data", user_name = "daniel", comments = "compMS2Miner algorithm, building a consensus library")
lib = library_generator(input_library, lcms_files, metadata_file,
polarity = "Positive", mslevel = mslevel, add.adduct = add.adduct,
processing.algorithm = processing.algorithm, params.search = params.search,
params.ms.preprocessing = params.ms.preprocessing, params.consensus = params.consensus,
params.user = params.user)
lib2 = lib$consensus
save(lib2, file = "test_compMS2Miner_consensus.RData") # Save the library as RData
# Building a spectral library with RMassBank algorithm (recalibration based on elemental formula annotation), creating consensus spectral library and building a molecular network based on the consensus library
processing.algorithm = "RMassBank"
params.ms.preprocessing = list(normalized = TRUE, baseline = 1000, relative = 0.01, max_peaks = 200, recalibration = 2)
params.network = list(network = TRUE, similarity_method = "Cosine", min_frag_match = 6, min_score = 0.6, max_comp_size = 100, topK = 10, reaction_type = "Chemical", use_reaction = FALSE)
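lib3 = library_generator(input_library, lcms_files, metadata_file,
polarity = "Positive", mslevel = mslevel, add.adduct = add.adduct,
processing.algorithm = processing.algorithm, params.search = params.search,
params.ms.preprocessing = params.ms.preprocessing, params.consensus = params.consensus,
params.network = params.network, params.user = params.user)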
save(lib3, file = "test_RMassBank_consensus_network.RData") # Save the library as RData
## End(Not run)
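The add.adduct argument derives additional adduct masses from the precursor masses of "M+H" or "M-H" entries. The arithmetic behind such a derivation can be sketched for positive ion mode as follows. This is an illustration only, not MergeION's internal code: the helper adducts_from_MH is hypothetical, the mass shifts are standard monoisotopic values, and the input m/z is an invented example.

```r
# Standard monoisotopic masses in Da
PROTON   <- 1.007276
ELECTRON <- 0.000549

# Hypothetical helper (not part of MergeION): expected m/z of the
# positive-mode adducts listed under add.adduct, given the m/z of [M+H]+.
adducts_from_MH <- function(mz_MH) {
  M <- mz_MH - PROTON                 # neutral monoisotopic mass
  c("M+H"   = mz_MH,
    "M+2H"  = (M + 2 * PROTON) / 2,   # doubly charged, so divide by z = 2
    "M+Na"  = M + 22.989218,          # Na+ (electron mass already removed)
    "M+K"   = M + 38.963158,          # K+
    "M+NH4" = M + 18.033823,          # NH4+
    "M+"    = M - ELECTRON)           # radical cation: loss of one electron
}

adduct_mz <- adducts_from_MH(195.0877)
round(adduct_mz, 4)
```

Each derived m/z could then serve as an additional PEPMASS to search for the same compound ID, which is the behaviour the add.adduct description outlines.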