readMSData: Imports mass-spectrometry raw data files as 'MSnExp'...

View source: R/readMSData.R

Description

Reads a set of XML-based mass-spectrometry data files and generates an MSnExp object. This function uses the functionality provided by the mzR package to access data and metadata in mzData, mzXML and mzML files.

Usage

readMSData(
  files,
  pdata = NULL,
  msLevel. = NULL,
  verbose = isMSnbaseVerbose(),
  centroided. = NA,
  smoothed. = NA,
  cache. = 1L,
  mode = c("inMemory", "onDisk")
)

Arguments

files

A character vector of file names to be read and parsed.

pdata

An object of class AnnotatedDataFrame or NULL (default).

msLevel.

MS level of the spectra to be read. In inMemory mode, use 1 for MS1 spectra or any larger numeric for MSn spectra. Default is 2 for inMemory mode. onDisk mode supports multiple levels and will, by default, read all the data.

verbose

Verbosity flag. Default is to use isMSnbaseVerbose().

centroided.

A logical indicating whether spectra are centroided or not. Default is NA, in which case the information is extracted from the raw file (for mzML or mzXML files). In onDisk mode, it can also be set per MS level with a vector of logicals, where the first element applies to MS1, the second to MS2, and so on. See OnDiskMSnExp for an example.

smoothed.

A logical indicating whether spectra are already smoothed or not. Default is NA.

cache.

Numeric indicating the caching level. Default is 0 for MS1 and 1 for MS2 (or higher). Only relevant for inMemory mode.

mode

One of "inMemory" (default) or "onDisk". The former loads the raw data into memory, while the latter only generates the object and accesses the raw data on disk when needed. See the benchmarking vignette for memory and speed implications.
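As an illustration of how these arguments combine, the following is a minimal sketch rather than a definitive recipe. It assumes the example mzXML file shipped in the MSnbase extdata directory (the same file used in the Examples section), which contains only MS2 spectra, and it sets centroided. by hand purely to demonstrate the argument, not because the file is known to be centroided.

```r
library(MSnbase)

## Example file shipped with MSnbase (contains MS2 spectra only)
f <- dir(system.file("extdata", package = "MSnbase"),
         full.names = TRUE, pattern = "mzXML$")

## inMemory mode: read a single MS level and flag all spectra as centroided
ms2 <- readMSData(f, msLevel. = 2, centroided. = TRUE,
                  mode = "inMemory", verbose = FALSE)
unique(msLevel(ms2))
all(centroided(ms2))

## onDisk mode: one logical per MS level (first for MS1, second for MS2)
dsk <- readMSData(f, centroided. = c(FALSE, TRUE),
                  mode = "onDisk", verbose = FALSE)
all(centroided(dsk))
```

For real data, leaving centroided. at its default of NA lets the value be taken from the raw file where the format records it.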

Details

When using the inMemory mode, all MS data are read from file and kept in memory as Spectrum objects within the MSnExp's assayData slot.

To reduce the memory footprint, especially for large MS1 data sets, it is also possible to read only selected information from the MS files and fetch the actual spectrum data (i.e. the m/z and intensity values) on demand from the original data files. This can be achieved by setting mode = "onDisk". The function then returns an OnDiskMSnExp object instead of an MSnExp object.
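The difference between the two modes can be sketched as follows. This is only an illustration, again assuming the example mzXML file from the MSnbase extdata directory; it shows that both modes expose the same accessors, while the onDisk object keeps per-spectrum header information in its feature data and reads peak data when a spectrum is requested.

```r
library(MSnbase)

f <- dir(system.file("extdata", package = "MSnbase"),
         full.names = TRUE, pattern = "mzXML$")

mem <- readMSData(f, mode = "inMemory", verbose = FALSE)
dsk <- readMSData(f, mode = "onDisk", verbose = FALSE)

## Different classes, same number of spectra
is(mem, "MSnExp")
is(dsk, "OnDiskMSnExp")
length(mem) == length(dsk)

## The onDisk object stores spectrum header fields as feature variables
head(fvarLabels(dsk))

## Peak data are read from the original file when a spectrum is extracted
sp <- dsk[[1]]
head(mz(sp))
head(intensity(sp))
```

Because the onDisk object refers back to the original files, those files need to remain available at the same location for as long as the object is in use.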

Value

An MSnExp object for inMemory mode and an OnDiskMSnExp object for onDisk mode.

Note

readMSData uses normalizePath to replace relative with absolute file paths.
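The effect of normalizePath can be seen with base R alone. The sketch below uses a hypothetical, empty placeholder file named example.mzML created in a temporary directory; it is not a valid MS file and is not passed to readMSData.

```r
## Work in a temporary directory so that a relative path can be demonstrated
old_wd <- setwd(tempdir())
file.create("example.mzML")          # hypothetical, empty placeholder file

rel_path <- "example.mzML"
abs_path <- normalizePath(rel_path)  # resolved against the working directory

rel_path
abs_path

## Clean up and restore the previous working directory
unlink("example.mzML")
setwd(old_wd)
```

A practical consequence is that the file paths recorded in the resulting object do not depend on the working directory at the time of the call.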

Author(s)

Laurent Gatto

See Also

readMgfData() to read mgf peak lists.

Examples

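## Locate the example mzXML file shipped with MSnbase
file <- dir(system.file("extdata", package = "MSnbase"),
            full.names = TRUE,
            pattern = "mzXML$")

## Read all spectra into memory (returns an MSnExp)
mem <- readMSData(file, mode = "inMemory")
mem

## Leave spectra on disk and read them on demand (returns an OnDiskMSnExp)
dsk <- readMSData(file, mode = "onDisk")
dsk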

Example output

Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which.max, which.min

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: mzR
Loading required package: Rcpp
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following object is masked from 'package:base':

    expand.grid

Loading required package: ProtGenerics

Attaching package: 'ProtGenerics'

The following object is masked from 'package:stats':

    smooth


This is MSnbase version 2.16.0 
  Visit https://lgatto.github.io/MSnbase/ to get started.


Attaching package: 'MSnbase'

The following object is masked from 'package:base':

    trimws

MSn experiment data ("MSnExp")
Object size in memory: 0.18 Mb
- - - Spectra data - - -
 MS level(s): 2 
 Number of spectra: 5 
 MSn retention times: 25:1 - 25:2 minutes
- - - Processing information - - -
Data loaded: Thu Dec  3 22:24:32 2020 
 MSnbase version: 2.16.0 
- - - Meta data  - - -
phenoData
  rowNames: dummyiTRAQ.mzXML
  varLabels: sampleNames
  varMetadata: labelDescription
Loaded from:
  dummyiTRAQ.mzXML 
protocolData: none
featureData
  featureNames: F1.S1 F1.S2 ... F1.S5 (5 total)
  fvarLabels: spectrum
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
MSn experiment data ("OnDiskMSnExp")
Object size in memory: 0.03 Mb
- - - Spectra data - - -
 MS level(s): 2 
 Number of spectra: 5 
 MSn retention times: 25:1 - 25:2 minutes
- - - Processing information - - -
Data loaded [Thu Dec  3 22:24:32 2020] 
 MSnbase version: 2.16.0 
- - - Meta data  - - -
phenoData
  rowNames: dummyiTRAQ.mzXML
  varLabels: sampleNames
  varMetadata: labelDescription
Loaded from:
  dummyiTRAQ.mzXML 
protocolData: none
featureData
  featureNames: F1.S1 F1.S2 ... F1.S5 (5 total)
  fvarLabels: fileIdx spIdx ... spectrum (35 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'

MSnbase documentation built on Jan. 23, 2021, 2 a.m.