ncdf4_data_extract: Extract data ranges from a NetCDF file format 4

ncdf4_data_extractR Documentation

Extract data ranges from a NetCDF file format 4

Description

A flexible and convenient function to extract raw data from a NetCDF file format 4 using time ranges and m/z ranges or values. This is a better (and faster) alternative to the old peakCDFextraction() function, which reads the whole CDF into memory, specially when only sections of the CDF are needed.

Usage

ncdf4_data_extract(cdfFile, massValues, timeRange, useRT = FALSE)

Arguments

cdfFile

A path to a NetCDF file format 4.

massValues

A numeric vector representing m/z values.

timeRange

A numeric vector of length 2 representing the lower and upper time limits.

useRT

Logical. If TRUE, the time range is in seconds, otherwise if FALSE (default), the time range is in retention time index units (TRUE).

Details

The function takes a NetCDF format 4 generated by "TargetSearch" and extracts raw intensity values from given m/z ion traces within a given time range. The time range can be in seconds or arbitrary retention time index units. For the latter case, the function expects a time corrected CDF file.

If the given time range is out of range, a NULL value will be returned. In contrast, if the m/z values are out of range, then zeros will be returned for out of range masses (provided that the time range is not out of range). If timeRange is missing, then the whole time range. Similarly, if massRange is missing, then all the masses are extracted. If both are missing, the function behaves as peakCDFextraction(), but the output (a named list) uses slightly different names.

The NetCDF must be have been previously converted to the custom "TargetSearch" format, otherwise an error will be raised. See ncdf4_convert() for the conversion.

Value

A named list with the following components.

  • Time Numeric vector: the retention time in seconds

  • Index Numeric vector: the retention time index (or zero if the file was was not retention time corrected

  • Intensity Matrix: Rows are the retention times (or scans) and columns are masses.

  • massRange Numeric vector of length 2: the mass range of the CDF

Note

An error will be produced for invalid files. Also, this function is intended to be used internally, but it is exposed for convenience, for example, to create custom plots.

Author(s)

Alvaro Cuadros-Inostroza

See Also

ncdf4_convert(), peakCDFextraction()

Examples

# If you already have a NCDF4 file set the variable accordingly
# and skip the following section
# nc4file <- "/path/to/netcdf4.nc4"

### create NCDF4 from NCDF3 from the package TargetSearchData
require(TargetSearchData)
# pick first file
cdffile <- tsd_cdffiles()[1]
nc4file <- tempfile(fileext=".nc4")
ncdf4_convert(cdffile, nc4file)

# update RI data using pre-computed values
ncdf4_update_ri(nc4file, c(252, 311, 269), c(262320, 323120, 381020))
### end

# extract all data (behaves like peakCDFextraction)
data <- ncdf4_data_extract(nc4file)

# extract only certain m/z values
data <- ncdf4_data_extract(nc4file, massValues=c(116, 192))

# to use mass ranges, use the colon (:) operator for example
data <- ncdf4_data_extract(nc4file, massValues=c(120:130, 200, 203:209))

# restrict extraction to a retention index interval
data <- ncdf4_data_extract(nc4file, massValues=c(116, 192),
                           timeRange=c(200000, 220000))

# same, but using retention time in seconds.
data <- ncdf4_data_extract(nc4file, massValues=c(116, 192),
                           timeRange=c(200, 300), useRT=TRUE)

acinostroza/TargetSearch documentation built on May 5, 2024, 10:14 a.m.