mz_group: Group m/z values into bins of a specified ppm width

View source: R/extraHelperFunctions.R

mz_group | R Documentation

Group m/z values into bins of a specified ppm width

Description

This function bins m/z values based on their proximity to each other in m/z space. The algorithm takes the first value in the m/z vector as the center of a window whose width in ppm is provided by the user, assigns all m/z values within that window to the same group, removes those values from consideration, and repeats the process until no points are left to group. This is often used to construct chromatograms from raw MS data that can then be visualized or peak-picked. The function can also drop groups of m/z values that contain too few points, or produce only a certain number of groups. Because the algorithm uses the first value in the m/z vector as the window center, it is often a good idea to first sort the values by decreasing intensity so that windows are centered on the strongest signals.
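The grouping loop described above can be sketched in a few lines of base R. This is a minimal illustration, not the RaMS source: the name mz_group_sketch is hypothetical, the window is assumed to be the center plus or minus center * ppm / 1e6 with inclusive bounds, and the min_group_size and max_groups options are left out.

```r
# Hypothetical re-implementation for illustration only; not the RaMS source code.
mz_group_sketch <- function(mz_vals, ppm) {
  groups <- rep(NA_integer_, length(mz_vals))
  group_num <- 1L
  # Repeat until every value has been assigned to a group
  while (anyNA(groups)) {
    # The first still-unassigned value becomes the window center
    center <- mz_vals[which(is.na(groups))[1]]
    # Assumed window: center +/- center * ppm / 1e6 (inclusive bounds)
    half_width <- center * ppm / 1e6
    in_window <- is.na(groups) & abs(mz_vals - center) <= half_width
    groups[in_window] <- group_num
    group_num <- group_num + 1L
  }
  groups
}

example_mz_vals <- c(118.0, 118.1, 138.0, 152.0, 118.2, 138.1, 118.1)
sketch_groups <- mz_group_sketch(example_mz_vals, ppm = 1000)
print(sketch_groups)
# 1 1 2 3 4 2 1
```

In this sketch, 118.2 ends up in its own group because it lies outside the window centered on 118.0, which shows how the result depends on which value comes first in the vector.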

Usage

mz_group(mz_vals, ppm, min_group_size = 0, max_groups = NULL)

Arguments

mz_vals

A numeric vector of m/z values

ppm

A length-1 numeric vector specifying the desired window size in ppm

min_group_size

A length-1 numeric vector specifying the minimum number of points that must fall within an m/z window to be assigned a group number

max_groups

A length-1 numeric vector specifying the maximum number of total groups to assign.
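To get a feel for what a given ppm value means in absolute m/z units, the arithmetic can be worked through directly. The snippet below is a sketch that assumes ppm is applied as a plus-or-minus tolerance around the window center; the variable names are illustrative and not part of the package.

```r
# Assumption: `ppm` acts as a +/- tolerance around the window center
center <- 119.0865
ppm <- 5

# One ppm is one millionth of the center m/z
half_width <- center * ppm / 1e6
window <- c(lower = center - half_width, upper = center + half_width)

print(half_width)
print(window)
```

Because the tolerance scales with the center, the same ppm value gives a wider absolute window at higher m/z.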

Value

A numeric vector of the same length as mz_vals specifying the group into which each m/z value was binned. Values not assigned to a group are returned as NA.

Examples


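library(RaMS)

example_mz_vals <- c(118.0, 118.1, 138.0, 152.0, 118.2, 138.1, 118.1)

# Narrow, moderate, and very wide (200000 ppm = 20%) windows
mz_group(example_mz_vals, ppm = 1)
mz_group(example_mz_vals, ppm = 1000)
mz_group(example_mz_vals, ppm = 200000)

# Leave values in sparsely populated windows ungrouped (NA)
mz_group(example_mz_vals, ppm = 1000, min_group_size = 2)
# Cap the total number of groups that are assigned
mz_group(example_mz_vals, ppm = 1000, max_groups = 2)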

## Not run: 
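library(data.table)  # provides %between% and :=

# Read MS1 data from three of the example files shipped with RaMS
sample_dir <- system.file("extdata", package = "RaMS")
sample_files <- list.files(sample_dir, full.names = TRUE)
msdata <- grabMSdata(sample_files[c(3, 5, 6)], grab_what = "MS1")

# Keep points within 100 ppm of m/z 119.0865, sort by decreasing
# intensity, then bin into 5 ppm groups
grouped_MS1 <- msdata$MS1[mz %between% pmppm(119.0865, 100)][
  order(int, decreasing = TRUE)][
  , mz_group := mz_group(mz, ppm = 5)][]
print(grouped_MS1)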

library(ggplot2)
library(dplyr)

# Color each point by its assigned m/z group
msdata$MS1[mz %between% pmppm(119.0865, 100)] %>%
  arrange(desc(int)) %>%
  mutate(mz_group = mz_group(mz, ppm = 10)) %>%
  ggplot() +
  geom_point(aes(x = rt, y = mz, color = factor(mz_group)))

# Plot one chromatogram per m/z group
msdata$MS1[mz %between% pmppm(119.0865, 100)] %>%
  arrange(desc(int)) %>%
  mutate(mz_group = mz_group(mz, ppm = 5)) %>%
  qplotMS1data(facet_col = "mz_group")

# Same as above, but assign at most two groups
msdata$MS1[mz %between% pmppm(119.0865, 100)] %>%
  arrange(desc(int)) %>%
  mutate(mz_group = mz_group(mz, ppm = 5, max_groups = 2)) %>%
  qplotMS1data(facet_col = "mz_group")

## End(Not run)

RaMS documentation built on Oct. 9, 2024, 9:06 a.m.