View source: R/clean_msdial_data.R

clean_msdial_data    R Documentation

Combine pos and neg files from MSDial and filter peaks according to user parameters

Description

Combine pos and neg files from MSDial and filter peaks according to user parameters

Usage

clean_msdial_data(
  filter_blk = TRUE,
  filter_blk_threshold = 0.8,
  filter_blk_ghost_peaks = TRUE,
  filter_mz = TRUE,
  filter_rsd = TRUE,
  filter_rsd_threshold = 30,
  filter_rmd = TRUE,
  filter_rmd_range = c(50, 3000),
  threshold_mz = 0.05,
  threshold_rt = 0.1,
  user_pos_adducts_refs = NA,
  user_neg_adducts_refs = NA,
  user_pos_neutral_refs = NA,
  user_neg_neutral_refs = NA,
  compute_pearson_correlation = FALSE,
  pearson_correlation_threshold = 0.8,
  pearson_p_value = 0.05
)
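
As an illustration, a call that overrides a few of these defaults might be prepared as in the sketch below. The namespace `mscleanr` and the presence of an already configured project directory are assumptions not stated on this page, so the call itself is left commented out; the values are arbitrary.

```r
# Arguments that differ from the defaults shown in Usage. All names come
# from the signature above; the values are arbitrary illustrations.
args <- list(
  filter_blk_threshold = 0.5,        # stricter blank/QC ratio
  filter_rsd_threshold = 20,         # stricter RSD cut-off
  filter_rmd_range = c(50, 3000),    # same as the default
  compute_pearson_correlation = TRUE
)

# Assumption: the package namespace is `mscleanr` and the project
# directory already follows the architecture described further down.
# do.call(mscleanr::clean_msdial_data, args)
```

Arguments left out of the list keep their default values.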

Arguments

filter_blk

A boolean indicating whether or not to delete rows with too much noise.

filter_blk_threshold

A numerical threshold for noise filtering: rows with a ratio mean(blank columns)/mean(QC columns) >= filter_blk_threshold are deleted. If there are no QC columns in the sample, the mean of the standard columns is used instead, or the mean of all non-blank samples if needed.
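
The ratio test can be pictured with a small base R sketch. This is not the package's implementation: the peak table, the column names and the way blank and QC columns are identified are all made up for the example.

```r
# Toy peak table: one row per peak, one column per injection (made-up data).
peaks <- data.frame(
  blank_1 = c(100, 900, 10),
  blank_2 = c(120, 700, 30),
  qc_1    = c(1000, 1000, 2000),
  qc_2    = c(1200, 1000, 2000)
)
blank_cols <- c("blank_1", "blank_2")   # hypothetical column names
qc_cols    <- c("qc_1", "qc_2")         # hypothetical column names
filter_blk_threshold <- 0.8

# Ratio mean(blank columns) / mean(QC columns) for each row.
ratio <- rowMeans(peaks[, blank_cols]) / rowMeans(peaks[, qc_cols])

# Rows at or above the threshold are treated as noise and dropped.
kept <- peaks[ratio < filter_blk_threshold, ]
```

Here the second row has a ratio of exactly 0.8 and is removed, because the comparison is inclusive.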

filter_blk_ghost_peaks

A boolean indicating whether or not to delete blank ghost peaks (only used if filter_blk is TRUE, see publication for more information).

filter_mz

A boolean indicating whether or not to delete rows with masses ending in .8 or .9 (masses not found in natural products).

filter_rsd

A boolean indicating whether or not to delete rows with too much relative standard deviation in each class.

filter_rsd_threshold

A numerical threshold for relative standard deviation filtering: rows with relative standard deviation >= filter_rsd_threshold in each class are deleted. Only used if filter_rsd is TRUE.
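
A minimal sketch of a per-class RSD filter follows. It is an illustration only: the data are made up, and reading "in each class" as "the threshold is reached in all classes" is an assumption about the wording above.

```r
# Relative standard deviation in percent.
rsd <- function(x) 100 * sd(x) / mean(x)

# Toy intensities for two peaks (rows) across six samples (made-up data).
intensities <- rbind(
  peak_1 = c(100, 102, 98, 200, 205, 195),
  peak_2 = c(100, 300, 20, 50, 400, 10)
)
classes <- c("A", "A", "A", "B", "B", "B")
filter_rsd_threshold <- 30

# RSD of every peak within every class: one row per peak, one column per class.
rsd_by_class <- t(apply(intensities, 1, function(row) tapply(row, classes, rsd)))

# Assumption: a peak is dropped only when its RSD reaches the threshold in all classes.
to_drop <- apply(rsd_by_class >= filter_rsd_threshold, 1, all)
kept <- intensities[!to_drop, , drop = FALSE]
```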

filter_rmd

A boolean indicating whether or not to delete rows with a Relative Mass Defect outside of the range provided in filter_rmd_range.

filter_rmd_range

A vector of 2 integers giving the lower and upper bounds of the acceptable Relative Mass Defect, in ppm (only used if filter_rmd is TRUE).
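
For orientation, a relative mass defect in ppm is commonly computed as the mass defect divided by the measured mass, times 1e6. The sketch below uses that definition with the default range; approximating the nominal mass by floor(mz) and treating both bounds as inclusive are assumptions, not something this page specifies.

```r
# Relative mass defect in ppm.
# Assumption: the nominal mass is approximated by floor(mz).
relative_mass_defect <- function(mz) (mz - floor(mz)) / mz * 1e6

mz <- c(301.0707, 150.005, 200.9)   # made-up m/z values
filter_rmd_range <- c(50, 3000)

rmd <- relative_mass_defect(mz)

# Assumption: both bounds of the range are inclusive.
keep <- rmd >= filter_rmd_range[1] & rmd <= filter_rmd_range[2]
```

With these values only the first m/z (about 235 ppm) falls inside the range; the other two are below and above it, respectively.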

threshold_mz

A numerical value indicating the mass tolerance in Dalton for the detection of adducts and neutral losses.

threshold_rt

A numerical value indicating the retention time tolerance.

user_neg_neutral_refs, user_pos_neutral_refs, user_pos_adducts_refs, user_neg_adducts_refs

An optional 2-column data.frame containing information about neutral losses, positive adducts or negative adducts (one column for the name and one column for the mass difference with the base compound). If no data.frame is provided, the package default list is used.
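
As a sketch of what such a reference table and the two tolerances could look like in practice, consider the example below. The column names, the helper `same_compound` and the matching rule are hypothetical; only the two-column shape (name, mass difference) and the default tolerances come from this page.

```r
# Hypothetical custom reference table: the column names are illustrative,
# only the two-column shape (name, mass difference) comes from the text above.
my_pos_adducts <- data.frame(
  adduct    = c("[M+H]+", "[M+Na]+", "[M+K]+"),
  mass_diff = c(1.007276, 22.989218, 38.963158)
)

# Two peaks are compatible with an adduct pair when they co-elute and
# point to the same neutral mass within the tolerances.
same_compound <- function(mz1, rt1, diff1, mz2, rt2, diff2,
                          threshold_mz = 0.05, threshold_rt = 0.1) {
  abs(rt1 - rt2) <= threshold_rt &&
    abs((mz1 - diff1) - (mz2 - diff2)) <= threshold_mz
}

# [M+H]+ and [M+Na]+ of a compound with a neutral mass near 180.0634.
h  <- my_pos_adducts$mass_diff[1]
na <- my_pos_adducts$mass_diff[2]
linked <- same_compound(181.0707, 5.02, h, 203.0526, 5.05, na)
```

A table of this shape would then be passed as `user_pos_adducts_refs = my_pos_adducts`; how the package reads the two columns is not described here.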

compute_pearson_correlation

A boolean indicating whether or not to compute the Pearson correlation between peaks to detect clusters.

pearson_correlation_threshold

Ignore links having a Pearson correlation < threshold (default: 0.8).

pearson_p_value

Ignore links whose Pearson correlation is not significant at this p-value level (default: 0.05).
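
The two Pearson criteria can be sketched with `cor.test` from the stats package. This is a simplified illustration on made-up intensities; the helper `keep_link` is hypothetical and the exact comparison operators used by the package are an assumption.

```r
# Made-up intensities of three peaks across eight samples.
peak_a <- c(10, 20, 30, 40, 50, 60, 70, 80)
peak_b <- c(11, 19, 33, 38, 52, 61, 69, 83)   # tracks peak_a closely
peak_c <- c(40, 10, 70, 20, 60, 30, 50, 45)   # unrelated pattern

# Keep a link only if the correlation is both strong and significant.
keep_link <- function(x, y, r_min = 0.8, p_max = 0.05) {
  test <- cor.test(x, y, method = "pearson")
  unname(test$estimate >= r_min && test$p.value < p_max)
}

link_ab <- keep_link(peak_a, peak_b)
link_ac <- keep_link(peak_a, peak_c)
```

The pair (peak_a, peak_b) passes both criteria, while (peak_a, peak_c) fails on the correlation threshold.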

Architecture needed by clean_msdial_data in the project directory

  • pos

    Normalized-<...>.txt: Positive peaks info exported from MSDial

    peaks: Positive peaks files exported from MSDial

  • neg

    Normalized-<...>.txt: Negative peaks info exported from MSDial

    peaks: Negative peaks files exported from MSDial
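
Before running the function, the layout above can be checked with a few lines of base R. The helper name `has_msdial_layout` is hypothetical, and accepting one or more Normalized-*.txt files per polarity is an assumption.

```r
# Returns TRUE when `project_dir` contains, for both polarities, a
# Normalized-*.txt export and a peaks folder.
has_msdial_layout <- function(project_dir) {
  all(vapply(c("pos", "neg"), function(mode) {
    mode_dir <- file.path(project_dir, mode)
    normalized <- list.files(mode_dir, pattern = "^Normalized-.*\\.txt$")
    length(normalized) >= 1 && dir.exists(file.path(mode_dir, "peaks"))
  }, logical(1)))
}
```

Calling `has_msdial_layout("path/to/project")` returns FALSE as soon as one of the four expected items is missing.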

Output files of clean_msdial_data in the project directory

  • pos

    filtered_peaks: Positive peaks files remaining after cleaning, copied from the folder pos/peaks

  • neg

    filtered_peaks: Negative peaks files remaining after cleaning, copied from the folder neg/peaks

  • intermediary_data

    samples.csv: Information about samples

    adducts.graphml: Graph of adducts relations between peaks

    clusters.csv: MSDial data with clusters information

    clusters.graphml: Graph containing clusters and MSDial data

    MS_peaks-final.csv: Peaks remaining after all filters have been applied

    links-*.csv: Files containing information about links between peaks
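
After a run, the files listed above can be located with base R. The helper `list_outputs` below is a hypothetical convenience, not part of the package; it only reports which documented files exist.

```r
# Lists which of the documented output files are present under `project_dir`.
list_outputs <- function(project_dir) {
  out_dir <- file.path(project_dir, "intermediary_data")
  expected <- c("samples.csv", "adducts.graphml", "clusters.csv",
                "clusters.graphml", "MS_peaks-final.csv")
  list(
    present = expected[file.exists(file.path(out_dir, expected))],
    links   = basename(Sys.glob(file.path(out_dir, "links-*.csv")))
  )
}
```

The final peak table can then be loaded with `read.csv(file.path(project_dir, "intermediary_data", "MS_peaks-final.csv"))`; its column layout is not described on this page.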


eMetaboHUB/MS-CleanR documentation built on Jan. 3, 2024, 8:55 p.m.