match_features: Extract feature intensity values from unidentified feature...

match_features    R Documentation

Extract feature intensity values from unidentified feature data

Description

Matches unidentified feature intensities with identified feature clusters. The mass and retention time of each unidentified feature are compared to the centroid of each identified cluster. If the unidentified feature falls within the specified mass and retention time thresholds, it is added to the cluster.
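The comparison described above can be sketched in base R. This is only an illustration of centroid matching with a mass window and a retention time window, not the package's implementation: the tables, their values, the cluster column names (cluster, mass, rt), the Intensity column, and the tolerances are all hypothetical, and expressing the mass difference in ppm is an assumption. Only RecalMass and RTalign are names taken from this page.

```r
# Hypothetical centroids of identified feature clusters.
clusters <- data.frame(
  cluster = c(1, 2),
  mass    = c(10000.00, 15000.00),  # centroid mass
  rt      = c(1200, 2400)           # centroid retention time in seconds
)

# Hypothetical unidentified features (RecalMass and RTalign as named above).
features <- data.frame(
  id        = 1:3,
  RecalMass = c(10000.02, 15000.50, 10000.01),
  RTalign   = c(1210, 2405, 1900),
  Intensity = c(5e5, 2e5, 9e4)
)

# Assumed tolerances: mass in ppm, retention time in seconds.
mass_tol_ppm <- 5
rt_tol       <- 60

# Pair every feature with every cluster (cross join), then keep only the
# pairs that fall inside both the mass window and the retention time window.
pairs    <- merge(features, clusters, by = NULL)
ppm_diff <- abs(pairs$RecalMass - pairs$mass) / pairs$mass * 1e6
rt_diff  <- abs(pairs$RTalign - pairs$rt)
matched  <- pairs[ppm_diff <= mass_tol_ppm & rt_diff <= rt_tol,
                  c("id", "cluster", "Intensity")]
```

In this toy data only feature 1 is within both windows of a cluster, so it alone is assigned (to cluster 1). Feature 2 misses on mass and feature 3 misses on retention time.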

Usage

match_features(ms2, ms1, errors, n_mme_sd, n_rt_sd, summary_fn = "max")

Arguments

ms2

A data.table containing the identified feature data.

ms1

A data.table containing the unidentified feature data. These data are created from the TopPIC output files ending with "ms1.feature" and must contain the RTalign and RecalMass variables, which are added to the unidentified feature data by the align_rt and recalibrate_mass functions.

errors

A list output from the calc_error function. The first element of the list contains the standard deviation and median of the mass measurement error for each data set. The second element is the standard deviation of the mass measurement error across all data sets. The third element is the standard deviation of the retention time in seconds across all data sets.

n_mme_sd

A numeric value indicating the number of standard deviations to use when creating a cutoff in the mass dimension. This threshold is used for determining whether an unidentified feature is close enough in mass to be considered part of an identified feature cluster. The mean RecalMass of all points in the cluster is used for the comparison.

n_rt_sd

A numeric value representing the number of standard deviations to use when creating a retention time cutoff. This value is the threshold used to determine if an unidentified feature is close enough in retention time to be considered part of an identified feature cluster. The mean retention time of all points in the cluster is used for the comparison.

summary_fn

A character string specifying the function used to summarize the feature intensity. The function must accept an na.rm argument. Examples of allowed functions include max, sum, and median.
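A rough sketch of how errors, n_mme_sd, n_rt_sd, and summary_fn might be interpreted follows. It assumes that each cutoff is the number of standard deviations multiplied by the corresponding pooled standard deviation, and that the summary function is looked up by name and called with na.rm = TRUE. This page does not confirm either detail. The mock errors list, its column names, the ppm unit, and the matched table are invented for illustration.

```r
# Mock of the list described for `errors`; every value here is made up.
errors <- list(
  data.frame(Dataset    = c("A", "B"),
             mme_sd     = c(1.8, 2.2),
             mme_median = c(0.3, -0.1)),
  2.0,  # SD of mass measurement error across all data sets (assumed ppm)
  15    # SD of retention time across all data sets, in seconds
)
n_mme_sd <- 3
n_rt_sd  <- 2

# Assumed interpretation: cutoff = number of SDs x pooled SD.
mass_cutoff <- n_mme_sd * errors[[2]]
rt_cutoff   <- n_rt_sd * errors[[3]]

# Resolve the summary function from its name.
summary_fn <- "max"
fn <- match.fun(summary_fn)

# Hypothetical matched features; one intensity is missing.
matched <- data.frame(
  cluster   = c(1, 1, 2),
  Intensity = c(5e5, NA, 2e5)
)

# Summarize intensity per cluster, passing na.rm through to the function.
summarized <- tapply(matched$Intensity, matched$cluster, fn, na.rm = TRUE)
```

Passing na.rm = TRUE is why the chosen function has to support that argument: without it, a single missing intensity would turn the summary for the whole cluster into NA.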

Value

A data.table containing all unidentified features that fall within the thresholds of an identified feature gene/cluster combination.

Author(s)

Evan A Martin


evanamartin/TopPICR documentation built on Dec. 9, 2022, 8:05 p.m.