linear_transformation: Full Alignment of Peak Lists by linear retention time...


linear_transformation    R Documentation

Full Alignment of Peak Lists by linear retention time correction.

Description

Shifts all peaks within samples to maximise the similarity to a reference sample. For optimal results, a sufficient number of shared peaks is required to find an optimal solution. A reference needs to be specified, for instance using choose_optimal_reference. Linear shifts are evaluated within a user-defined window in discrete steps. The highest similarity score defines the shift that will be applied. If more than one shift step yields the same similarity score, the shift with the smallest absolute value wins in order to avoid overcompensation. The function is invoked internally by align_chromatograms.
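The tie-breaking rule described above can be illustrated with a minimal sketch. The helper pick_shift and the example scores are hypothetical and not part of GCalignR; the package's internal code may differ.

```r
# Hypothetical helper (not a GCalignR function): choose the shift with the
# highest similarity score; among tied shifts, prefer the smallest absolute value.
pick_shift <- function(shifts, similarity) {
  tied <- shifts[similarity == max(similarity)]
  # If two tied shifts have the same absolute value, which.min() returns the
  # first one. How the package treats that case is not stated in this page.
  tied[which.min(abs(tied))]
}

shifts     <- c(-0.02, -0.01, 0, 0.01, 0.02)
similarity <- c(3, 4, 4, 5, 5)   # made-up scores; 0.01 and 0.02 are tied
chosen <- pick_shift(shifts, similarity)
chosen   # 0.01
```

Preferring the smaller of two equally good shifts keeps the correction conservative, which matches the stated aim of avoiding overcompensation.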

Usage

linear_transformation(
  gc_peak_list,
  reference,
  max_linear_shift = 0.05,
  step_size = 0.01,
  rt_col_name,
  Logbook = NULL
)

Arguments

gc_peak_list

List of data.frames. Each data.frame contains GC-data (e.g. retention time, peak area, peak height) of one sample. Variables are stored in columns. Rows represent distinct peaks. Retention time is a required variable.

reference

A character giving the name of a sample included in the dataset. All samples are aligned to the reference.

max_linear_shift

Numeric value giving the window size considered in the full alignment. The amplitude of linear drift is usually small in GC-FID datasets, so the default value of 0.05 minutes is adequate for most datasets. Increase this value if the drift amplitude is larger.

step_size

Numeric value giving the step size in which linear shifts are evaluated between -max_linear_shift and max_linear_shift. The default is 0.01.

rt_col_name

A character giving the name of the column containing the retention times. The decimal separator needs to be a point.

Logbook

A list. If present, a summary of the applied linear shifts in full alignments of peak lists is appended to the list. If not specified, a list will be created automatically.
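As an illustration of how max_linear_shift and step_size interact, the candidate shifts for the default settings could be enumerated as follows. This is a sketch under the assumption that the grid is a regular sequence; the package may construct it differently internally.

```r
# Illustrative only: enumerate the shifts implied by the default arguments.
max_linear_shift <- 0.05
step_size <- 0.01

candidate_shifts <- seq(from = -max_linear_shift,
                        to = max_linear_shift,
                        by = step_size)

# With the defaults, 11 candidate shifts are evaluated per sample.
length(candidate_shifts)

# Round before printing to hide floating-point noise around zero.
print(round(candidate_shifts, 2))
```

A larger window or a smaller step size increases the number of shifts that have to be scored for every sample.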

Details

A similarity score is calculated as the sum of deviations in retention times between each reference peak and its closest peak in the sample, so a smaller sum indicates a higher similarity. The underlying idea is that the appropriate linear transformation will reduce the deviation in retention time between homologous peaks, whereas all other peaks should deviate randomly. Among all considered shifts, the one with the minimum deviation score is selected for the subsequent full alignment, in which all peaks of the sample are shifted by the same value.
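The scoring described above can be sketched in a few lines of R. The function deviation_score and the retention times are hypothetical and only follow the verbal description; they are not taken from the package source, and tie handling is omitted here.

```r
# Hypothetical helper: sum, over all reference peaks, of the distance to the
# closest sample peak after adding `shift` to every sample retention time.
deviation_score <- function(ref_rt, sample_rt, shift) {
  shifted <- sample_rt + shift
  sum(vapply(ref_rt, function(r) min(abs(shifted - r)), numeric(1)))
}

# Made-up retention times (minutes): the sample drifts by +0.03 and
# contains one peak (11.00) that is absent from the reference.
ref_rt    <- c(5.00, 7.50, 10.20, 12.80)
sample_rt <- c(5.03, 7.53, 10.23, 11.00, 12.83)

shifts <- seq(-0.05, 0.05, by = 0.01)
scores <- vapply(shifts,
                 function(s) deviation_score(ref_rt, sample_rt, s),
                 numeric(1))

# The shift with the smallest deviation score is applied to the whole sample.
best_shift <- shifts[which.min(scores)]
aligned_rt <- sample_rt + best_shift
round(best_shift, 2)   # -0.03
```

In this toy case the unshared peak at 11.00 is never the closest peak to any reference peak, so it does not contribute to the score, and the drift of the shared peaks is recovered.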

Value

A list containing two items.

chroma_aligned

List containing the transformed data

Logbook

Logbook, record of the applied shifts

Author(s)

Martin Stoffel (martin.adam.stoffel@gmail.com) & Meinolf Ottensmann (meinolf.ottensmann@web.de)

Examples

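# Use the first ten samples of the example dataset
dat <- peak_data[1:10]
# Keep the first 50 peaks of each sample
dat <- lapply(dat, function(x) x[1:50, ])
# Align all samples to the reference sample "C2"
x <- linear_transformation(gc_peak_list = dat, reference = "C2", rt_col_name = "time")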


GCalignR documentation built on Feb. 16, 2023, 5:23 p.m.