process_GM cleans up various aspects of raw GeneMapper-output
peak data files and converts them to a nicer data structure. It also
optionally formats them for use with the online tool T-REX.
A character vector with one element, containing the filepath (absolute or relative to working directory) of the GeneMapper peaks file.
Should files containing peaks and label data be written to disk
for uploading to TREX? If TRUE, files will be created and written to the
working directory as "TREX_peaks.txt" and "TREX_label.txt", unless an
alternative is supplied to write_path (see below). The label file will
contain a column FileName as required by TREX, and then a column for
sample_ref and plate_well as created by function
A named character vector. Elements should be names for the
targets of the TRFLP, while names must be the capitalised first letter of
the dye colour as referenced by GeneMapper. For example, for a primer pair
with an attached red fluorophore and which targets domain Archaea, the
corresponding entry would be
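
As a rough illustration of the naming convention described for this argument, the sketch below builds a small named character vector in R. The red/Archaea pairing follows the example in the text; the blue/Bacteria pairing is an added assumption, included only to show a second entry.

```r
# Names are dye-colour initials; values are the TRFLP target names.
# "R" = "Archaea" follows the description above; "B" = "Bacteria" is hypothetical.
targets <- c(R = "Archaea", B = "Bacteria")

names(targets)         # dye initials as referenced by GeneMapper
unname(targets["R"])   # target associated with the red dye
```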
An additional label file (in .csv format) containing
additional informative columns to be included in the TREX label file. Must
include one column of filenames named
A character vector containing the names of columns to be retained from the original GeneMapper peak file. Used to remove blank columns. Included for potential expansion to allele and marker data; for now, avoid passing a value to this argument.
A character vector with one element, specifying the filepath for writing the TREX label and peak files. Defaults to the working directory. Supplied alternatives can include a filename prefix, e.g. "C:/user/data/set1_", which will result in files named "set1_TREX_label.txt" and "set1_TREX_peaks.txt" in directory "C:/user/data/". If you do not use a prefix, the trailing slash must be supplied, e.g. "C:/user/data/"; supplying "C:/user/data" will result in files "dataTREX_label.txt" and "dataTREX_peaks.txt" in directory "C:/user/".
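
The prefix behaviour described for this argument is consistent with the path being pasted directly in front of the fixed file names. The sketch below assumes that is what happens; `build_trex_paths()` is a hypothetical helper written for illustration and is not part of the package.

```r
# Assumption: output paths are formed by simple concatenation of
# write_path and the fixed TREX file names.
build_trex_paths <- function(write_path = "") {
  c(label = paste0(write_path, "TREX_label.txt"),
    peaks = paste0(write_path, "TREX_peaks.txt"))
}

build_trex_paths("C:/user/data/set1_")  # prefix "set1_" inside C:/user/data/
build_trex_paths("C:/user/data/")       # no prefix, files inside C:/user/data/
build_trex_paths("C:/user/data")        # missing slash: "data" becomes a prefix
```

Under this assumption, the third call yields "C:/user/dataTREX_label.txt", which is why the trailing slash matters when no prefix is wanted.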
Peak data files exported from GeneMapper contain some empty or redundant columns (particularly for TRFLP purposes) as well as poorly formatted data. This function removes empty columns, separates peak and dye identifiers, and reformats a flat data frame into a nested list with two levels: sample (top level) and target (second level).
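
To make the flat-to-nested reshaping concrete, here is a minimal sketch in base R. It is not the package's implementation: the toy data frame, its column names (`sample_ref`, `dye`, `peak`, `size`, `height`) and the `targets` vector are all assumptions chosen for illustration.

```r
# Toy flat table; column names are illustrative, not GeneMapper's real headers.
flat <- data.frame(
  sample_ref = c("S1", "S1", "S1", "S2"),
  dye        = c("R", "R", "B", "R"),
  peak       = c(1, 2, 1, 1),
  size       = c(101.2, 250.7, 98.4, 101.5),
  height     = c(530, 1210, 760, 495),
  stringsAsFactors = FALSE
)
targets <- c(R = "Archaea", B = "Bacteria")  # hypothetical dye-to-target map

# Translate each dye initial into its target name.
flat$target <- unname(targets[flat$dye])

# Split by sample (top level), then by target (second level).
nested <- lapply(split(flat, flat$sample_ref), function(s) {
  lapply(split(s, s$target), function(p) {
    rownames(p) <- p$peak          # rows numbered by peak number
    p[, c("size", "height")]       # keep only the peak measurements
  })
})

nested$S1$Archaea   # peak data frame for sample S1, target Archaea
names(nested$S2)    # targets detected in sample S2
```

Individual peak tables can then be reached with ordinary list indexing, such as `nested[["S1"]][["Bacteria"]]`.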
process_GM() can also write a T-REX label and peak file
which can be directly uploaded to the online TRFLP analysis tool TREX without
further manual formatting.
A list with two levels of nesting: the top level is a list of samples, named
using their sample_ref as extracted from file_name by function
split_filename. The second level is a list of peak data frames named
by targets (as supplied to argument
targets) and containing peak
data, with rows numbered as the peak number extracted from the original