prepare_dataset | R Documentation |
Find files matching a pattern in a given directory, and build a data frame of
standard sample attributes from fields in the filenames. Nested directory
structures are supported. Alternatively, use load_dataset to load a
spreadsheet of sample attributes explicitly. load_dataset can be used for
cases where more than one locus is to be analyzed from a single sequencer
sample (i.e., multiplexed samples), though the locusmap argument here can
allow automatic matching of locus names for multiplexed samples. If the
directory path given does not exist or if no matching files are found, an
error is thrown.
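The filename parsing described above can be pictured with a minimal base R sketch. It is an illustration only, not the package's implementation: the helper parse_filenames, the example filenames, the regular expression, and the field_pos argument are all hypothetical. In particular, field_pos here gives the capture-group position of Replicate, Sample, and Locus, which may not match how prepare_dataset interprets its ord argument.

```r
# Hypothetical helper (not part of the package): extract three fields
# from filenames using a regular expression with exactly three groups.
parse_filenames <- function(filenames, pattern, field_pos = c(1, 2, 3)) {
  # regexec + regmatches gives, per filename, the full match followed by
  # the captured groups, or character(0) when the pattern does not match.
  matches <- regmatches(filenames, regexec(pattern, filenames))
  ok <- lengths(matches) == 4
  if (!any(ok)) stop("no matching files found")
  # Drop the full match and stack the three groups into a matrix.
  fields <- do.call(rbind, lapply(matches[ok], function(m) m[-1]))
  data.frame(
    Filename  = filenames[ok],
    Replicate = as.integer(fields[, field_pos[1]]),
    Sample    = fields[, field_pos[2]],
    Locus     = fields[, field_pos[3]],
    stringsAsFactors = FALSE
  )
}

# Made-up filenames laid out as Replicate-Sample-Locus.
example_files <- c("1-sampleA-locus1.fastq",
                   "2-sampleA-locus1.fastq",
                   "1-sampleB-locus2.fastq")
example_pattern <- "^([0-9]+)-([A-Za-z0-9]+)-([A-Za-z0-9]+)\\.fastq$"
parsed <- parse_filenames(example_files, example_pattern)
```

In this sketch, filenames that do not match the pattern are silently skipped, and an error is raised only when nothing matches at all.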
prepare_dataset(
dp = cfg("prep_dataset_path"),
pattern = cfg("prep_dataset_pattern"),
ord = cfg("prep_dataset_order"),
autorep = cfg("prep_dataset_autorep"),
locusmap = NULL
)
dp |
directory path to search for matching data files. |
pattern |
regular expression to use for parsing filenames. There should be exactly three groups in the pattern, for Replicate, Sample, and Locus. |
ord |
integer vector giving order of the fields Replicate, Sample, and
Locus in filenames. For example, if Locus is the first field followed by
Replicate and Sample, set |
autorep |
logical allowing for automatic handling of any duplicates found, labeling them as replicates. FALSE by default. |
locusmap |
list of character vectors, each list item name being the
locus text given in the filenames, and each vector being a set of separate
locus names. Each entry with a locus name text matching one of these list
items will be replaced in the final output with several separate entries,
one for each locus name in the corresponding vector. (For example,
|
data frame of metadata for all files found
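The row expansion that the locusmap argument describes can be sketched in base R as follows. This is an assumption-laden illustration rather than the package's own code: the helper expand_loci, the column names, and the example locus names are hypothetical.

```r
# Hypothetical helper (not part of the package): replace each row whose
# Locus text is a name in locusmap with one row per mapped locus name.
expand_loci <- function(data, locusmap) {
  rows <- lapply(seq_len(nrow(data)), function(i) {
    entry <- data[i, , drop = FALSE]
    loci <- locusmap[[entry$Locus]]
    # Locus text not listed in locusmap: keep the row unchanged.
    if (is.null(loci)) return(entry)
    # Repeat the row once per mapped locus and overwrite the Locus column.
    expanded <- entry[rep(1, length(loci)), , drop = FALSE]
    expanded$Locus <- loci
    expanded
  })
  result <- do.call(rbind, rows)
  rownames(result) <- NULL
  result
}

# Made-up metadata: one multiplexed entry ("ABC") and one single-locus entry.
samples <- data.frame(Replicate = c(1L, 1L),
                      Sample = c("s1", "s2"),
                      Locus = c("ABC", "D"),
                      stringsAsFactors = FALSE)
example_map <- list(ABC = c("A", "B", "C"))
expanded <- expand_loci(samples, example_map)
```

With these inputs the single "ABC" row becomes three rows (loci A, B, and C) that share the same Replicate and Sample values, while the "D" row passes through unchanged.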