cdataMake | R Documentation |
Builds one large master dataset given the directory where a dataset collection lives
cdataMake(
datadir = NULL,
files = NULL,
keyname = "ID",
filterfun = NULL,
namespacefun = defaultIndex
)
datadir |
Directory hosting the collection of datasets. If given, all files in the directory are used. To read only selected files, use the 'files' parameter instead. |
files |
A vector of dataset file paths to read. This allows specifying a subset of files that are possibly spread throughout different directories. Must be given if 'datadir' is not given, and ignored if 'datadir' is given. |
keyname |
Tables are merged using this key column. |
filterfun |
Optional, a filter function that returns selected columns within a file to be included in the final master dataset, such as to include only numeric columns. |
namespacefun |
Optional, a function to make unique namespaces. If not given, defaults to namespacing using filenames. See details. |
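To illustrate the kind of function 'filterfun' describes, here is a minimal sketch that keeps only numeric columns. The exact signature that cdataMake expects is not stated here, so this assumes the filter receives one dataset as a data.frame-like object and returns the names of the columns to keep. The name keepNumeric and its keyname argument are hypothetical.

```r
# Hypothetical filter: return the names of numeric columns, always retaining the key.
# Assumption: the filter is handed one dataset as a data.frame-like object.
keepNumeric <- function(dt, keyname = "ID") {
  is_num <- vapply(dt, is.numeric, logical(1))
  union(keyname, names(dt)[is_num])
}

example <- data.frame(ID = c("a", "b"), Age = c(31, 45), Sex = c("F", "M"),
                      stringsAsFactors = FALSE)
selected <- keepNumeric(example)
```

Here the character column "Sex" is dropped, while the key column is kept even though it is not numeric, since it is still needed for merging.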
This compiles the cdata data object from a collection of datasets.
Each dataset is a uniquely named .csv|.tsv|.txt file within the specified directory.
The files are read and merged together into one master data.table.
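The read-and-merge step can be pictured with a small base-R sketch. This is not the package's actual code: it uses base merge() rather than data.table, and it assumes a full outer join on the key column, which is not confirmed here. The file names and values are made up for the demonstration.

```r
# Base-R sketch of reading a directory of files and merging them on a key.
# Assumption: each file is delimited text sharing the key column "ID".
tmp <- tempfile("cdata_demo_")
dir.create(tmp)
write.csv(data.frame(ID = c("s1", "s2"), Var1 = c(1.5, 2.5)),
          file.path(tmp, "studyA.csv"), row.names = FALSE)
write.csv(data.frame(ID = c("s2", "s3"), Var2 = c(10, 20)),
          file.path(tmp, "studyB.csv"), row.names = FALSE)

files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(files, read.csv, stringsAsFactors = FALSE)

# Full outer join on the key, so IDs present in only one file are kept (with NAs).
master <- Reduce(function(x, y) merge(x, y, by = "ID", all = TRUE), tables)
```

With an outer join, an ID that appears in only one file still gets a row in the result, and the columns from the other files are filled with NA.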
Because column IDs must be unique in the table, namespaced IDs are created using the parent file name.
A function can be passed to namespacefun
for some control over this namespacing approach.
For instance, instead of using the full file name, one might need to map it to a shorter key,
a pre-existing UUID, or another external key (as long as unique IDs can still be ensured),
e.g. a data feature "Var1" from file "PMID123456_Doe-2000.txt" becomes the column "Doe00_Var1"
in the master data table.
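The mapping in this example could be written roughly as follows. This is a sketch under assumptions: the inputs and return value that cdataMake expects from a namespace function are not specified here, so the function below simply takes a file name and returns a short prefix. The name shortIndex and the assumed file-name pattern are hypothetical.

```r
# Hypothetical namespace function: "PMID123456_Doe-2000.txt" -> "Doe00".
# Assumption: file names follow the pattern <prefix>_<Author>-<YYYY>.<ext>.
shortIndex <- function(filename) {
  base <- tools::file_path_sans_ext(basename(filename))
  author <- sub("^[^_]*_([^-]+)-.*$", "\\1", base)       # text between "_" and "-"
  year <- sub("^.*-[0-9]{2}([0-9]{2})$", "\\1", base)    # last two digits of the year
  paste0(author, year)
}

prefix <- shortIndex("PMID123456_Doe-2000.txt")
colname <- paste(prefix, "Var1", sep = "_")
```

A shortened key like this can collide (two "Doe" papers from 2000, for example), so it is worth checking the generated prefixes with anyDuplicated() before relying on them.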
A "master" data.table