| cdataMake | R Documentation |
Builds one large master dataset given the directory where a dataset collection lives
cdataMake(
datadir = NULL,
files = NULL,
keyname = "ID",
filterfun = NULL,
namespacefun = defaultIndex
)
datadir |
Directory hosting the collection of datasets. If given, all dataset files in the directory are used. To read only selected files, use the 'files' parameter instead. |
files |
A vector of dataset file paths to read. This allows specifying a subset of files that are possibly spread throughout different directories. Must be given if 'datadir' is not given, and ignored if 'datadir' is given. |
keyname |
Tables are merged using this key column. |
filterfun |
Optional, a filter function that returns selected columns within a file to be included in the final master dataset, such as to include only numeric columns. |
namespacefun |
Optional, a function to make unique namespaces. If not given, defaults to namespacing using filenames. See details. |
This compiles the cdata data object from a collection of datasets.
Each dataset is a uniquely named .csv|.tsv|.txt file within the specified directory.
The files are read and merged together into one master data.table.
Because column IDs must be unique in the table, namespaced IDs are created using the parent file name.
A function can be passed to namespacefun for some control over this namespacing approach.
For instance, instead of using the full file name, one might map it to a shorter key,
a pre-existing UUID, or another external key (as long as unique IDs can still be ensured).
For example, a data feature "Var1" from file "PMID123456_Doe-2000.txt" could become the column "Doe00_Var1"
in the master data table.
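As a rough illustration of the file-name mapping described above, the following sketch derives a short prefix such as "Doe00" from a file name. The helper name shortIndex is hypothetical, and the assumption that namespacefun receives a single file name and returns a prefix string is not confirmed by this page; check defaultIndex in the package source for the actual contract.

```r
# Hypothetical namespace function: maps "PMID123456_Doe-2000.txt" to "Doe00".
# ASSUMPTION: the function is called with one file name and returns one prefix.
shortIndex <- function(filename) {
  # Drop the directory and the extension: "PMID123456_Doe-2000"
  base <- tools::file_path_sans_ext(basename(filename))
  # Capture the author name and four-digit year at the end of the name
  parts <- regmatches(base, regexec("_([A-Za-z]+)-([0-9]{4})$", base))[[1]]
  # Fall back to the full base name when the pattern does not match
  if (length(parts) != 3) return(base)
  # Author plus two-digit year
  paste0(parts[2], substr(parts[3], 3, 4))
}

prefix <- shortIndex("PMID123456_Doe-2000.txt")
colname <- paste(prefix, "Var1", sep = "_")
print(colname)
```

Any such mapping has to stay unique across the whole collection; two files that collapse to the same prefix would produce clashing column names.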
A "master" data.table
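To give a sense of the shape of such a merged table, here is a minimal sketch in base R using plain data frames rather than data.table. It is not the package's implementation: the toy tables, the addNamespace helper, the prefixes, and the choice of a full outer join are all assumptions made for illustration.

```r
# Two toy tables standing in for two dataset files (made-up values).
doe   <- data.frame(ID = c("a", "b", "c"), Var1 = c(1, 2, 3))
smith <- data.frame(ID = c("b", "c", "d"), Var1 = c(10, 20, 30))

# Hypothetical helper: prefix every non-key column with a per-file namespace.
addNamespace <- function(tbl, prefix, keyname = "ID") {
  cols <- setdiff(names(tbl), keyname)
  names(tbl)[match(cols, names(tbl))] <- paste(prefix, cols, sep = "_")
  tbl
}

tables <- list(addNamespace(doe, "Doe00"), addNamespace(smith, "Smith05"))

# ASSUMPTION: full outer join on the key, so IDs absent from a file get NA.
master <- Reduce(function(x, y) merge(x, y, by = "ID", all = TRUE), tables)
print(names(master))
```

Without the prefixes, both files would contribute a column called "Var1", which is why the namespacing step comes before the merge.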