View source: R/clean_dataset.R
A function for scrubbing a dataset for use with most standard algorithms. This involves one-hot-encoding columns that are probably categorical.
clean.dataset(dataset, clean.invalid = TRUE, clean.ohe = FALSE)
dataset
    a list with at least the following key-worded elements:
clean.invalid
    whether to remove samples with invalid entries. Defaults to TRUE.
clean.ohe
    options for whether to one-hot-encode columns. Defaults to FALSE.
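To make the one-hot-encoding option more concrete, here is a minimal base R sketch of encoding a single categorical column as indicator columns. The helper name one_hot and the example vector are hypothetical and chosen only for illustration; this is not the package's own implementation, and how clean.dataset decides which columns are "probably categorical" is not shown here.

```r
# Hypothetical helper, for illustration only (not part of the documented package).
one_hot <- function(x) {
  f <- factor(x)
  # Compare every sample against every level: an n x k logical matrix,
  # converted to 0/1 by multiplying by 1.
  enc <- outer(as.character(f), levels(f), "==") * 1
  colnames(enc) <- levels(f)
  enc
}

color <- c("red", "blue", "red", "green")
enc <- one_hot(color)
print(enc)
```

Each row of enc contains exactly one 1, marking the level of that sample. Replacing one categorical column with k indicator columns in this way increases the column count of the array by k - 1.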
A list containing at least the following key-worded elements:
X [m, d+r] the array with m samples in d+r dimensions, where r is the number of additional columns appended for encodings. m < n when there are non-finite or NaN entries. colnames(dataset) returns the column names of the cleaned columns.
Y [m, r] matrix or [n] vector containing regressors or class labels for samples in X. m < n when there are non-finite or NaN entries.
samples [m] the sample ids that are included in the final array, where samp[i] is the original row id corresponding to Xc[i,]. If m < n, there were non-finite or NaN entries that were purged.
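The relationship between the cleaned array, the labels, and the retained sample ids can be sketched in base R as follows. The helper drop_invalid_rows and the toy data are hypothetical; this only illustrates the idea of purging rows with non-finite or NaN entries while recording the original row ids, and should not be read as the function's actual code.

```r
# Hypothetical helper, for illustration only (not part of the documented package).
drop_invalid_rows <- function(X, Y) {
  # is.finite() is FALSE for NA, NaN, Inf and -Inf, so a row is kept
  # only when none of its entries is non-finite.
  keep <- rowSums(!is.finite(X)) == 0
  list(
    X = X[keep, , drop = FALSE],  # m retained samples
    Y = Y[keep],                  # labels for the retained samples
    samp = which(keep)            # original row ids of the retained samples
  )
}

# n = 4 samples in 2 dimensions; rows 2 and 3 contain Inf and NaN.
X <- matrix(c(1, 2, NaN, 4,
              5, Inf, 7, 8), ncol = 2)
Y <- c("a", "b", "a", "b")
res <- drop_invalid_rows(X, Y)
print(res$samp)
```

Here m = 2 < n = 4, and res$samp[i] gives the row of the original X that became res$X[i, ].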
Author(s): Eric Bridgeford