View source: R/clean_dataset.R
A function for scrubbing a dataset for use with most standard algorithms. This involves removing samples with invalid entries and one-hot-encoding columns that are probably categorical.
clean.dataset(dataset, clean.invalid = TRUE, clean.ohe = FALSE)
dataset | a list with at least the following key-worded elements:
clean.invalid | whether to remove samples with invalid entries. Defaults to TRUE.
clean.ohe | options for whether to one-hot-encode columns. Defaults to FALSE.
A list containing at least the following key-worded elements:
X [m, d+r] the array with m samples in d+r dimensions, where r is the number of additional columns appended for encodings. m < n (the original number of samples) when there are non-finite or NaN entries. colnames(dataset) returns the column names of the cleaned columns.
Y [m, r] matrix or [n] vector containing regressors or class labels for samples in X. m < n when there are non-finite or NaN entries.
samplesm the sample ids that are included in the final array, where samp[i] is the original row id corresponding to Xc[i,]. If m < n, there were non-finite or NaN entries that were purged.
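The two cleaning steps described above (dropping samples with non-finite entries while tracking their original row ids, then one-hot-encoding categorical columns) can be sketched in base R. This is only an illustration of the general technique, not the package's implementation: the toy data and the names X.valid, Y.valid, ohe, and X.clean are hypothetical, and samp is borrowed from the description above.

```r
# Illustrative sketch only; this is NOT the code inside clean.dataset().
# Toy inputs (hypothetical): one numeric feature, one categorical feature, and labels.
X <- data.frame(
  height = c(1.2, NA, 3.4, Inf, 2.2),
  color  = factor(c("red", "blue", "red", "blue", "green"))
)
Y <- c(0, 1, 0, 1, 1)

# Step 1: flag rows whose numeric entries are all finite.
# is.finite() is FALSE for NA, NaN, Inf, and -Inf.
numeric.cols <- vapply(X, is.numeric, logical(1))
valid <- rowSums(!is.finite(as.matrix(X[, numeric.cols, drop = FALSE]))) == 0

# Step 2: keep the valid rows and remember their original row ids,
# so that samp[i] is the original row of the i-th retained sample.
samp <- which(valid)
X.valid <- X[samp, , drop = FALSE]
Y.valid <- Y[samp]

# Step 3: one-hot-encode the categorical column, one indicator column per level.
lv <- levels(X.valid$color)
ohe <- vapply(lv, function(l) as.numeric(X.valid$color == l),
              numeric(nrow(X.valid)))
colnames(ohe) <- paste0("color.", lv)

# Assemble the cleaned numeric array: original numeric column plus encodings.
X.clean <- cbind(height = X.valid$height, ohe)
```

Here the number of retained samples is smaller than the number of input samples because two rows contain a non-finite value, and samp records which original rows survived.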
Eric Bridgeford