stschn/deepANN: Neural Network Toolbox

Machine and Deep Learning Toolbox

This R library is currently organized into the following functional families:

Dummy

dummify() creates dummy variables for non-metric variables.
dummify_multilabel() creates dummy variables for multi-label character variables.
effectcoding() changes a binary 0/1 encoded variable into a -1/1 encoded variable.
sparse_encode() builds a simple numeric encoded array.
one_hot_encode() builds a one-hot vector in form of a matrix.
one_hot_decode() restores a one-hot encoded matrix to a single vector.
append_rows() appends dummy rows.
remove_columns() removes columns with only one specific value.
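
The idea behind one-hot encoding, its inverse, and effect coding can be sketched in base R. This is only an illustration under assumptions: the function names below (sketch_one_hot_encode, sketch_one_hot_decode, sketch_effectcoding) are hypothetical and are not the deepANN functions, whose arguments and return types may differ.

```r
# Hypothetical helpers for illustration only; not the deepANN implementation.
sketch_one_hot_encode <- function(x) {
  f <- as.factor(x)
  # One row per observation, one column per level, exactly one 1 per row.
  m <- matrix(0L, nrow = length(f), ncol = nlevels(f),
              dimnames = list(NULL, levels(f)))
  m[cbind(seq_along(f), as.integer(f))] <- 1L
  m
}

sketch_one_hot_decode <- function(m) {
  # The column holding the 1 in each row identifies the original level.
  colnames(m)[apply(m, 1, which.max)]
}

sketch_effectcoding <- function(x) {
  # Maps a 0/1 coded variable to -1/1.
  ifelse(x == 0, -1, 1)
}

colours <- c("red", "green", "red", "blue")
encoded <- sketch_one_hot_encode(colours)
decoded <- sketch_one_hot_decode(encoded)
effect  <- sketch_effectcoding(c(0, 1, 1, 0))
```

Decoding the encoded matrix returns the original character vector, which is a quick sanity check for any encode/decode pair.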

Encoder

class Encoder is the base class for all encoding classes.
class LabelEncoder encodes labels as integers between 0 and n_labels-1.
class OneHotEncoder encodes labels as a one-hot numeric array.
class LabelBinarizer encodes a categorical variable as a label indicator matrix.
class MultiLabelBinarizer encodes multi-labels as a multiple binary numeric matrix.
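
As a rough sketch of what label encoding and multi-label binarization produce, the following base R snippet builds both results by hand. It does not use the R6 classes listed above, and the variable names are illustrative assumptions.

```r
# Label encoding: map each label to an integer in 0 .. n_labels - 1.
labels  <- c("cat", "dog", "bird", "dog")
classes <- sort(unique(labels))            # "bird" "cat" "dog"
codes   <- match(labels, classes) - 1L     # zero-based codes
decoded <- classes[codes + 1L]             # inverse mapping

# Multi-label binarization: one row per sample, one column per known label.
multi      <- list(c("a", "b"), "c", c("a", "c"))
all_labels <- sort(unique(unlist(multi)))
indicator  <- t(vapply(multi,
                       function(s) as.integer(all_labels %in% s),
                       integer(length(all_labels))))
colnames(indicator) <- all_labels
```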

Sampler

The classes based on the R6 class Sampler offer various ways to balance data sets.

class OverSampler.
oversampling() a class wrapper function.
class RandomOverSampler.
random_oversampling() a class wrapper function.
class SMOTE.
smote() a class wrapper function.
class UnderSampler.
undersampling() a class wrapper function.
class RandomUnderSampler.
random_undersampling() a class wrapper function.
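
Random oversampling, in its simplest form, duplicates randomly chosen rows of the minority classes until every class is as frequent as the largest one. The base R sketch below illustrates that idea; sketch_random_oversample is a hypothetical name, and the package's RandomOverSampler may work differently in detail.

```r
# Hypothetical helper for illustration only; not the deepANN implementation.
sketch_random_oversample <- function(data, target) {
  counts <- table(data[[target]])
  n_max  <- max(counts)
  parts <- lapply(names(counts), function(cls) {
    idx   <- which(data[[target]] == cls)
    extra <- n_max - length(idx)
    # Draw the missing rows with replacement from the rows of this class.
    add <- idx[sample.int(length(idx), extra, replace = TRUE)]
    data[c(idx, add), , drop = FALSE]
  })
  out <- do.call(rbind, parts)
  rownames(out) <- NULL
  out
}

set.seed(42)
df <- data.frame(x = 1:10, y = c(rep("a", 8), rep("b", 2)))
balanced <- sketch_random_oversample(df, "y")
```

After the call, both classes in balanced$y occur eight times.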

Selection

data_split() splits data into train and test subsets.
train_test_split() splits a sequence of data into random train and test subsets.
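
A random train/test split can be sketched in a few lines of base R. The function name and its arguments (test_size, seed) are assumptions made for this example and are not taken from the package.

```r
# Hypothetical helper for illustration only; not the deepANN implementation.
sketch_train_test_split <- function(data, test_size = 0.25, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n        <- nrow(data)
  n_test   <- round(n * test_size)
  test_idx <- sample.int(n, n_test)
  # setdiff() stays correct even when no test rows are drawn.
  train_idx <- setdiff(seq_len(n), test_idx)
  list(train = data[train_idx, , drop = FALSE],
       test  = data[test_idx, , drop = FALSE])
}

df    <- data.frame(id = 1:20, value = (1:20)^2)
parts <- sketch_train_test_split(df, test_size = 0.25, seed = 1)
```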

Outlier

outlier() detects and optionally replaces outliers using one of three methods: quartiles (Tukey, 1977), mean (maximum likelihood estimation), or median (scaled median absolute deviation).
outlier_dataset() replaces outliers within a data set.
winsorize() winsorizes a numeric vector.
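
The quartile method and winsorizing can be illustrated with base R. The multiplier k = 1.5 is the conventional choice for Tukey's fences and the 5%/95% limits are a common winsorizing default; both are assumptions here, as are the function names.

```r
# Hypothetical helpers for illustration only; not the deepANN implementation.
sketch_tukey_outliers <- function(x, k = 1.5) {
  q   <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  # Flag values outside [Q1 - k * IQR, Q3 + k * IQR].
  x < q[1] - k * iqr | x > q[2] + k * iqr
}

sketch_winsorize <- function(x, probs = c(0.05, 0.95)) {
  lim <- quantile(x, probs, na.rm = TRUE, names = FALSE)
  # Clamp values to the chosen lower and upper quantiles.
  pmin(pmax(x, lim[1]), lim[2])
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)
is_outlier <- sketch_tukey_outliers(x)
w <- sketch_winsorize(x)
```

In this example only the value 100 lies outside the fences, and winsorizing pulls it down to the 95% quantile of x.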

Scaling

class Scaler is the base class for all scaler classes.
class MinMaxScaler transforms features by scaling each feature to a given range.
class StandardScaler transforms features by removing the mean and scaling to unit standard deviation.
scale_minmax() scales a numeric vector through min-max scaling.
scale_zscore() scales a numeric vector through z-score scaling.
scale_center() scales a numeric vector through (mean) centering.
scale_log() scales a numeric vector through log transformation.
scaling() encapsulates the different types of scaling.
scale_dataset() scales a data set with a specific scale type.
scale_train_test() scales a train and a test data set with a specific scale type.
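
When a train and a test set are scaled together, the scaling parameters are usually estimated on the train set only and then reused for the test set. The base R sketch below shows this for min-max and z-score scaling; the function names and arguments are hypothetical and do not reflect the package's signatures.

```r
# Hypothetical helpers for illustration only; not the deepANN implementation.
sketch_minmax <- function(x, lower = 0, upper = 1,
                          minx = min(x), maxx = max(x)) {
  (x - minx) / (maxx - minx) * (upper - lower) + lower
}

sketch_zscore <- function(x, m = mean(x), s = sd(x)) {
  (x - m) / s
}

train <- c(2, 4, 6, 8, 10)
test  <- c(3, 12)

train_scaled <- sketch_minmax(train)
# Reuse the train minimum and maximum so both sets share one scale.
test_scaled  <- sketch_minmax(test, minx = min(train), maxx = max(train))

z <- sketch_zscore(train)
```

Note that test values outside the train range map outside [0, 1], as the second test value does here.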

Time Series

get_season() delivers corresponding seasons for a given date vector.
lags() builds a lagged data set.
stationary() creates a stationary data series through differencing.
invert_differencing() inverts a differenced data series.
diffinv_simple() inverts a simple-differenced data series.
diff_log() creates a log-differenced data series.
diffinv_log() inverts a log-differenced data series.
diff_percentage() creates a percentage-differenced data series.
diffinv_percentage() inverts a percentage-differenced data series.
period() subsets a data set/time series to periodically specified values.
partition() subsets a data set/time series into several slices.
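
Differencing and its inversion, as well as building lagged columns, can be reproduced with functions from base R's stats package. The snippet is a sketch of the underlying arithmetic, not of the deepANN functions; in particular, the definition of percentage differencing as (x_t - x_{t-1}) / x_{t-1} is an assumption.

```r
x <- c(100, 102, 105, 103, 108)

# Simple differencing and its inverse (the first value is needed to invert).
d        <- diff(x)
restored <- diffinv(d, xi = x[1])

# Log differencing and its inverse.
dl           <- diff(log(x))
restored_log <- exp(diffinv(dl, xi = log(x[1])))

# Percentage differencing (assumed definition) and its inverse.
dp           <- diff(x) / head(x, -1)
restored_pct <- x[1] * cumprod(c(1, 1 + dp))

# Lagged data set: each row holds the current value and its two predecessors.
lagged <- embed(x, 3)
colnames(lagged) <- c("x_t", "x_lag1", "x_lag2")
```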

Metrics

stderror() calculates the standard error.
sse() calculates the sum of squared errors.
mae() calculates the mean absolute error.
mape() calculates the mean absolute percentage error.
wmape() calculates the weighted mean absolute percentage error.
wape() calculates the weighted average percentage error.
mse() calculates the mean squared error.
msle() calculates the mean squared logarithmic error.
rmse() calculates the root mean square error.
rmsle() calculates the root mean square logarithmic error.
rmspe() calculates the root mean square percentage error.
huber_loss() calculates the Huber loss.
log_cosh_loss() calculates the log-cosh loss.
quantile_loss() calculates the quantile loss.
vc() calculates the variance coefficient.
accuracy() calculates the accuracy for a single-label or multi-label classification task.
dice() calculates the Dice coefficient.
iou() calculates the Intersection-over-Union (IoU) coefficient.
gini_impurity() calculates the Gini impurity.
entropy() calculates the Shannon entropy.
cross_entropy() calculates the cross-entropy.
erf() defines the error function (from MATLAB).
erfc() defines the complementary error function (from MATLAB).
erfinv() defines the inverse error function (from MATLAB).
erfcinv() defines the inverse complementary error function (from MATLAB).
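
Several of the metrics above follow directly from their textbook definitions. The base R sketch below uses hypothetical function names and assumes the argument order (actual, predicted); the package's own signatures may differ. The error function is expressed through pnorm(), using the identity erf(x) = 2 * pnorm(x * sqrt(2)) - 1.

```r
# Hypothetical helpers for illustration only; not the deepANN implementation.
sketch_mae  <- function(actual, predicted) mean(abs(actual - predicted))
sketch_mse  <- function(actual, predicted) mean((actual - predicted)^2)
sketch_rmse <- function(actual, predicted) sqrt(sketch_mse(actual, predicted))
# Returned as a fraction; multiply by 100 for a percentage.
sketch_mape <- function(actual, predicted) mean(abs((actual - predicted) / actual))

sketch_huber <- function(actual, predicted, delta = 1) {
  e <- abs(actual - predicted)
  # Quadratic for small errors, linear for large ones.
  mean(ifelse(e <= delta, 0.5 * e^2, delta * (e - 0.5 * delta)))
}

sketch_erf <- function(x) 2 * pnorm(x * sqrt(2)) - 1

actual    <- c(10, 20, 30, 40)
predicted <- c(12, 18, 33, 37)

mae_value   <- sketch_mae(actual, predicted)    # 2.5
mse_value   <- sketch_mse(actual, predicted)    # 6.5
rmse_value  <- sketch_rmse(actual, predicted)
mape_value  <- sketch_mape(actual, predicted)   # 0.11875
huber_value <- sketch_huber(actual, predicted, delta = 2.5)
```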

Utils

re.factor() renews a factor object.
var_pop() calculates the population variance.
sd_pop() calculates the population standard deviation.
radian() converts degrees to radians.
degree() converts radians to degrees.
distance() calculates the distance between two numeric vectors.
similarity() calculates the similarity between two numeric or logical vectors.
probability() computes the probability of a categorical or continuous variable.
vector_as_numeric() converts a vector into a vector with numeric values.
list_as_numeric() recursively transforms the objects of a list into numeric values.
as_ANN_matrix() converts a data set into a matrix, mapping character values and factor levels to their numeric indices if necessary.
vector_as_ANN_matrix() transforms a vector into an ANN-compatible matrix.
random_seed() sets the random number generator seed for reproducible results with TensorFlow/Keras.
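
Base R's var() and sd() compute sample statistics (dividing by n - 1), which is why separate population versions are useful. The sketch below shows the population formulas, the angle conversions, and a Euclidean distance. The function names are hypothetical, and treating distance as Euclidean is an assumption because the list above does not name the distance measure.

```r
# Hypothetical helpers for illustration only; not the deepANN implementation.
sketch_var_pop <- function(x) mean((x - mean(x))^2)   # divides by n, not n - 1
sketch_sd_pop  <- function(x) sqrt(sketch_var_pop(x))

sketch_radian <- function(deg) deg * pi / 180
sketch_degree <- function(rad) rad * 180 / pi

sketch_euclidean <- function(a, b) sqrt(sum((a - b)^2))

x <- c(2, 4, 4, 4, 5, 5, 7, 9)
v <- sketch_var_pop(x)   # 4
s <- sketch_sd_pop(x)    # 2
```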

Machine Learning (ML)

cross_validation_split() splits an entire data set into k folds.
naive_forecast() predicts values for a data series based on a random walk with or without drift.
k_nearest_neighbors() identifies the categorical or continuous response of the k nearest neighbors of a query instance and, where appropriate, their probability distribution.
moving_average() calculates the (weighted) moving average.
naive_bayes() and predict.naivebayes() compute and predict numeric values for classification solutions based on Bayes' theorem.
decision_tree() and predict.decisiontree() build a decision tree and predict categorical values for classification solutions. treeheight() computes the height of a tree and treedepth() its depth.
predict.kmeans() predicts the k-means cluster for feature data.
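
Two of the simpler building blocks in this family, a k-fold split and an unweighted trailing moving average, can be sketched in base R. The function names and arguments are assumptions for illustration and are not the package's API.

```r
# Hypothetical helpers for illustration only; not the deepANN implementation.
sketch_kfold <- function(n, k, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  # Assign fold ids 1..k as evenly as possible, then shuffle them.
  fold_id <- sample(rep(seq_len(k), length.out = n))
  split(seq_len(n), fold_id)
}

sketch_moving_average <- function(x, n = 3) {
  # Trailing window: each value is the mean of the current and n - 1 previous values.
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

folds <- sketch_kfold(10, k = 3, seed = 1)
ma    <- sketch_moving_average(1:6, n = 3)
```

Each element of folds holds the row indices of one fold; the first n - 1 entries of ma are NA because the window is not yet full.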

Single & Multi Layer Perceptron (SLP, MLP)

nsamples() extracts the number of samples within a data structure, usually a tensor.
nunits() extracts the number of units within a data structure, usually a tensor.
ntimesteps() extracts the number of timesteps within a data structure, usually a tensor.
nsubsequences() extracts the number of subsequences within a data structure, usually a tensor.
as_tensor_1d() transforms data into a one-dimensional tensor (vector).
as_tensor_2d() transforms data into a two-dimensional tensor (matrix).
as_tensor_3d() transforms data into a three-dimensional tensor.
as_MLP_X() creates a 2D feature array with the dimensions samples and units.
as_MLP_Y() creates a 2D outcome array with the dimensions samples and units for a metric outcome or a one-hot vector for a non-metric outcome.
build_MLP() builds a sequential SLP/MLP model with stacked dense layers and optionally dropout layers.
fit_MLP() encapsulates fitting an SLP/MLP model.
save_weights_ANN() saves the weights of an ANN to an HDF5 file.
load_weights_ANN() loads the weights of an ANN from an HDF5 file.

Recurrent Neural Network (RNN)

get_LSTM_XY() extracts features and outcomes from a data set in an LSTM-compatible preliminary format.
get_period_shift() calculates the period shift for a univariate and multivariate time series.
start_invert_differencing() determines the start index for invert differencing.
as_lag() transfers a lag from ARIMA(X) to a corresponding lag used for LSTM modeling.
as_timesteps() transfers a lag to a corresponding timesteps value.
as_LSTM_X() resamples a feature matrix into a 3D feature array with the dimensions samples, timesteps and units.
as_LSTM_Y() creates either a 2D array with the dimensions samples and units or a 3D array with the dimensions samples, timesteps and units for a metric outcome, or a one-hot vector for a non-metric outcome.
as_LSTM_data_frame() restructures a resampled feature matrix and an outcome matrix to a data.frame.
build_LSTM() builds a sequential LSTM model with stacked LSTM layers and optionally dropout layers.
fit_LSTM() encapsulates fitting an LSTM model.
predict_ANN() predicts with different ANN models like SLP/MLP or LSTM.
as_LSTM_period_outcome() returns a data.frame with period column and actual outcome column for quality assurance and graphical illustration purposes.

Convolutional Neural Network (CNN)

images_load() loads images from different sources.
images_resize() resizes loaded images.
as_images_array() converts an image representation into a 3D array.
as_images_tensor() builds an image tensor of corresponding shape depending on the type of images (2D or 3D images).
as_CNN_image_X() creates a 4D image feature array with the dimensions samples, height, width and channels either from already given image data or from images on a storage medium.
as_CNN_image_Y() creates a one-hot vector for the image labels.
as_CNN_temp_X() resamples a feature matrix into a 3D feature array with the dimensions samples, timesteps and units or into a 4D array with the dimensions samples, subsequences, timesteps and features.
as_CNN_temp_Y() creates either a 2D outcome array with the dimensions samples and units or a 3D array with the dimensions samples, timesteps and units for a metric outcome, or a one-hot vector for a non-metric outcome.
lenet5() builds a CNN model of type LeNet-5.
alexnet() builds a CNN model of type AlexNet.
zfnet() builds a CNN model of type ZFNet.
vgg16() builds a CNN model of type VGG-16.
vgg19() builds a CNN model of type VGG-19.
resnet50() builds a CNN model of type ResNet-50.
inception_v3() builds a CNN model of type Inception v3.
inception_resnet_v2() builds a CNN model of type Inception-ResNet v2.
mobilenet() builds a CNN model of type MobileNet.
mobilenet_v2() builds a CNN model of type MobileNetV2.
mobilenet_v3() builds a CNN model of type MobileNetV3.
xception() builds a CNN model of type Xception.
nasnet() builds a CNN model of type NASNet-A.
unet() builds a CNN model of type U-Net.
unet3d() builds a CNN model of type 3D U-Net.