chooseK_seq2seq: Choose the number of autoencoder features


View source: R/feature_extraction.R

Description

chooseK_seq2seq uses cross-validation to choose the number of features to be extracted by a sequence autoencoder.

Usage

chooseK_seq2seq(seqs, ae_type, K_cand, rnn_type = "lstm", n_epoch = 50,
  method = "last", step_size = 1e-04, optimizer_name = "adam",
  n_fold = 5, cumulative = FALSE, log = TRUE, weights = c(1, 0.5),
  valid_prop = 0.1, verbose = TRUE)

Arguments

seqs

an object of class "proc".

ae_type

a string specifying the type of autoencoder. The autoencoder can be an action sequence autoencoder ("action"), a time sequence autoencoder ("time"), or an action-time sequence autoencoder ("both").

K_cand

a vector of candidate values for the number of features.

rnn_type

the type of recurrent unit to be used for modeling response processes: "lstm" for the long short-term memory unit, or "gru" for the gated recurrent unit.

n_epoch

the number of training epochs for the autoencoder.

method

the method for computing features from the output of a recurrent neural network in the encoder. Available options are "last" and "avg".

step_size

the learning rate of the optimizer.

optimizer_name

a character string specifying the optimizer to be used for training. Available options are "sgd", "rmsprop", "adadelta", and "adam".

n_fold

the number of folds for cross-validation.

cumulative

logical. If TRUE, the sequence of cumulative time up to each event is used as input to the neural network. If FALSE, the sequence of inter-arrival time (gap time between an event and the previous event) will be used as input to the neural network. Default is FALSE.

log

logical. If TRUE, for the timestamp sequences, input of the neural net is the base-10 log of the original sequence of times plus 1 (i.e., log10(t+1)). If FALSE, the original sequence of times is used.

weights

a vector of 2 elements for the weight of the loss of action sequences (categorical_crossentropy) and time sequences (mean squared error), respectively. The total loss is calculated as the weighted sum of the two losses.

valid_prop

the proportion of validation samples in each fold.

verbose

logical. If TRUE, training progress is printed.
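
The cumulative, log, and weights arguments describe simple transformations, which the base R sketch below illustrates on made-up numbers. It is not the package's internal code: the timestamps, the loss values, and the choice to measure the first gap from time 0 are assumptions made for illustration.

```r
# Hypothetical cumulative timestamps (time since the start) for one process.
time_seq <- c(0, 1.5, 4, 9)

# cumulative = TRUE: the cumulative times are used as they are.
cumulative_input <- time_seq

# cumulative = FALSE: inter-arrival (gap) times are used instead.
# Assumption: the gap of the first event is measured from time 0.
gap_input <- diff(c(0, time_seq))

# log = TRUE: the times are transformed as log10(t + 1) before training.
log_gap_input <- log10(gap_input + 1)

# weights: with ae_type = "both", the total loss is a weighted sum of the
# action loss and the time loss. The two loss values here are made up.
weights <- c(1, 0.5)
action_loss <- 1.2   # categorical cross-entropy (illustrative value)
time_loss <- 0.4     # mean squared error (illustrative value)
total_loss <- weights[1] * action_loss + weights[2] * time_loss

print(gap_input)
print(log_gap_input)
print(total_loss)
```

Adding 1 before taking the logarithm keeps a zero gap mapped to zero instead of negative infinity.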

Value

chooseK_seq2seq returns a list containing

K

the candidate in K_cand producing the smallest cross-validation loss.

K_cand

the candidate values for the number of features.

cv_loss

the cross-validation loss for each candidate in K_cand.
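
The relation between the three returned elements can be sketched in base R as follows. The loss values are invented for illustration, and how the package aggregates losses across folds is not shown here.

```r
# Hypothetical candidates and cross-validation losses (made-up numbers).
K_cand <- c(5, 10, 20)
cv_loss <- c(2.31, 1.87, 1.95)

# K is the candidate with the smallest cross-validation loss.
K <- K_cand[which.min(cv_loss)]

# Assemble a list with the same element names as described above.
res <- list(K = K, K_cand = K_cand, cv_loss = cv_loss)
print(res$K)
```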

See Also

seq2feature_seq2seq for feature extraction given the number of features.
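
A call might look like the sketch below. It assumes that ProcData is installed together with a working keras backend, and that the cc_data dataset and the sub_seqs helper shipped with ProcData are available. The subsample size, candidate values, and reduced n_epoch and n_fold are arbitrary choices to keep the run short, not recommended settings.

```r
# Only run when ProcData and a usable keras backend are present (assumption:
# keras::is_keras_available() reflects whether training can actually run).
run_example <- requireNamespace("ProcData", quietly = TRUE) &&
  requireNamespace("keras", quietly = TRUE) &&
  keras::is_keras_available()

if (run_example) {
  library(ProcData)
  data(cc_data)

  # Work on a small random subsample of the response processes.
  set.seed(1)
  idx <- sample(seq_along(cc_data$seqs$time_seqs), 50)
  seqs <- sub_seqs(cc_data$seqs, idx)

  # Compare two candidate feature counts for an action sequence autoencoder.
  K_res <- chooseK_seq2seq(seqs = seqs, ae_type = "action",
                           K_cand = c(5, 10), n_epoch = 5, n_fold = 2,
                           valid_prop = 0.2, verbose = FALSE)

  print(K_res$K)        # selected number of features
  print(K_res$cv_loss)  # cross-validation loss for each candidate
}
```

The selected K_res$K can then be supplied as the number of features to seq2feature_seq2seq.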


ProcData documentation built on April 1, 2021, 5:07 p.m.