aseq2feature_seq2seq: Feature Extraction by action sequence autoencoder
In ProcData: Process Data Analysis


aseq2feature_seq2seq extracts features from action sequences using an action sequence autoencoder.

aseq2feature_seq2seq(aseqs, K, rnn_type = "lstm", n_epoch = 50,
  method = "last", step_size = 1e-04, optimizer_name = "adam",
  samples_train, samples_valid, samples_test = NULL, pca = TRUE,
  verbose = TRUE, return_theta = TRUE)

`aseqs`	a list of `n` action sequences. Each element is an action sequence in the form of a vector of actions.
`K`	the number of features to be extracted.
`rnn_type`	the type of recurrent unit to be used for modeling response processes. `"lstm"` for the long short-term memory unit. `"gru"` for the gated recurrent unit.
`n_epoch`	the number of training epochs for the autoencoder.
`method`	the method for computing features from the output of a recurrent neural network in the encoder. Available options are `"last"` and `"avg"`.
`step_size`	the learning rate of the optimizer.
`optimizer_name`	a character string specifying the optimizer to be used for training. Available options are `"sgd"`, `"rmsprop"`, `"adadelta"`, and `"adam"`.
`samples_train`	a vector of indices specifying the training set used for training the autoencoder.
`samples_valid`	a vector of indices specifying the validation set used for training the autoencoder.
`samples_test`	a vector of indices specifying the test set used for training the autoencoder. Default is `NULL`.
`pca`	logical. If TRUE, the principal components of features are returned. Default is TRUE.
`verbose`	logical. If TRUE, training progress is printed.
`return_theta`	logical. If TRUE, extracted features are returned.
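
The index arguments `samples_train`, `samples_valid`, and `samples_test` can be built in several ways. The following is a minimal sketch of one option, a random 60/20/20 split in base R. The proportions, the seed, and the choice to make the three sets disjoint are assumptions made for illustration, not requirements stated by the package.

```r
# Hypothetical split of n action sequences into training, validation,
# and test indices. The sizes below are illustrative assumptions.
set.seed(1)                     # make the random split reproducible
n <- 50                         # number of action sequences (assumed)
idx <- sample(n)                # random permutation of 1..n

samples_train <- idx[1:30]      # 60% used to fit the autoencoder
samples_valid <- idx[31:40]     # 20% used to monitor validation loss
samples_test  <- idx[41:50]     # 20% held out; samples_test may also be NULL

# Sanity check: the three sets do not overlap and cover every sequence.
all_idx <- c(samples_train, samples_valid, samples_test)
stopifnot(!anyDuplicated(all_idx), setequal(all_idx, seq_len(n)))
```

These vectors could then be passed directly as the corresponding arguments of aseq2feature_seq2seq.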

This function trains a sequence-to-sequence autoencoder using keras. The encoder of the autoencoder consists of an embedding layer and a recurrent neural network. The decoder consists of another recurrent neural network and a fully connected layer with softmax activation. The outputs of the encoder are the extracted features.

The output of the encoder is a function of the outputs of the encoder recurrent neural network. It is the last output of the encoder recurrent neural network if `method="last"` and the average of the outputs of the encoder recurrent neural network if `method="avg"`.
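
To make the difference between the two options concrete, here is a minimal sketch in base R. The matrix `rnn_out` is a made-up stand-in for the per-step outputs of the encoder recurrent neural network for a single sequence; it is not produced by the package. The sketch also ignores any padding or masking that a real implementation may apply to sequences of unequal length.

```r
# Toy stand-in for encoder RNN outputs of one sequence:
# one row per time step, one column per hidden unit.
# The values are invented for illustration only.
rnn_out <- matrix(c(0.1, 0.4,
                    0.3, 0.2,
                    0.5, 0.6),
                  ncol = 2, byrow = TRUE)

# method = "last": keep the output at the final time step.
feat_last <- rnn_out[nrow(rnn_out), ]

# method = "avg": average the outputs over all time steps.
feat_avg <- colMeans(rnn_out)

print(feat_last)
print(feat_avg)
```

In both cases the result has one value per hidden unit, so the length of the feature vector does not depend on the length of the sequence.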

aseq2feature_seq2seq returns a list containing

`theta`	a matrix containing the `K` extracted features, or their principal components if `pca = TRUE`. Each column is a feature.
`train_loss`	a vector of length `n_epoch` recording the trace of training losses.
`valid_loss`	a vector of length `n_epoch` recording the trace of validation losses.
`test_loss`	a vector of length `n_epoch` recording the trace of test losses. Exists only if `samples_test` is not `NULL`.
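
As an illustration of what returning principal components instead of raw features means, the sketch below applies `prcomp` to a made-up feature matrix. It assumes a standard centered, unscaled principal component rotation; the exact call used inside the package may differ, and `raw_features` is a hypothetical name with simulated values.

```r
# Simulated stand-in for raw extracted features:
# 50 sequences (rows) by K = 5 features (columns). Values are made up.
set.seed(2)
raw_features <- matrix(rnorm(50 * 5), nrow = 50, ncol = 5)

# Rotate the features onto their principal components
# (centered, not rescaled; this choice is an assumption).
pcs <- prcomp(raw_features, center = TRUE, scale. = FALSE)$x

# Same shape as the input: one row per sequence, one column per component.
print(dim(pcs))

# Components are ordered by decreasing variance and are mutually uncorrelated.
pc_var <- apply(pcs, 2, var)
print(round(cor(pcs), 6))
```

The rotation keeps the number of columns at `K` but makes the columns uncorrelated, which can be convenient when the features are used as predictors in a later model.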

chooseK_seq2seq for choosing K through cross-validation.

Other feature extraction methods: atseq2feature_seq2seq, seq2feature_mds_large, seq2feature_mds, seq2feature_ngram, seq2feature_seq2seq, tseq2feature_seq2seq

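# Run only if Python with TensorFlow is available (system() returns 0 on success).
if (!system("python -c 'import tensorflow as tf'", ignore.stdout = TRUE, ignore.stderr = TRUE)) {
  n <- 50
  seqs <- seq_gen(n)  # simulate n response processes
  # Extract K = 5 features, training on sequences 1-40 and validating on 41-50.
  seq2seq_res <- aseq2feature_seq2seq(seqs$action_seqs, 5, rnn_type = "lstm", n_epoch = 5,
                                      samples_train = 1:40, samples_valid = 41:50)
  features <- seq2seq_res$theta
  # Plot the training (blue) and validation (red) loss traces.
  plot(seq2seq_res$train_loss, col = "blue", type = "l")
  lines(seq2seq_res$valid_loss, col = "red")
}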