seqm: Fitting sequence models


View source: R/sequence_model.R

Description

seqm is used to fit a neural network model relating response processes with a response variable.

Usage

seqm(seqs, response, covariates = NULL, response_type,
  actions = unique(unlist(seqs$action_seqs)), rnn_type = "lstm",
  include_time = FALSE, time_interval = TRUE, log_time = TRUE,
  K_emb = 20, K_rnn = 20, n_hidden = 0, K_hidden = NULL,
  index_valid = 0.2, verbose = FALSE, max_len = NULL, n_epoch = 20,
  batch_size = 16, optimizer_name = "rmsprop", step_size = 0.001)

Arguments

seqs

an object of class "proc".

response

response variable.

covariates

covariate matrix.

response_type

"binary" or "scale".

actions

a character vector giving all possible actions. It will be expanded to include all actions appearing in seqs if necessary.

rnn_type

the type of recurrent unit to be used for modeling response processes. "lstm" for the long short-term memory unit; "gru" for the gated recurrent unit.

include_time

logical. Whether the timestamp sequence should be included in the model.

time_interval

logical. Whether the timestamp sequence should be included as a sequence of inter-arrival times.

log_time

logical. Whether to take the logarithm of the time sequence.

K_emb

the latent dimension of the embedding layer.

K_rnn

the latent dimension of the recurrent neural network.

n_hidden

the number of hidden fully-connected layers.

K_hidden

a vector of length n_hidden specifying the number of nodes in each hidden layer.

index_valid

proportion of sequences used as the validation set or a vector of indices specifying the validation set.

verbose

logical. If TRUE, training progress is printed.

max_len

the maximum length of response processes.

n_epoch

the number of training epochs.

batch_size

the batch size used in training.

optimizer_name

a character string specifying the optimizer to be used for training. Available options are "sgd", "rmsprop", "adadelta", and "adam".

step_size

the learning rate of the optimizer.

Details

The model consists of an embedding layer, a recurrent layer, and one or more fully connected layers. The embedding layer takes an action sequence and outputs a sequence of K_emb-dimensional numeric vectors to the recurrent layer. If include_time = TRUE, the embedding sequence is combined with the timestamp sequence in the response process as the input to the recurrent layer. The last output of the recurrent layer and the covariates specified in covariates are used as the input to the subsequent fully connected layer. If response_type = "binary", the last layer uses the sigmoid activation to produce the probability of the response being one. If response_type = "scale", the last layer uses the linear activation. The dimensions of the outputs of the other fully connected layers (if any) are specified by K_hidden.

The action sequences are re-coded into integer sequences and are padded with zeros to length max_len before feeding into the model. If the provided max_len is smaller than the length of the longest sequence in seqs, it will be overridden.
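The recoding and zero-padding step described above can be sketched in base R. This is an illustration of the idea only, not ProcData's internal code; recode_and_pad is a hypothetical helper.

```r
# Sketch of the preprocessing described above (not ProcData's actual code):
# actions are mapped to positive integer codes and each sequence is
# right-padded with zeros to a common length max_len.
actions <- c("start", "click", "drag", "end")
seq1 <- c("start", "click", "end")

recode_and_pad <- function(s, actions, max_len) {
  ids <- match(s, actions)                 # action -> integer code (1-based)
  c(ids, rep(0L, max_len - length(ids)))   # zero-pad to max_len
}

recode_and_pad(seq1, actions, max_len = 6)
# 1 2 4 0 0 0
```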

Value

seqm returns an object of class "seqm", which is a list containing

structure

a string describing the neural network structure.

coefficients

a list of fitted coefficients. The length of the list is 6 + 2 * n_hidden. The first element gives the action embedding. Elements 2-4 are parameters in the recurrent unit. The rest of the elements are for the fully connected layers. Elements 4 + (2 * i - 1) and 4 + 2 * i give the parameters for the i-th fully connected layer.
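The indexing scheme above can be checked with a short sketch. For instance, with n_hidden = 2, the list has 6 + 2 * 2 = 10 elements, and fc_idx (a hypothetical helper, not part of ProcData) locates the weight and bias entries of the i-th fully connected layer.

```r
# Sketch of the coefficient layout described above, for n_hidden = 2.
n_hidden <- 2
n_coef <- 6 + 2 * n_hidden               # total list length: 10
fc_idx <- function(i) c(4 + (2 * i - 1), 4 + 2 * i)
fc_idx(1)                                # first fully connected layer: 5 6
fc_idx(2)                                # second fully connected layer: 7 8
```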

model_fit

a vector of class "raw". It is the serialized version of the trained keras model.

feature_model

a vector of class "raw". It is the serialized version of the keras model for obtaining the rnn outputs.

include_time

whether the timestamp sequence is included in the model.

time_interval

whether inter-arrival time is used.

log_time

whether the logarithm of the time sequence is used.

actions

all possible actions.

max_len

the maximum length of action sequences.

history

an n_epoch by 2 matrix giving the training and validation losses at the end of each epoch.
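The history component can be used to inspect convergence, e.g. to see whether validation loss has plateaued. A minimal sketch, assuming the two columns hold training and validation loss; the history matrix here is simulated for illustration rather than taken from a fitted model.

```r
# Sketch: inspect per-epoch losses from a history-like matrix.
# In practice this matrix would come from a fitted "seqm" object.
history <- cbind(train = exp(-seq(0.1, 2.0, length.out = 20)),
                 valid = exp(-seq(0.1, 1.6, length.out = 20)) + 0.05)

best_epoch <- which.min(history[, 2])    # epoch with lowest validation loss
matplot(history, type = "l", lty = 1, xlab = "epoch", ylab = "loss")
```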

See Also

predict.seqm for the predict method for seqm objects.

Examples

if (!system("python -c 'import tensorflow as tf'", ignore.stdout = TRUE, ignore.stderr= TRUE)) {
  n <- 100
  data(cc_data)
  samples <- sample(1:length(cc_data$responses), n)
  seqs <- sub_seqs(cc_data$seqs, samples)

  y <- cc_data$responses[samples]
  x <- matrix(rnorm(n*2), ncol=2)

  index_test <- 91:100
  index_train <- 1:90
  seqs_train <- sub_seqs(seqs, index_train)
  seqs_test <- sub_seqs(seqs, index_test)

  actions <- unique(unlist(seqs$action_seqs))

  ## no covariate is used
  res1 <- seqm(seqs = seqs_train, response = y[index_train], 
               response_type = "binary", actions=actions, K_emb = 5, K_rnn = 5, 
               n_epoch = 5)
  pred_res1 <- predict(res1, new_seqs = seqs_test)

  mean(as.numeric(pred_res1 > 0.5) == y[index_test])

  ## add more fully connected layers after the recurrent layer.
  res2 <- seqm(seqs = seqs_train, response = y[index_train],
               response_type = "binary", actions=actions, K_emb = 5, K_rnn = 5, 
               n_hidden=2, K_hidden=c(10,5), n_epoch = 5)
  pred_res2 <- predict(res2, new_seqs = seqs_test)
  mean(as.numeric(pred_res2 > 0.5) == y[index_test])

  ## add covariates
  res3 <- seqm(seqs = seqs_train, response = y[index_train], 
               covariates = x[index_train, ],
               response_type = "binary", actions=actions, 
               K_emb = 5, K_rnn = 5, n_epoch = 5)
  pred_res3 <- predict(res3, new_seqs = seqs_test, 
                       new_covariates=x[index_test, ])
                     
  ## include time sequences
  res4 <- seqm(seqs = seqs_train, response = y[index_train], 
               response_type = "binary", actions=actions,
               include_time=TRUE, K_emb=5, K_rnn=5, n_epoch=5)
  pred_res4 <- predict(res4, new_seqs = seqs_test)
}

ProcData documentation built on April 1, 2021, 5:07 p.m.