This function trains the LSTM model to identify the ideological slant of Tweets.
train_lstm(X_train, y_train, embeddings = "w2v", embedding_dim = 25,
  bidirectional = FALSE, convolutional = FALSE)
X_train | data.frame or matrix of vectorized Tweets.
y_train | Labels for the training data: 0 for liberal, 1 for conservative.
embeddings | Type of word embedding algorithm to use. Options are "w2v" (word2vec), "glove", or "random" (random initialization).
embedding_dim | Length of the word embeddings to use. Options are 25, 50, 100, or 200.
bidirectional | Optionally train on text sequences in reverse as well as forwards.
convolutional | Optionally apply a convolutional filter to text sequences. Can only be used when bidirectional = TRUE.
Models are automatically saved in HDF5 format to a sub-folder of the root directory called "models". File names follow the pattern "{model type}_{embedding type}_{embedding dimensionality}d.h5".
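As a rough illustration of that naming pattern, the sketch below builds a path of the same shape. The helper `model_path()` and the model type label "bilstm" are hypothetical and used only for illustration; the labels the package actually writes are not documented here.

```r
# Hypothetical helper: assembles a path that follows the documented pattern
# "{model type}_{embedding type}_{embedding dimensionality}d.h5" inside "models".
# The model_type values are assumptions, not taken from the package.
model_path <- function(model_type, embeddings, embedding_dim) {
  file_name <- sprintf("%s_%s_%dd.h5", model_type, embeddings,
                       as.integer(embedding_dim))
  file.path("models", file_name)
}

# Example with an assumed "bilstm" label, GloVe embeddings, 25 dimensions
path <- model_path("bilstm", "glove", 25)
print(path)
```

A path built this way could then be checked with file.exists() after training, provided the assumed model type label matches the one the package uses.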
# train a Bi-LSTM network using GloVe embeddings
library(keras)  # provides text_tokenizer() and fit_text_tokenizer()
data("ideo_tweets")
# build a vocabulary of the 20,000 most frequent words
ideo_tokenizer <- text_tokenizer(num_words = 20000)
ideo_tokenizer <- fit_text_tokenizer(ideo_tokenizer, ideo_tweets$text)
# vectorize the Tweets and extract the ideology labels
texts <- texts_to_vectors(ideo_tweets$text, ideo_tokenizer)
labels <- ideo_tweets$ideo_cat
# split into training and test sets
train_test <- train_test_split(texts, labels)
X_train <- train_test$X_train
y_train <- train_test$y_train
train_lstm(X_train, y_train, embeddings = "glove", bidirectional = TRUE)