fit_model: Deep learning model fitting


View source: R/ttgsea.R

Description

From a GSEA result, enrichment scores for the unique tokens (words) appearing in gene-set names can be predicted by deep learning. The function "text_token" tokenizes the text and the function "token_vector" encodes the tokens; the encoded sequences are then fed to the embedding layer of the model.

Usage

fit_model(gseaRes, text, score, model, ngram_min = 1, ngram_max = 2,
          num_tokens, length_seq, epochs, batch_size,
          use_generator = TRUE, ...)
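The Examples below build the model with the package helper "bi_gru"; a hand-built equivalent could look like the following sketch (assuming keras is installed; the GRU size and embedding dimension are illustrative, but the embedding layer's input dimension and input length must match the "num_tokens" and "length_seq" arguments of "fit_model"):

```r
library(keras)

num_tokens <- 1000   # must equal the num_tokens argument of fit_model
length_seq <- 30     # must equal the length_seq argument of fit_model
embedding_dims <- 50

model <- keras_model_sequential() %>%
  layer_embedding(input_dim = num_tokens,
                  output_dim = embedding_dims,
                  input_length = length_seq) %>%
  layer_gru(units = 32) %>%
  layer_dense(units = 1)   # single output: the predicted enrichment score

model %>% compile(optimizer = "adam", loss = "mean_squared_error")
```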

Arguments

gseaRes

a table of GSEA results, with one row per gene set and columns containing the text and scores

text

column name for text data

score

column name for enrichment score

model

deep learning model; the input dimension and input length of the embedding layer must equal "num_tokens" and "length_seq", respectively

ngram_min

minimum size of an n-gram (default: 1)

ngram_max

maximum size of an n-gram (default: 2)

num_tokens

maximum number of tokens; it must equal the input dimension of "layer_embedding" in the "model"

length_seq

length of input sequences; it must equal the input length of "layer_embedding" in the "model"

epochs

number of epochs

batch_size

batch size

use_generator

if "use_generator" is TRUE, the function "sampling_generator" is used with "fit_generator". Otherwise, "fit" is used without a generator.

...

additional parameters passed to "fit" or "fit_generator"
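The "ngram_min"/"ngram_max" and "num_tokens" arguments mirror the vocabulary functions listed under See Also. This is not the package's internal "text_token"/"token_vector" code, only a sketch (assuming text2vec and textstem are installed) of how those pieces combine:

```r
library(text2vec)
library(textstem)

# toy gene-set names standing in for the text column of a GSEA result
txt <- c("immune response pathway", "cell cycle checkpoint")

# lemmatize, then tokenize into unigrams and bigrams
it <- itoken(lemmatize_strings(txt), tokenizer = word_tokenizer)
vocab <- create_vocabulary(it, ngram = c(1L, 2L))        # ngram_min = 1, ngram_max = 2
vocab <- prune_vocabulary(vocab, vocab_term_max = 1000)  # keep at most num_tokens terms
```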

Value

model

trained model

tokens

information for tokens

token_pred

prediction for every token, each row has a token and its predicted score

token_gsea

list giving, for each token, the subset of the GSEA result whose text contains that token

num_tokens

maximum number of tokens

length_seq

length of input sequences

Author(s)

Dongmin Jung

See Also

keras::fit_generator, keras::layer_embedding, keras::pad_sequences, textstem::lemmatize_strings, text2vec::create_vocabulary, text2vec::prune_vocabulary

Examples

library(reticulate)
if (keras::is_keras_available() && reticulate::py_available()) {
  library(fgsea)
  data(examplePathways)
  data(exampleRanks)
  names(examplePathways) <- gsub("_", " ",
                            substr(names(examplePathways), 9, 1000))
  set.seed(1)
  fgseaRes <- fgsea(examplePathways, exampleRanks)
  
  num_tokens <- 1000
  length_seq <- 30
  batch_size <- 32
  embedding_dims <- 50
  num_units <- 32
  epochs <- 1
  
  ttgseaRes <- fit_model(fgseaRes, "pathway", "NES",
                         model = bi_gru(num_tokens,
                                        embedding_dims,
                                        length_seq,
                                        num_units),
                         num_tokens = num_tokens,
                         length_seq = length_seq,
                         epochs = epochs,
                         batch_size = batch_size,
                         use_generator = FALSE)
}
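Per the Value section, the returned list includes per-token predictions in "token_pred"; after a successful fit they can be inspected directly:

```r
# each row of token_pred pairs a token with its predicted enrichment score
if (exists("ttgseaRes")) head(ttgseaRes$token_pred)
```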

dongminjung/ttgsea documentation built on Dec. 30, 2021, 8:51 a.m.