predict.Transformer: Predict with a Transformer model

View source: R/embed.R

Description

Extract features from a Transformer model: sentence-level embeddings, token-level embeddings or the wordpiece tokens of the text.

Usage

## S3 method for class 'Transformer'
predict(
  object,
  newdata,
  type = c("embed-sentence", "embed-token", "tokenise"),
  trace = 10,
  ...
)

Arguments

object

an object of class Transformer as returned by transformer

newdata

a data.frame with columns doc_id and text indicating the text to embed

type

a character string, one of 'embed-sentence', 'embed-token' or 'tokenise', to get respectively sentence-level embeddings, token-level embeddings or the wordpiece tokens

trace

indicates whether to show a trace of the progress. The default of 10 shows a trace every 10 embeddings

...

other arguments passed on to the methods
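
As an illustration of the newdata argument described above, here is a minimal sketch of building the expected data.frame from a plain character vector. The helper name as_newdata and its prefix argument are hypothetical and not part of golgotha; only the column names doc_id and text come from this page.

```r
# Hypothetical helper (not part of golgotha): wrap a character vector
# in a data.frame with the doc_id and text columns that newdata requires
as_newdata <- function(texts, prefix = "doc_") {
  stopifnot(is.character(texts))
  data.frame(doc_id = paste0(prefix, seq_along(texts)),
             text = texts,
             stringsAsFactors = FALSE)
}

x <- as_newdata(c("provide some words to embed", "another sentence of text"))
str(x)
```

The resulting x could then be passed as the newdata argument of predict.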

Value

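Depending on the argument type, the function returns sentence-level embeddings ('embed-sentence'), token-level embeddings ('embed-token') or the wordpiece tokens ('tokenise').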

Examples

# Download the model and load it
transformer_download_model("bert-base-multilingual-uncased")
model <- transformer("bert-base-multilingual-uncased")

# Input: one row per document, with columns doc_id and text
x <- data.frame(doc_id = c("doc_1", "doc_2"),
                text = c("provide some words to embed", "another sentence of text"),
                stringsAsFactors = FALSE)

# Wordpiece tokens
predict(model, x, type = "tokenise")
# Sentence-level embeddings
embedding <- predict(model, x, type = "embed-sentence")
dim(embedding)
# Token-level embeddings
embedding <- predict(model, x, type = "embed-token")
str(embedding)

# Clean up: remove the downloaded model from the package's models folder
unlink(file.path(system.file(package = "golgotha", "models"),
       "bert-base-multilingual-uncased"), recursive = TRUE)
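
As a possible follow-up to the example above, sentence embeddings are often compared with cosine similarity. The sketch below assumes the 'embed-sentence' result is a numeric matrix with one row per document (the dim() call suggests a matrix-like object, but this page does not confirm it). It uses a small made-up matrix as a stand-in so that it runs without downloading a model, and cosine_similarity is a hypothetical helper, not a golgotha function.

```r
# Stand-in for predict(model, x, type = "embed-sentence"); values are made up
emb <- rbind(doc_1 = c(1, 0, 1),
             doc_2 = c(1, 1, 0))

# Hypothetical helper: pairwise cosine similarity between the rows of a matrix
cosine_similarity <- function(m) {
  unit <- m / sqrt(rowSums(m^2))  # scale every row to unit length
  tcrossprod(unit)                # dot products between all pairs of rows
}

sim <- cosine_similarity(emb)
sim
```

Each entry of sim compares two documents; the diagonal is 1 because every document is identical to itself.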

bnosac/golgotha documentation built on May 28, 2020, 4:06 a.m.