sentencepiece_decode: Decode encoded sequences back to text

sentencepiece_decode    R Documentation

Decode encoded sequences back to text

Description

Decode a sequence of Sentencepiece ids back into text.

Usage

sentencepiece_decode(model, x)

Arguments

model

an object of class sentencepiece as returned by sentencepiece_load_model or sentencepiece

x

an integer vector of Sentencepiece ids, or a list of such vectors

Value

a character vector of detokenised text or, if you encoded with nbest, a list of such character vectors

Examples

model <- system.file(package = "sentencepiece", "models", "nl-fr-dekamer.model")
model <- sentencepiece_load_model(file = model)

txt <- c("De eigendomsoverdracht aan de deelstaten is ingewikkeld.",
         "On est d'accord sur le prix de la biere?")
       
x <- sentencepiece_encode(model, x = txt, type = "subwords")
sentencepiece_decode(model, x)
x <- sentencepiece_encode(model, x = txt, type = "ids")
sentencepiece_decode(model, x)

model <- system.file(package = "sentencepiece", "models", 
                     "nl-fr-dekamer-unigram.model")
model <- sentencepiece_load_model(file = model)
x <- sentencepiece_encode(model, x = txt, type = "subwords", nbest = 3)
sentencepiece_decode(model, x)
x <- sentencepiece_encode(model, x = txt, type = "subwords", 
                          nbest = 3, alpha = 0.1)
sentencepiece_decode(model, x)
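
# A minimal round-trip sketch, added for illustration and not part of the
# package's own examples. It assumes the sentencepiece package is installed
# and uses only the functions and the bundled model shown above. The names
# `ids` and `decoded` are illustrative. The decoded text may differ from the
# input if the model normalises characters or meets unknown pieces, so the
# two are printed for inspection rather than compared for equality.
library(sentencepiece)

model <- system.file(package = "sentencepiece", "models", "nl-fr-dekamer.model")
model <- sentencepiece_load_model(file = model)

txt <- c("De eigendomsoverdracht aan de deelstaten is ingewikkeld.",
         "On est d'accord sur le prix de la biere?")

# Encode each sentence to a vector of ids, one list element per input string
ids <- sentencepiece_encode(model, x = txt, type = "ids")
str(ids)

# Decode the full list back to text, one string per list element
decoded <- sentencepiece_decode(model, ids)
print(txt)
print(decoded)

# Decode a subset by passing a shorter list, here only the first sentence
sentencepiece_decode(model, ids[1])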

sentencepiece documentation built on Nov. 13, 2022, 5:05 p.m.