get_tokenized_text: Tokenize text


View source: R/API.R

Description

Separate the words of each text, using space characters as delimiters.

Usage

get_tokenized_text(model, texts)

Arguments

model

fastText model

texts

a character vector containing the documents

Value

a list of character vectors, one per document, each containing the words of that document
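
To illustrate the shape of the return value without loading a model, here is a minimal base R sketch. It assumes plain splitting on runs of whitespace. The helper `split_on_whitespace` is hypothetical and is not part of fastrtext. The real function tokenizes through the loaded fastText model, so its results may differ on edge cases.

```r
# Hypothetical helper, not part of fastrtext: approximates the result shape
# by splitting each document on runs of whitespace.
split_on_whitespace <- function(texts) {
  stopifnot(is.character(texts))
  # trimws() avoids an empty leading token when a document starts with spaces
  strsplit(trimws(texts), "[[:space:]]+")
}

tokens <- split_on_whitespace(c("this is a test 1", "this is a second test!"))
length(tokens)   # one list element per input document
tokens[[2]]      # punctuation stays attached to its word, e.g. "test!"
```

As in the description above, only whitespace separates words, so no punctuation handling or lowercasing is applied.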

Examples

library(fastrtext)
model_test_path <- system.file("extdata", "model_unsupervised_test.bin", package = "fastrtext")
model <- load_model(model_test_path)
tokens <- get_tokenized_text(model, "this is a test")
print(tokens)
tokens <- get_tokenized_text(model, c("this is a test 1", "this is a second test!"))
print(tokens)

Example output

[[1]]
[1] "this" "is"   "a"    "test"

[[1]]
[1] "this" "is"   "a"    "test" "1"   

[[2]]
[1] "this"   "is"     "a"      "second" "test!" 

fastrtext documentation built on Oct. 30, 2019, 11:32 a.m.