text_unmask | R Documentation
Note: This function has been deprecated and will no longer be updated, since the new package FMAT has been developed as the integrative toolbox for the Fill-Mask Association Test (FMAT).
Predict the most likely masked token(s) in a sequence, based on the Python module transformers.
text_unmask(query, model, targets = NULL, topn = 5)
query: A query (sentence/prompt) with masked token(s).
model: Model name at HuggingFace.
targets: Specific target word(s) to be filled in the blank.
topn: Number of the most likely predictions to return. Defaults to 5.
Masked language modeling is the task of masking some of the words in a sentence and predicting which words should replace those masks. These models are useful when we want to get a statistical understanding of the language on which the model was trained. See https://huggingface.co/tasks/fill-mask for details.
A data.table of query results:

query_id: Query ID (present if there is more than one query; indicates multiple queries).
mask_id: [MASK] ID (position in sequence; present if there is more than one [MASK] in the query).
prob: Probability of the predicted token in the sequence.
token_id: Predicted token ID (to replace [MASK]).
token: Predicted token (to replace [MASK]).
sequence: Complete sentence with the predicted token.
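Since the result is a data.table, it can be post-processed with standard data.table syntax. A minimal sketch (assuming text_init() has been run and the model has already been downloaded; the probability threshold 0.05 is an arbitrary illustrative value):

```r
library(data.table)

# Run a fill-mask query (model name is an example from the Examples below)
res <- text_unmask("Beijing is the [MASK] of China.",
                   "distilbert-base-cased")

# Keep only predictions above a probability threshold,
# sorted from most to least likely
res[prob > 0.05][order(-prob), .(token, prob, sequence)]
```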
text_init
text_model_download
text_model_remove
text_to_vec
## Not run:
# text_init() # initialize the environment
model = "distilbert-base-cased"
text_unmask("Beijing is the [MASK] of China.", model)
# multiple [MASK]s:
text_unmask("Beijing is the [MASK] [MASK] of China.", model)
# multiple queries:
text_unmask(c("The man worked as a [MASK].",
"The woman worked as a [MASK]."),
model)
# specific targets:
text_unmask("The [MASK] worked as a nurse.", model,
targets=c("man", "woman"))
## End(Not run)