Pre-Trained Models

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

The text-package allows you to create pre-trained models using the textTrain() functions. The models can be saved and used on new data using the textPredict() function. The table below shows pre-trained models that are available to download. The models can be called with textPredict() like this:

library(text)

# Example calling a model using the URL
textPredict(
  model_info = "https://github.com/OscarKjell/text_models/raw/main/valence_models/facebook_model.rds",
  texts = "what is the valence of this text?"
)


# Example calling a model having an abbreviation
textPredict(
  model_info = implicit_power_roberta_large_L23_v1,
  texts = "It looks like they have problems collaborating."
)

The textPredict() function can be given a model and a text, and automatically transform the text to word embeddings and produce estimated scores or probabilities.

If you want to add a pre-trained model to the table, please fill out the details in this Google sheet and email us (oscar [ d_o t] kjell [a _ t] psy [DOT] lu [d_o_t]se) so that we can update the table online.

Note that you can adjust the width of the columns when scrolling the table.

library("reactable")
# see vignette: https://glin.github.io/reactable/articles/examples.html#custom-rendering

model_data <- read.csv(system.file("extdata",
                                   "text_models_info.csv",
                                   package = "text"),
                       header = TRUE, 
                       skip = 2)

reactable::reactable(
  data = model_data,
  filterable = TRUE,
  defaultPageSize = 10,
  highlight = TRUE, 
  resizable = TRUE,
  theme = reactableTheme(
    borderColor = "#1f7a1f",
  #  stripedColor = "#e6ffe6",
    highlightColor = "#ebfaeb",
    cellPadding = "8px 12px",
    style = list(fontFamily = "-apple-system, BlinkMacSystemFont, Segoe UI, Helvetica, Arial, sans-serif")
  ),
  columns = list(
    Construct = colDef(minWidth = 280),
    Outcome = colDef(minWidth = 200),
    Language.type = colDef(minWidth = 200),
    Name = colDef(minWidth = 350),
    Path = colDef(minWidth = 300),
    Model.type = colDef(minWidth = 200),
    Feature  = colDef(minWidth = 200),
    CV.accuracy = colDef(minWidth = 150),
    Held.out.accuracy = colDef(minWidth = 150),
    SEMP.accuracy = colDef(minWidth = 150),
    Reference = colDef(minWidth = 250),
    Description = colDef(minWidth = 200),
    N.training = colDef(minWidth = 200),
    Label.types = colDef(minWidth = 200),
    Other = colDef(minWidth = 200),
    Command.info = colDef(minWidth = 800)
  ), 
  showPageSizeOptions = TRUE,
  groupBy = "Construct"
)

References

Gu, Kjell, Schwartz & Kjell. (2024). Natural Language Response Formats for Assessing Depression and Worry with Large Language Models: A Sequential Evaluation with Model Pre-registration.

Kjell, O. N., Sikström, S., Kjell, K., & Schwartz, H. A. (2022). Natural language analyzed with AI-based transformers predict traditional subjective well-being measures approaching the theoretical upper limits in accuracy. Scientific reports, 12(1), 3918.

Nilsson, Runge, Ganesan, Lövenstierne, Soni & Kjell (2024) Automatic Implicit Motives Codings are at Least as Accurate as Humans’ and 99% Faster



Try the text package in your browser

Any scripts or data that you put into this service are public.

text documentation built on Sept. 11, 2024, 7:22 p.m.