In benpeloquin7/rrrsa: Run Rational Speech Act Models of Pragmatic Inference

Rational speech act model (RSA) of pragmatic inference

rrrsa is an R package for running RSA models -- Bayesian models of pragmatic inference. rrrsa was created by Ben Peloquin in collaboration with Michael C. Frank and has been optimized for analysis of experimental data such as those presented in Frank, et al. (Under Review) and Peloquin & Frank (2016). For other, more flexible variants of RSA models, please see http://forestdb.org/models/scalar-implicature.html.

Installation

You can install the latest version of rrrsa by installing devtools and running:

install.packages("devtools")
devtools::install_github("benpeloquin7/rrrsa")

What is RSA?

Rational speech act (RSA) models frame language understanding as a special case of social cognition in which speakers and listeners reason about one another recursively. A pragmatic listener $P_{L_n}(m|u)$, reasons about intended meaning $m$ of an utterance $u$ by a rational speaker $P_{s_n}(u|m)$ who chooses an utterance according to the expected utility of an utterance $U(m;u)$. $\alpha$ is a decision noise parameter.

$$P_{L_n}(m|u) \propto P_{S_n}(u|m)P(m)$$ $$P_{S_n} \propto e^{U(m;u)}$$ $$U(m;u) = -\alpha(-\log(P_{L_{n-1}}(m|u)) - C(u))$$

See Frank & Goodman (2012) and Goodman & Stuhmuller (2013) for the original descriptions of the RSA framework. Frank, et al. (Under Review) also provides a comprehensive presentation and evaluation of RSA.

rrrsa includes empirical data

Data from "Rational speech act models of pragmatic reasoning in reference games" (Frank, et al., Under Review) and "Determining the alternatives in scalar implicature" (Peloquin & Frank, 2016) are also included in this package. Examples using data from these studies are included below.

rrrsa includes access to all model components

rrrsa provides users with access to all model components. The following sections demonstrate how this functionality can be used.

library(knitr)
library(ggplot2)
library(tidyr)
library(dplyr)
library(purrr)
library(rrrsa)

Calculating the informativity of an utterance with `rsa.informativity()`

rsa.informativity() takes three arguments, literal semantics $P_{L_0}$, alpha level (default 1), and cost (default 0). This function returns the surprisal of an utterance minus cost, multiplied by alpha.

rsa.informativity(0.4)
rsa.informativity(rsa.informativity(0.4), alpha = 2, cost = 0.5)

Calculating the utility of an utterance with `rsa.utility()`

rsa.utility takes an input vector of literal listener semantics and outputs a normalized vector of speaker likelihoods. If costs are not specified the default 0's vector is used. If alpha is not specified a default value of $1$ is used.

literalSemantics <- c(0.0, 0.0, 0.3, 0.3, 0.4)
costs <- c(0.0, 0.0, 0.2, 0.3, 0.4)
rsa.utility(items = literalSemantics, costs = costs, alpha = 3)

Computing one full recursion with `rsa.fullRecursion()`

In the RSA framework one full recursion consists of a pragmatic listner $P_{L_1}$ who reasons about a rational speaker $P_{s_1}$ who reason about a literal listener $P_{L_0}$. Expected input is an $m$ matrix of $P_{L_0}$ literal listener values in which columns correspond to items (words) and rows correspond to semantic quantity (i.e. "stars" in Peloquin & Frank, 2016). Optional arguments include a costs vector which must be the same length as ncol and an optional priors vector which must be the same length as nrows. rsa.fullRecursion() provides safety checking for these cases. Output corresponds with pragmatic listener posterior predictions.

m <- matrix(data = c(1.0, 0.0, 0.0, 0.0, 0.0,
                     0.0, 0.25, 0.25, 0.25, 0.25,
                     0.0, 0.0, 0.0, 0.0, 1.0), nrow = 5)
colnames(m) <- c("none", "some", "all")
rownames(m) <- 1:5
# costs <- c("none" = 0, "some" = 0, "all" = 0)
# priors <- rnorm(n = nrow(m), mean = 0.5, sd = 0.1)
res <- rsa.fullRecursion(m = m)
res <- as.data.frame(res) %>%
  mutate(quantity = rownames(.))

## Prep data
pragmaticsTidied <- res %>%
  gather(word, pragmatics, -quantity)
semanticsTidied <- as.data.frame(m) %>%
  mutate(quantity = rownames(.)) %>%
  gather(word, semantics, c(none, some, all))
fullData <- merge(pragmaticsTidied, semanticsTidied) %>%
  gather(type, value, c(pragmatics, semantics)) %>%
  mutate(quantity = as.numeric(quantity),
         lineSize = ifelse(word == "some", 2, 1))

## Visualize implicature
ggplot(fullData, aes(x = quantity, y = value, col = word)) +
  geom_line(aes(size=lineSize)) +
  facet_wrap(~type) +
  theme_bw()

Running multiple recursions with `rsa.reason()`

rsa.reason() is a wrapper function for rsa.fullRecursion which provides an additional depth parameter specifying the recursive depth during reasoning. If depth is not provided, default value is $1$.

all(rsa.reason(m = m, depth = 2) == rsa.fullRecursion(rsa.fullRecursion(m = m)))
rsa.reason(m = m, depth = 2) %>%
  kable()

Running data frames with `rsa.runDf()`

Run RSA on a tidied data frame and avoid running individual model components individually with rsa.runDf(). This is the primary workhorse function of rrrsa. An rrrsa-ready, tidied data frame must contain columns for semantic quantity, item and semantics, where each row corresponds with unique item/quantity combination. A user should specify their naming convention for these items in the quantityVarName, itemVarName and semanticsVarName arguments. The costVarName and priorsVarName arguments correspond with costs and/or priors data. Users can specify values for alpha and depth hyperparamenters. rsa.runDf() will return a data frame with model predictions 'preds' appended as a new column.

## Hypothetical literal listener data we might want to use RSA to simulate implicature.
df <- data.frame(scales = rep("some_all", 15),
                  stars = as.factor(rep(1:5, 3)),
                  starsChar = as.factor(rep(c("one", "two", "three", "four", "five"), 3)),
                  words = c(rep("all", 5), rep("some", 5), rep("none", 5)),
                  listenerSemantics = c(rep(0.0, 4), 1.0,
                                0.0, rep(0.25, 4),
                                1.0, rep(0.0, 4)))
rsa.runDf(df,
          quantityVarName="stars",
          semanticsVarName="listenerSemantics",
          itemVarName="words") %>%
  kable()

Importantly, rsa.runDf() maintains all column naming and can handle multiple data types. For example, we can run rsa.runDf() with a character vector for quantity (contrast with the factor vector used above):

all(rsa.runDf(df, quantityVarName = "starsChar",
              semanticsVarName = "listenerSemantics",
              itemVarName = "words") ==
      rsa.runDf(df, quantityVarName = "stars",
                 semanticsVarName = "listenerSemantics",
                 itemVarName = "words"))

A frequent use case for RSA will require running RSA over multiple groups of data. Rather than subsetting data frames and running RSA iteratively, we recommend using map_df() from the purrr package. Here we split by the scales variable.

## RSA of literal semantics we'd like to run RSA on
df <- data.frame(scales = c(rep("some_all", 10), rep("good_excellent", 10)),
                  stars = as.factor(rep(1:5, 4)),
                  words = c(rep("all", 5), rep("some", 5), c(rep("excellent", 5), rep("good", 5))),
                  listenerSemantics = c(rep(0.0, 4), 1.0,
                                0.0, rep(0.25, 4),
                                rep(0.0, 4), 1.0,
                                0.0, rep(0.25, 4))) %>% mutate(priors = 0.20)
df$costs <- c(rep(3, 5), rep(4, 5), rep(9, 5), rep(4, 5))

## Using purrr::map_df() run rsa.runDf() over df subsets
df %>%
  split(.$scales) %>%
  map_df(~rsa.runDf(
    data=.x,
    quantityVarName="stars",
    semanticsVarName="listenerSemantics",
    itemVarName="words",
    costsVarName="costs",
    depth=2)) %>%
  head(n=10) %>%
  kable()

An example tuning hyperparameters using `rsa.runDf()` and `purrr::map_df()` with empirical data from Peloquin & Frank (2016)

Data description for Peloquin & Frank (2016) "Determining the alternatives for scalar implicature"

rrrsa includes empirical literal listener $P_{L_0}$ which can be used as input to rrrsa as well as $P_{L_1}$ pragmatic judgments for model tuning and comparison. Four data sets are included:

peloquinFrank_2Alts: data set with entailment alternatives

peloquinFrank_3Alts: data set with entailment alternatives + universal none

peloquinFrank_4Alts: data set with entailment alternatives + top two empirically derived alts

peloquinFrank_5Alts: data set with entailment alternatives + top two empirically derived alts + neutral valence alternative

str(peloquinFrank_2Alts) %>%
  kable()

See ?peloquinFrank_2alts for data descriptions and [link to CogSci paper here...]

RSA hyperparameters alpha and depth can be tuned using custom functions. Here we give a simple example of hyperparamter tuning with grid search using nested for loops. We examine empirical data from Peloquin & Frank (2016).

In this case our grouping variable is scales -- we run rsa.runDf() on each scale subset for each alpha and depth we're interested in saving the results to res.

## hyperparams
depths <- seq(0, 4)
alphas <- seq(0, 5, by=0.2)

## Grid search using nested for-loops
res <- c()
for (alpha in alphas) {
  for (depth in depths) {
    curr_out <- peloquinFrank_5Alts %>%
      split(.$scale) %>%
      map_df(~rsa.runDf(
        data=.x,
          quantityVarName="stars",
          semanticsVarName="speaker.p",
          itemVarName="words",
          depth=depth,
          alpha=alpha)) %>%
      mutate(alpha=alpha,
             depth=depth)
    res <- rbind(res, curr_out)
  }
}

In this case, we're only interested in how the model predicts some of the items (entailment items; see Peloquin & Frank, 2016).

target_items <- c("good", "excellent",
                  "liked", "loved",
                  "memorable", "unforgettable",
                  "palatable", "delicious",
                  "some", "all")

d_cors <- res %>%
  filter(words %in% target_items) %>%
  group_by(alpha, depth) %>%
  summarise(r=cor(e11, preds)) %>%
  ungroup

How does the model respond to the hyperparameters?

ggplot(d_cors, aes(x=alpha, y=r, col=as.factor(depth))) +
  geom_line(alpha=0.8) +
  theme_bw()

d_cors %>%
  summarise(max_r=max(r),
            max_alpha=alpha[which.max(r)],
            max_depth=depth[which.max(r)]) %>%
  kable()

Simulating pragmatic inference with data from "Rational speech act models of pragmatic reasoning in reference games" - Frank, et al. (Under Review)

Data description for Frank, et al. (Under Review)

rrrsa includes empirical data used in Frank et al. (Under Review) model simulations in the rrrsa::d_pragmods data frame. We provide users with access to the data and include a short example of running simulations.

head(d_pragmods) %>%
  kable()

You'll notice some NA values. This is because empirical data were not measured for all items, however we can supply the model with literal semantics for those items. These are present in the speaker.p column.

For more detailed information on the data included in d_pragmods run.

?d_pragmods

RSA simulations

Unlike in the previous example, we'd like to run rsa via rsa.runDf() over individual experiments rather than scales. We've provided a grouping variable in the data frame (grouper) which allows us to do this.

We split by the grouping variable grouper and use purrr::map_df() which allows us to run rsa.runDf() over subsets of the data frame. This equivalent to subsetting d_pragmods by each factor in grouper, running rsa.runDf and rbind()ing the results.

d_preds_priors <- d_pragmods %>%
  split(list(.$grouper)) %>%
  map_df(~rsa.runDf(
    data=.x,
    quantityVarName="object",
    semanticsVarName="speaker.p",
    itemVarName="query",
    priorsVarName = "priorValue",
    depth=1,
    alpha=1)) %>%
  mutate(keep_indices = ifelse(!is.na(count), "keep", "throwout")) %>%
  filter(keep_indices=="keep")

Notice that we're not interested in all the data here. We only collected empirical judgments for some of the items, but rrrsa requires that we include all the alternatives. We handle this by identifiying all the rows that contain NAs after running the simulations and removing them.

How does our model do?

ggplot(d_preds_priors, aes(x=preds, y=p, col=expt, pch=object)) +
  geom_pointrange(aes(ymin = cil, ymax = cih)) + 
  xlim(c(0,1)) + ylim(c(0,1)) + 
  ylab("Proportion choosing target") + 
  xlab("Model Predictions") + 
  geom_smooth(method="lm", aes(group=1)) + 
  geom_abline(slope = 1, intercept = 0, lty=2) +
  ggtitle(paste0("Priors model sims\nalpha=1, depth=1\n",
                 "cor=",
                 round(cor(d_preds_priors$p, d_preds_priors$preds), 2))) +
  theme_bw()

This data corresponds with the second row of Table 2 in Frank, et al. (Under Review).

We can re-run without including priors by simply omitting the priors column name, in this case priorValue.

d_preds_no_priors <- d_pragmods %>%
  split(list(.$grouper)) %>%
  map_df(~rsa.runDf(
    data=.x,
    quantityVarName="object",
    semanticsVarName="speaker.p",
    itemVarName="query",
    # priorsVarName = "priorValue",
    depth=1,
    alpha=1)) %>%
  mutate(keep_indices = ifelse(!is.na(count), "keep", "throwout")) %>%
  filter(keep_indices=="keep")

ggplot(d_preds_no_priors, aes(x=preds, y=p, col=expt, pch=object)) +
  geom_pointrange(aes(ymin = cil, ymax = cih)) + 
  xlim(c(0,1)) + ylim(c(0,1)) + 
  ylab("Proportion choosing target") + 
  xlab("Model Predictions") + 
  geom_smooth(method="lm", aes(group=1)) + 
  geom_abline(slope = 1, intercept = 0, lty=2) +
  ggtitle(paste0("No priors model sims\nalpha=1, depth=1\n",
                 "cor=",
                 round(cor(d_preds_no_priors$p, d_preds_no_priors$preds), 2))) +
  theme_bw()

This data corresponds with the second row of Table 4 in Frank, et al. (Under Review).

benpeloquin7/rrrsa documentation built on May 12, 2019, 2:08 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

benpeloquin7/rrrsa
Run Rational Speech Act Models of Pragmatic Inference

In benpeloquin7/rrrsa: Run Rational Speech Act Models of Pragmatic Inference

Rational speech act model (RSA) of pragmatic inference

Installation

What is RSA?

rrrsa includes empirical data

rrrsa includes access to all model components

Calculating the informativity of an utterance with `rsa.informativity()`

Calculating the utility of an utterance with `rsa.utility()`

Computing one full recursion with `rsa.fullRecursion()`

Running multiple recursions with `rsa.reason()`

Running data frames with `rsa.runDf()`

An example tuning hyperparameters using `rsa.runDf()` and `purrr::map_df()` with empirical data from Peloquin & Frank (2016)

Data description for Peloquin & Frank (2016) "Determining the alternatives for scalar implicature"

Simulating pragmatic inference with data from "Rational speech act models of pragmatic reasoning in reference games" - Frank, et al. (Under Review)

Data description for Frank, et al. (Under Review)

RSA simulations

R Package Documentation

Browse R Packages

We want your feedback!

benpeloquin7/rrrsa Run Rational Speech Act Models of Pragmatic Inference

In benpeloquin7/rrrsa: Run Rational Speech Act Models of Pragmatic Inference

Rational speech act model (RSA) of pragmatic inference

Installation

What is RSA?

rrrsa includes empirical data

rrrsa includes access to all model components

Calculating the informativity of an utterance with rsa.informativity()

Calculating the utility of an utterance with rsa.utility()

Computing one full recursion with rsa.fullRecursion()

Running multiple recursions with rsa.reason()

Running data frames with rsa.runDf()

An example tuning hyperparameters using rsa.runDf() and purrr::map_df() with empirical data from Peloquin & Frank (2016)

Data description for Peloquin & Frank (2016) "Determining the alternatives for scalar implicature"

Simulating pragmatic inference with data from "Rational speech act models of pragmatic reasoning in reference games" - Frank, et al. (Under Review)

Data description for Frank, et al. (Under Review)

RSA simulations

R Package Documentation

Browse R Packages

We want your feedback!

benpeloquin7/rrrsa
Run Rational Speech Act Models of Pragmatic Inference

Calculating the informativity of an utterance with `rsa.informativity()`

Calculating the utility of an utterance with `rsa.utility()`

Computing one full recursion with `rsa.fullRecursion()`

Running multiple recursions with `rsa.reason()`

Running data frames with `rsa.runDf()`

An example tuning hyperparameters using `rsa.runDf()` and `purrr::map_df()` with empirical data from Peloquin & Frank (2016)