predict_readability: predict readability score from a fitted BT model
In kbenoit/sophistication: Functions to help measure textual sophistication


Predicts the lambda for each text in `newdata` from a fitted Bradley-Terry model object, `object`.

predict_readability(
  object,
  newdata = NULL,
  reference_top = -2.1763368548,
  reference_bottom = -3.865467,
  bootstrap_n = 0,
  baseline_year = 2000,
  verbose = FALSE
)

`object`	a fitted `BradleyTerry2::BTm()` model object
`newdata`	a character or corpus object containing the texts whose readability values will be predicted. If omitted, the fitted values from `object` are used.
`reference_top, reference_bottom`	the lambda values of the reference texts against which each predicted text is compared for difficulty or rescaled. The default value for `reference_top` is the lambda applied to all of `data_corpus_fifthgrade()`; it is used as the baseline value to calculate the probability that a given text is easier, and as the anchor value of 100 to which texts are rescaled. The default value for `reference_bottom` is the lambda of the most difficult text in the State of the Union corpus; it is used as the anchor value of 0 to which texts are rescaled. (See `f999866.csv`.)
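`bootstrap_n`	number of bootstrap replicates used to compute intervals (the `_lo` and `_hi` columns shown in the examples); with the default of 0, no interval columns appear in the output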
`baseline_year`	a scalar or vector of the baseline years to choose for reference: a year ending in 0 from 1790-2000
`verbose`	logical; if `TRUE` print status messages

A data.frame whose row names are the text names, with the following columns:

lambda: estimated lambda for each text
prob: the probability that the text is easier than the reference lambda, which defaults to the lambda applied to all of data_corpus_fifthgrade()
scaled: the lambda rescaled to an "ease" scale anchored at 0 and 100, where 100 and 0 are determined by the fifth grade texts and the hardest text from the State of the Union corpus, respectively, unless specified by the user. Texts easier or harder than the anchors fall outside 0-100, as several of the examples show.
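
The relationship between `lambda`, `prob`, and `scaled` can be sketched in a few lines of R. This is an illustration only: `rescale_lambda()` is a hypothetical helper, not part of the package, and its two formulas (a Bradley-Terry comparison against `reference_top`, and a linear map sending `reference_bottom` to 0 and `reference_top` to 100) are assumptions inferred from the printed output of the `newdata = txts` example, not taken from the package source.

```r
# Hypothetical helper, NOT part of the sophistication package.
# The formulas are assumptions inferred from the example output.
rescale_lambda <- function(lambda,
                           reference_top = -2.1763368548,
                           reference_bottom = -3.865467) {
  # Bradley-Terry probability that the text "beats" (is easier than)
  # the reference text whose lambda is reference_top
  prob <- exp(lambda) / (exp(lambda) + exp(reference_top))
  # Linear rescaling: reference_bottom maps to 0, reference_top to 100
  scaled <- 100 * (lambda - reference_bottom) /
    (reference_top - reference_bottom)
  # Row names are taken from the names of `lambda`, if any
  data.frame(lambda = lambda, prob = prob, scaled = scaled)
}

# Lambdas copied from the `newdata = txts` example
res <- rescale_lambda(c(fifthgrade = -2.128336,
                        `1789-Washington` = -5.494969))
print(res)
```

Under these assumptions, the two rows come out close to the corresponding rows printed in that example. A lambda above `reference_top` gives a `scaled` value above 100, and a lambda below `reference_bottom` gives a negative one.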

## Not run: 
head(predict_readability(data_BTm_bms))
##           lambda       prob   scaled
## 100014 -3.296731 0.24593816 60.51612
## 100028 -3.190470 0.26617180 64.26088
## 100029 -3.719532 0.17607128 45.61617
## 100033 -4.703668 0.07396423 10.93416
## 100034 -3.289739 0.24723716 60.76252
## 100045 -2.780185 0.35346383 78.71976

txts <- c(fifthgrade = paste(as.character(data_corpus_fifthgrade), collapse = "  "),
          data_corpus_inaugural[c(1:2, 9:10, 54:58)])
predict_readability(data_BTm_bms, newdata = txts)
##                    lambda       prob    scaled
## fifthgrade      -2.128336 0.51199792 102.84175
## 1789-Washington -5.494969 0.03493749 -96.46991
## 1793-Washington -2.852801 0.33705102  59.95195
## 1821-Monroe     -3.629638 0.18949402  13.96156
## 1825-Adams      -4.138627 0.12321942 -16.17163
## 2001-Bush       -2.273380 0.47575815  94.25482
## 2005-Bush       -2.583155 0.39967525  75.91551
## 2009-Obama      -2.529601 0.41259115  79.08604
## 2013-Obama      -2.747889 0.36087883  66.16295
## 2017-Trump      -2.359702 0.45428669  89.14440

years <- c(2000, as.integer(substring(names(txts)[-1], 1, 4)))
predict_readability(data_BTm_bms, newdata = txts, baseline_year = years)
##                    lambda       prob    scaled
## fifthgrade      -2.128338 0.51199736 102.84162
## 1789-Washington -5.494972 0.03493741 -96.47004
## 1793-Washington -2.852803 0.33705052  59.95182
## 1821-Monroe     -3.629640 0.18949368  13.96143
## 1825-Adams      -4.138629 0.12321918 -16.17177
## 2001-Bush       -2.273383 0.47575759  94.25469
## 2005-Bush       -2.583158 0.39967471  75.91538
## 2009-Obama      -2.529603 0.41259061  79.08591
## 2013-Obama      -2.747891 0.36087832  66.16282
## 2017-Trump      -2.359704 0.45428614  89.14426

names(txts) <- gsub("ington", "", names(txts))
pr <- predict_readability(data_BTm_bms, newdata = txts[c(1:3, 9:10)], bootstrap_n = 100)
format(pr, digits = 4)
##            lambda    prob scaled lambda_lo lambda_hi  prob_lo prob_hi scaled_lo scaled_hi
## fifthgrade -2.172 0.50105 100.15    -2.210    -2.135 0.491591 0.51032     98.81   101.455
## 1789-Wash  -5.676 0.02931 -23.35    -6.917    -4.870 0.008664 0.06336    -67.06     5.076
## 1793-Wash  -3.560 0.20036  51.22    -4.524    -2.609 0.087218 0.39358     17.25    84.765
## 2013-Obama -2.791 0.35107  78.35    -2.914    -2.645 0.323485 0.38483     74.00    83.469
## 2017-Trump -2.381 0.44904  92.79    -2.511    -2.213 0.417178 0.49080     88.22    98.704

predict_readability(data_BTm_bms, newdata = "The cat in the hat ate green eggs and ham.")
##      lambda      prob   scaled
## 1 -1.125721 0.7408932 137.0248


## End(Not run)