btfit: Fits the Bradley-Terry model
In BradleyTerryScalable: Fits the Bradley-Terry Model to Potentially Large and Sparse Networks of Comparison Data

Description Usage Arguments Details Value Author(s) References See Also Examples

btfit fits the Bradley-Terry model on (potentially) large and sparse datasets.

1 2	btfit(btdata, a, MAP_by_component = FALSE, subset = NULL, maxit = 10000, epsilon = 0.001)

`btdata`	An object of class "btdata", typically the result ob of ob <- btdata(..). See `btdata`.
`a`	Must be >= 1. When `a = 1`, the function returns the maximum likelihood estimate (MLE) of π (by component, if necessary). When `a > 1`, `a` is the shape parameter for the Gamma prior. See Details.
`MAP_by_component`	Logical. Only considered if a > 1. Then, if FALSE, the MAP estimate will be found on the full dataset. If TRUE, the MAP estimate will be found separately for each fully-connected component.
`subset`	A condition for selecting a subset of the components. This can either be a character vector of names of the components, a single predicate function (that takes a component as its argument), or a logical vector of the same length as the number of components).
`maxit`	The maximum number of iterations for the algorithm. If returning π by component, this will be the maximum number of iterations for each component.
`epsilon`	Determines when the algorithm is deemed to have converged. (See Details.)

Let there be K items, let π_k be the Bradley-Terry strength parameter of item k, for k = 1, …, K and let π be the vector of all the π_k. Let w_{ij} be the number of times item i wins against item j, let n_{ij} = w_{ij} + w_{ji} be the number of times they play, with w_{ii} = 0 by convention and let W_i = ∑_{j=1}^K w_{ij}. Then the Bradley-Terry model states that the probability of item i beating item j, p_{ij}, is:

p_{ij} = \frac{π_i}{π_i + π_j}.

The comparison graph, G_W, has the K players as the nodes and a directed edge from node i to node j whenever item i has beaten item j at least once. The MLE of the Bradley-Terry model exists and is finite if and only if the comparison graph is fully-connected (i.e. if there is a directed path from node i to node j for all items i and j).

Assuming that the comparison graph of the data is fully-connected, the MLE of the Bradley-Terry model can be found using the MM-algorithm (Hunter, 2004).

If the comparison graph of the data is not fully-connected, there are two principled options for fitting the Bradley-Terry model. One is to find the MLE within each fully-connected component. The other is to find the Bayesian MAP estimate, as suggested by Caron & Doucet (2012), where a Gamma(a,b) gamma prior is placed on each π_i, and the product of these is taken as a prior on π. The MAP estimate can then be found with an EM-algorithm. When a = 1 and b = 0, the EM and MM-algorithms are equivalent and the MAP estimate and MLE are identical. The rate parameter of the Gamma prior, b, is not likelihood identifiable. When a > 1, b is set to aK - 1, where K is the number of items in the component; this choice of b minimises the number of iterations needed for the algorithm to converge.

The likelihood equations give

a - 1 + W_i = bπ_i + ∑_{j \neq i} \frac{n_{ij}π_i}{π_i + π_j},

for i = 1, …, K. For the algorithm to have converged, we want π to be such that the LHS and RHS of this equation are close for all i. Therefore, we set the convergence criteria as

≤ft|\frac{a - 1 + W_i}{bπ_i + ∑_{j \neq i} \frac{n_{ij}π_i}{π_i + π_j}} - 1\right| < ε,

for all i.

Since the equations do not typeset well within the R help window, we recommend reading this section online: https://ellakaye.github.io/BradleyTerryScalable/reference/btfit.html.

btfit returns an S3 object of class "btfit". It is a list containing the following components:

`call`	The matched call
`pi`	A list of length M, where M is the number of fully-connected components of the comparison graph G_W (or the requested subset) of two or more items. The m-th list item is a named vector π, the strength parameter, for the items in the m-th fully connected component, m = 1, …, M. These are sorted in descending order.
`iters`	A vector of length M. The m-th entry is the number of iterations it took for the algorithm to converge for the m-th component, for m = 1, …, M. Note that if the algorithm has not converged in any component, a warning will be produced.
`converged`	A logical vector of length M, indicating whether the algorithm has converged for the m-th component in `maxit` iterations.
`N`	A list of length M. The m-th list item is a matrix where each element n_{ij} is the number of times item i played against item j, for the items in the m-th component. The rows and columns are arranged in the same order as the ordered pi vector(s).
`diagonal`	A list of length M. The m-th item is a vector of the diagonal elements of the `btdata$wins` matrix, for the items in the m-th fully-connected component. These values are used as the fitted values for the diagonal of the matrix output in `fitted.btfit`.
`names_dimnames`	The names of the dimnames of the original `btdata$wins` matrix.

Ella Kaye, David Firth

Caron, F. and Doucet, A. (2012) Efficient Bayesian Inference for Generalized Bradley-Terry Models. Journal of Computational and Graphical Statistics, 21(1), 174-196.

Hunter, D. R. (2004) MM Algorithms for Generalized Bradley-Terry Models. The Annals of Statistics, 32(1), 384-406.

btdata, summary.btfit, coef.btfit, fitted.btfit, btprob, vcov.btfit, simulate.btfit

citations_btdata <- btdata(BradleyTerryScalable::citations)
fit1 <- btfit(citations_btdata, 1)
summary(fit1)
toy_df_4col <- codes_to_counts(BradleyTerryScalable::toy_data, c("W1", "W2", "D"))
toy_btdata <- btdata(toy_df_4col)
fit2a <- btfit(toy_btdata, 1)
summary(fit2a)
fit2b <- btfit(toy_btdata, 1.1)
summary(fit2b)
fit2c <- btfit(toy_btdata, 1, subset = function(x) length(x) > 3)
summary(fit2c)

$call
btfit(btdata = citations_btdata, a = 1)

$item_summary
# A tibble: 4 x 3
  component    item         estimate
  <chr>        <chr>           <dbl>
1 full_dataset JRSS-B          1.06 
2 full_dataset Biometrika      0.790
3 full_dataset JASA            0.310
4 full_dataset Comm Statist   -2.16 

$component_summary
# A tibble: 1 x 4
  component    num_items iters converged
  <chr>            <int> <int> <lgl>    
1 full_dataset         4     2 TRUE     

$call
btfit(btdata = toy_btdata, a = 1)

$item_summary
# A tibble: 7 x 3
  component item  estimate
  <chr>     <chr>    <dbl>
1 2         Han     0.696 
2 2         Gal     0.413 
3 2         Fin    -1.11  
4 3         Cyd     0.592 
5 3         Amy     0.0325
6 3         Ben    -0.243 
7 3         Dan    -0.382 

$component_summary
# A tibble: 2 x 4
  component num_items iters converged
  <chr>         <int> <int> <lgl>    
1 2                 3     6 TRUE     
2 3                 4    10 TRUE     

$call
btfit(btdata = toy_btdata, a = 1.1)

$item_summary
# A tibble: 8 x 3
  component    item  estimate
  <chr>        <chr>    <dbl>
1 full_dataset Eve     1.90  
2 full_dataset Cyd     0.472 
3 full_dataset Han     0.245 
4 full_dataset Amy    -0.0766
5 full_dataset Gal    -0.102 
6 full_dataset Ben    -0.423 
7 full_dataset Dan    -0.536 
8 full_dataset Fin    -1.48  

$component_summary
# A tibble: 1 x 4
  component    num_items iters converged
  <chr>            <int> <int> <lgl>    
1 full_dataset         8   101 TRUE     

$call
btfit(btdata = toy_btdata, a = 1, subset = function(x) length(x) > 
    3)

$item_summary
# A tibble: 4 x 3
  component item  estimate
  <chr>     <chr>    <dbl>
1 3         Cyd     0.592 
2 3         Amy     0.0325
3 3         Ben    -0.243 
4 3         Dan    -0.382 

$component_summary
# A tibble: 1 x 4
  component num_items iters converged
  <chr>         <int> <int> <lgl>    
1 3                 4    10 TRUE