View source: R/generate_initial_ranking.R
generate_initial_ranking — R Documentation
Given a consistent set of pairwise preferences, generate a complete ranking of items which is consistent with the preferences.
generate_initial_ranking(
  tc,
  n_items = max(tc[, c("bottom_item", "top_item")]),
  cl = NULL,
  shuffle_unranked = FALSE,
  random = FALSE,
  random_limit = 8L
)
tc | A dataframe with pairwise comparisons, as returned from generate_transitive_closure().
n_items | The total number of items. If not provided, it is assumed to equal the largest item index found in tc.
cl | Optional computing cluster used for parallelization, as returned from parallel::makeCluster(). Defaults to NULL.
shuffle_unranked | Logical specifying whether or not to randomly permute unranked items in the initial ranking. When shuffle_unranked = TRUE and random = FALSE, the unranked items of each assessor are randomly permuted; otherwise the initial ranking is deterministic. Defaults to FALSE.
random | Logical specifying whether or not to use a random initial ranking. Defaults to FALSE.
random_limit | Integer specifying the maximum number of items allowed when all possible orderings are computed, i.e., when random = TRUE. Defaults to 8L.
A matrix of rankings which can be given in the rankings argument to compute_mallows.
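Each row of the returned matrix is one assessor's ranking, so it must be a permutation of 1:n_items. The following sketch shows such a sanity check; the matrix init is a hypothetical stand-in for the function's output, not actual package output:

```r
# 'init' is a hypothetical stand-in for the matrix returned by
# generate_initial_ranking(): one row per assessor, one column per item.
init <- rbind(c(2, 1, 3), c(3, 2, 1))

# Each row should be a permutation of 1:n_items.
valid <- all(apply(init, 1, function(x) setequal(x, seq_along(x))))
valid  # TRUE
```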
Setting random=TRUE means that all possible orderings of each assessor's preferences are generated, and one of them is picked at random. This can be useful when experiencing convergence issues, e.g., if the MCMC algorithm does not mix properly. However, finding all possible orderings is a combinatorial problem, which may be computationally very hard. The result may not even fit in memory, which may cause the R session to crash. When using this option, please try to increase the size of the problem incrementally, by starting with smaller subsets of the complete data. An example is given below.
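To see why random=TRUE scales badly, note that with few or no stated preferences the number of admissible orderings approaches factorial(n_items). A quick back-of-the-envelope check in base R:

```r
# The number of orderings of n unconstrained items grows factorially.
factorial(8)   # 40320, still manageable (the default random_limit is 8L)
factorial(15)  # about 1.3e12, far too many to enumerate
```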
As detailed in the documentation to generate_transitive_closure, it is assumed that the items are labeled starting from 1. For example, if a single comparison of the following form is provided, it is assumed that there is a total of 30 items (n_items=30), and the initial ranking is a permutation of these 30 items consistent with the preference 29<30.
assessor | bottom_item | top_item
1        | 29          | 30
If in reality there are only two items, they should be relabeled to 1 and 2, as follows:
assessor | bottom_item | top_item
1        | 1           | 2
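One way to carry out such a relabeling is to map the observed item labels to consecutive integers with match(). This is a sketch using a toy dataframe; it is not a helper provided by the package:

```r
# Toy preference data with items labeled 29 and 30.
prefs <- data.frame(assessor = 1, bottom_item = 29, top_item = 30)

# Map the observed item labels to 1, 2, ...
items <- sort(unique(c(prefs$bottom_item, prefs$top_item)))
prefs$bottom_item <- match(prefs$bottom_item, items)
prefs$top_item <- match(prefs$top_item, items)
prefs  # assessor = 1, bottom_item = 1, top_item = 2
```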
Other preprocessing: estimate_partition_function(), generate_constraints(), generate_transitive_closure(), obs_freq, prepare_partition_function()
# The example dataset beach_preferences contains pairwise preferences of beaches.
# We must first generate the transitive closure
beach_tc <- generate_transitive_closure(beach_preferences)
# Next, we generate an initial ranking
beach_init <- generate_initial_ranking(beach_tc)
# Look at the first few rows:
head(beach_init)
# We can add more informative column names in order
# to get nicer posterior plots:
colnames(beach_init) <- paste("Beach", seq_len(ncol(beach_init)))
head(beach_init)
# By default, the algorithm for generating the initial ranking is deterministic.
# It is possible to randomly permute the unranked items with the argument shuffle_unranked,
# as demonstrated below. This algorithm is computationally efficient, but the option defaults
# to FALSE for backward compatibility.
set.seed(2233)
beach_init <- generate_initial_ranking(beach_tc, shuffle_unranked = TRUE)
head(beach_init)
# It is also possible to pick a random sample among all topological sorts.
# This requires first enumerating all possible sorts, and might hence be computationally
# demanding. Here is an example, where we reduce the data considerably to speed up computation.
small_tc <- beach_tc[beach_tc$assessor %in% 1:6 &
beach_tc$bottom_item %in% 1:4 & beach_tc$top_item %in% 1:4, ]
set.seed(123)
init_small <- generate_initial_ranking(tc = small_tc, n_items = 4, random = TRUE)
# Look at the initial rankings generated
init_small
# For this small dataset, we can also study the effect of setting shuffle_unranked=TRUE
# in more detail. We consider assessors 1 and 2 only.
# First is the deterministic ordering. This one is equal for each run.
generate_initial_ranking(tc = small_tc[small_tc$assessor %in% c(1, 2), ],
n_items = 4, shuffle_unranked = FALSE, random = FALSE)
# Next we shuffle the unranked, setting the seed for reproducibility.
# For assessor 1, item 2 is unranked, and by rerunning the code multiple times,
# we see that element (1, 2) indeed changes ranking randomly.
# For assessor 2, item 3 is unranked, and we can also see that this item changes
# ranking randomly when rerunning the function multiple times.
# The ranked items also change their ranking from one random realization to another,
# but their relative ordering is constant.
set.seed(123)
generate_initial_ranking(tc = small_tc[small_tc$assessor %in% c(1, 2), ],
n_items = 4, shuffle_unranked = TRUE, random = FALSE)
## Not run:
# We now give beach_init and beach_tc to compute_mallows. We tell compute_mallows
# to save the augmented data, in order to study the convergence.
model_fit <- compute_mallows(rankings = beach_init,
preferences = beach_tc,
nmc = 2000,
save_aug = TRUE)
# We can study the acceptance rate of the augmented rankings
assess_convergence(model_fit, parameter = "Rtilde")
# We can also study the posterior distribution of the consensus rank of each beach
model_fit$burnin <- 500
plot(model_fit, parameter = "rho", items = 1:15)
## End(Not run)
## Not run:
# The computations can also be done in parallel
library(parallel)
cl <- makeCluster(detectCores() - 1)
beach_tc <- generate_transitive_closure(beach_preferences, cl = cl)
beach_init <- generate_initial_ranking(beach_tc, cl = cl)
stopCluster(cl)
## End(Not run)