knitr::opts_chunk$set(echo = TRUE) library("hyper2") library("magrittr") options("digits" = 5) abbreviated <- TRUE # change to FALSE for full names
knitr::include_graphics(system.file("help/figures/hyper2.png", package = "hyper2"))
To cite the hyper2
package in publications, please use @hankin2017_rmd.
The file eurovision.txt
, used below, is copied from "Eurovision Song
Contest 2009," Wikipedia, accessed May 13, 2018. It refers to
semi-final 1. More documentation is given in eurovision.Rd
[type
help(euro2009)
at the R
prompt].
First we specify the matrix as appearing in the Wikipedia page:
eurovision_table <- as.matrix(read.table("eurovision.txt")) eurovision_table
Each row corresponds to a contestant and each column to a judge. Note
that the matrix is not square: the last two columns correspond to
Germany and the UK, who were judges but not competitors. I have
removed the first column from the Wikipedia table (which gave the
total of the points), replaced self-voting entries with NA
, and
replaced blanks with 0
. Thus the first row corresponds to votes
cast for ME
, Montenegro. We see that BY
(Belarus) awarded them
three points, AM
(Armenia) five points, and so on. The first column
corresponds to the points awarded by Monetenegro. We see that they
placed gave Belarus (BY
) two points, Armenia (AM
) four points, and
so on. Their favourite was Bosnia and Herzegovina (BA
) to whom they
gave twelve points. The points system used was:
points <- c(12,10,8,7,6,5,4,3,2,1)
Variable points
gives the number of points awarded, under Eurovision
rules, to voters' first, second, third, etc choice. Points for any
competitor are added and the winner is the competitor with the most
points. However, in the hyper2
model, the numerical values
themselves do not affect the likelihood function; only the order of
the voters' preferences matters. The following R idiom translates
wiki_matrix
into a form suitable for analysis with hyper2
:
preference <- eurovision_table*0 for(i in seq_along(points)){ preference[eurovision_table == points[i]] <- i } countries <- data.frame( fullname = c("Montenegro", "Czech rep", "Belgium", "Belarus", "Sweden", "Armenia", "Andorra", "Switzerland", "Turkey", "Israel", "Bulgaria", "Iceland", "Macedonia", "Romania", "Finland", "Portugal", "Malta", "Bosnia Herz", "Germany", "UK"), abbreviation = c("ME","CZ","BE", "BY", "SW", "AM", "AD", "CH", "TR", "IL", "BG", "IS", "MK", "RO", "FI", "PT", "MT", "BA", "DE", "UK") ) if(abbreviated){ jj <- countries$abbreviation } else { jj <- countries$fullname } competitors <- as.character(jj[1:18]) colnames(preference) <- jj # voters; 20 countries (18 + DE + UK) rownames(preference) <- competitors rownames(eurovision_table) <- jj[1:18] colnames(eurovision_table) <- jj
In the above, matrix preference
records voters' first, second,
third, etc choice. A zero entry means no points (nul punkte!) and
NA
means that voter was forbidden from voting for that player
(countries cannot vote for themselves). The competitors were the
first 18 countries (the last two countries,
Germany and the UK, voted but did not compete).
preference
Now, take the first column of preference
. This represents the order
of preferences of Montenegro (ME
). Their favourite was [last row]
Bosnia & Herzegovina (BA
), who they gave rank 1 to (that is, 12
points). Their second favourite was Macedonia (MK
), their third was
Turkey (TR
), and so on. They were not allowed to vote for themselves,
which is why the first row is NA
. So the order was: BA
(first),
MK
(second) , TR
(third), IS
(fourth), RO
(fifth), IL
(sixth), AM
(seventh), FI
(eigthth), BY
(ninth), MT
(tenth).
The other countries (CZ
, BE
, SW
, CH
, BG
, PT
) came joint
last and did not receive any points from Montenegro. Note the final
two rows corresponding to votes cast by Germany and the UK, who did
not compete and therefore cast votes for all competitors.
We need to convert matrix preference
into a likelihood function,
here eurovision
:
eurovision <- hyper2() for(i in seq_len(ncol(preference))){ # cycle through the rows; each row is a voter d <- preference[,i,drop=TRUE] eurovision <- eurovision + ordervec2supp(d[!is.na(d)]) } # i loop closes
eurovision_maxp <- maxp(eurovision) eurovision_maxp
pie(eurovision_maxp)
Observe the small estimated strength for Czechoslovakia (CZ
) at
about 1e-06
. This is consistent with the second row of preference
which shows that noone gave them any points.
consistency(eurovision)
Now we can check null of equal strengths:
equalp.test(eurovision)
Thus the difference is about 122 units of support, surely significant.
points_scored <- rowSums(eurovision_table,na.rm=TRUE) # points ox <- rank(-points_scored) oy <- rank(-eurovision_maxp) # m = MLE oyp <- ordertrans(oy,names(ox)) par(pty='s') # square plot plot(ox,oyp,asp=1,pty='s',xlim=c(0,18),ylim=c(0,18),pch=16, xlab="official order",ylab="my order",main="Eurovision 2009") par(xpd=TRUE) # allow country names to appear outside plotting region for(i in seq_along(ox)){text(ox[i],oyp[i],names(ox)[i],pos=4,col='gray') } par(xpd=FALSE) # stop diagonal line from protruding beyond plotting region abline(0,1)
hyper3
idiomFirst we use hyper2
:
a <- read.table("euro_ranks.txt",header=TRUE) jj <- table(a$country) wanted_countries <- names(jj[jj >= 10]) a <- a[a$country %in% wanted_countries,] H <- hyper2() for(y in unique(a$year)){ H <- H + race((a$country)[a$year == y]) } summary(H)
Find the evaluate:
mH <- maxp(H)
dotchart(mH)
Now use hyper3
and follow RB_BB_LF()
in
plackett_luce_monster.Rmd
. The blocs follow those of Price 2020.
scandinavia <- c("Denmark","Finland","Iceland","Norway","Sweden") balkans <- c("Italy","Serbia","Albania","BosHerz") # red eastern <- c("Russia","Ukraine","Moldova","Armenia","Azerbaijan", "Malta","Israel") # purple black <- c("Hungary","Romania","Greece","Cyprus") # orange iberia <- c("Spain","Germany") # blue northern <- c("Denmark", "Estonia", "Finland", "France", "Iceland", "Lithuania", "Netherlands", "Norway", "Sweden", "UK") # green l <- list(balkans=balkans,eastern=eastern,black=black,iberia=iberia,northern=northern) ab <- c(balkans,eastern,black,iberia,northern) stopifnot(all(ab %in% wanted_countries)) stopifnot(all(wanted_countries %in% ab)) list2ec <- function(l){ jj <- stack(setNames(l,seq_along(l))) out <- as.integer(c(jj$ind)) names(out) <- jj$values return(out) } bloc <- function(lam){ out <- hyper3() e <- list2ec(l) for(y in unique(a$year)){ v <- a$country[a$year==y] ec <- e[names(e) %in% v] out <- out + cheering3(v=v, e=ec,h=rep(lam,length(unique(e)))) } return(out) }
o <- function(lam){ # maximize function o() H <- bloc(lam) return(maxp(H,give=TRUE,n=1)$likes) } ```r lam <- seq(from=0.6,to=2,len=20) L <- sapply(lam,o)
plot(lam,L-max(L),type="b",pch=16) abline(h=c(0,-2)) abline(v=1)
Following lines create eurovision.rda
, residing in the data/
directory of the package.
save(eurovision_table,eurovision_maxp,eurovision,file="eurovision.rda")
A. Price 2020. "Identifying voting blocs in the Eurovision Song Contest". Towards Data Science,
https://medium.com/towards-data-science/identifying-voting-blocs-in-the-eurovision-song-contest-4792065fc337
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.