demo/ui.R

library(shiny)

shinyUI(
  fluidPage(
    tags$head(
      tags$link(rel = "stylesheet", type = "text/css", href = "stylesheet.css")
    ),

    fixedRow(
      column(12,
        tags$div(class = "flash",
          textOutput("flash")
        )
      ),
      column(12,
        titlePanel("The Fair Reviewers Problem")
      ),
      column(10, offset = 1,
        p("Ahoy! Have you ever heard of the Mêlée Island Academy for Piracy? It's the world's most popular school for
           young people - traditionally young men - to start their career as pirates. Some years ago the
           academy administration realized that the diversity gap in piracy had steered their field into a
           dire situation: The piracy economy is constantly shrinking. There's a huge lack of good ideas, innovation
           and personnel."),
        p("But the academy leaders want to fight these stormy waves and turn their ship around:
           Let's get more underrepresented people into piracy! As a first step, they began offering
           special scholarships to motivate members of underrepresented groups to enter a career in piracy.
           Unfortunately, they have a lot of applicants and only a few battle-tested reviewers - so
           they had to accept that not every reviewer can review every application."),
        p("After some years of practice, they had to face a serious concern: The reviewers rate
           very differently. Some of them are very generous: They often award the highest scores,
           whereas other reviewers are quite the opposite: They mostly grant low scores. That's why the
           academy administration believes that the competition is not fair enough, because it is biased by the strictness levels
           of their reviewers. They decided to ask the Caribbean's most excellent math experts, the Monkey Math Association™, for
           advice: Can they fix the situation?"),
        hr()
      ),
      column(10, offset = 1,
        h2("Modeling strictness - a naive approach"),
        uiOutput("MathJax"),
        p("First of all, we need to specify what the term 'rating' means. Given a set of
          documents (the applications), namely \\(\\mathcal{D}\\), and a set of reviewers, namely \\(\\mathcal{R}\\), the process of rating is a function"),
        p("$$

            \\operatorname{rat} : \\mathcal{D} \\times \\mathcal{R} \\rightarrow \\mathbb{R}

        $$"),
        p("that takes a document and a reviewer and assigns a point value to the pair. For the sake of simplicity
          we restrict ourselves to a discrete rating scale of integers from \\(1\\) to \\(10\\), so
          that \\(\\operatorname{im}(\\operatorname{rat}) = \\{ 1, 2, \\ldots , 10 \\} \\subset \\mathbb{N} \\)."),
        p("Now we need some tools from probability theory to have a proper language for our problem. As we want to
          talk about the chances of certain rating values in different situations, we use a concept named 'random variable' to
          model our problem. A random variable is a mapping that maps specific outcomes of an 'experiment' to
          a measurable space in a consistent manner. A classical example of a random variable is the number of points when
          rolling a die. Each outcome (= position of the die in spacetime) is
          mapped to a number, i.e. the number on the face of the die that points upwards. In our case the 'measurable space' could be
          \\(\\mathbb{R}\\), which contains the rating scale. So let's start tinkering and find some random variables! Let \\(X\\) be the random variable that encodes the overall probability of an
          application to get a specific rating, that means:"),
        p("$$

            p_X(X = v) := \\text{The probability of an arbitrary unknown application to get rating points } v

        $$"),
        p("To obey the rules of a random variable: \\(X\\) is a mapping of the reviewer's mental process of reading and judging an application to a number, namely the points
          assigned within the rating scale. The crucial point is: If we knew everything about \\(X\\) for sure, we could stop: We would have solved our problem. But we don't. We
          need to keep going. Since we are working with a real-life dataset, there is a possible way out: We are going to approximate the
          probabilities for \\(X\\) by:"),
        p("$$

            p_X(X = v) \\approx \\frac{\\text{Number of ratings that assigned points $v$ to a document}}{\\text{Number of all ratings}}

        $$"),
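        # Worked example with made-up numbers (an illustration, not data from the app).
        p("As a purely hypothetical illustration (the numbers are made up, not taken from the dataset): if the dataset
          contained \\(20\\) ratings in total and \\(4\\) of them assigned \\(7\\) points, we would estimate"),
        p("$$

            p_X(X = 7) \\approx \\frac{4}{20} = 0.2

        $$"),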
        p("Our random variable \\(X\\) also has some natural siblings: other closely related random variables. For example, if
          we fix a specific reviewer \\(l \\in \\mathcal{R}\\) we can observe all ratings that \\(l\\) has submitted and
          define a random variable \\(X_l\\) by"),
        p("$$

            p(X_l = v) := \\frac{\\text{Number of ratings of reviewer $l$ that assigned point value $v$ to a document}}{\\text{Number of all ratings $l$ submitted}}

        $$"),
        p("On the other hand, if we choose a fixed document \\(d\\) we can also observe the ratings that \\(d\\) was involved in:"),
        p("$$

            p(X_d = v) := \\frac{\\text{Number of ratings that assigned point value $v$ to document $d$}}{\\text{Number of all ratings $d$ got}}

        $$"),
        p("Note that we will denote by \\(\\mathcal{R}_d\\) the set of reviewers that have reviewed the fixed document \\(d\\) and
          by \\(\\mathcal{D}_l\\) the set of documents that have been reviewed by a fixed reviewer \\(l\\). If you write down
          all ratings in a table like this:"),
        dataTableOutput("table0"),
        p(" then \\(X_l\\) corresponds to a reviewer column whereas \\(X_d\\) corresponds to a document row."),
        br(),
        p("We will now work with these random variables and compute some quantities related to them. The most prominent examples are
          the arithmetic means of \\(X_d\\) and \\(X_l\\), together with the overall mean \\(\\overline{X}\\) taken over all documents:"),
        p("$$

            \\overline{X}_d := \\frac{1}{|\\mathcal{R}_d|} \\sum_{l \\in \\mathcal{R}_d} \\operatorname{rat}(d, l)

        $$"),
        p("$$

            \\overline{X}_l := \\frac{1}{|\\mathcal{D}_l|} \\sum_{d \\in \\mathcal{D}_l} \\operatorname{rat}(d, l)

        $$"),
        p("$$

          \\overline{X} := \\frac{1}{|\\mathcal{D}|} \\sum_{d \\in \\mathcal{D}} \\overline{X}_d

        $$"),
        br(),
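        # Worked example; the ratings below are made up for illustration.
        p("As a made-up example (the numbers are assumptions, not taken from the dataset): if a document \\(d\\) was
          rated \\(4\\), \\(6\\) and \\(8\\) by its three reviewers, then"),
        p("$$

            \\overline{X}_d = \\frac{4 + 6 + 8}{3} = 6

        $$"),
        p("Note that \\(\\overline{X}\\) is the mean of the document means. With only two documents whose means
          are \\(6\\) and \\(8\\) we would get \\(\\overline{X} = \\frac{6 + 8}{2} = 7\\), no matter how many ratings
          each document received."),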
        p("With these ingredients we could express a first quantity that encodes strictness: For a fixed reviewer \\(l\\)
           let's define the arithmetic strictness factor \\(S_l\\) by"),
        p("$$

            S_l := \\frac{\\overline{X}}{\\overline{X}_l}

        $$"),
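        # Worked example with made-up means, to show how the factor reads.
        p("To get a feeling for this factor, here is a small made-up example (the values are assumptions chosen for
          illustration only). Suppose the overall mean is \\(\\overline{X} = 6\\). A reviewer \\(a\\) with mean
          \\(\\overline{X}_a = 8\\) and a reviewer \\(b\\) with mean \\(\\overline{X}_b = 4\\) then get"),
        p("$$

            S_a = \\frac{6}{8} = 0.75, \\qquad S_b = \\frac{6}{4} = 1.5

        $$"),
        p("So \\(S_l < 1\\) indicates a reviewer who rates above the overall average (lenient), whereas \\(S_l > 1\\)
          indicates one who rates below it (strict)."),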
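        p("The same idea can be phrased with expected values instead of arithmetic means. For a discrete random variable
          \\(X\\) that takes the values \\(x_i\\) with probabilities \\(p_i\\), the expected value is"),
        p("$$

            \\operatorname{E}[X] = \\sum_{i=1}^\\infty x_i p_i

        $$"),
        p("On our finite rating scale this reduces to"),
        p("$$

          \\operatorname{E}[X] = \\sum_{x \\in \\operatorname{scale}} x p(x)

        $$"),
        p("which leads to a second strictness factor:"),
        p("$$

          S^*_l := \\frac{\\operatorname{E}[X]}{\\operatorname{E}[X_l]}

        $$")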
      ),
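      column(10, offset = 1,
        # Worked example with a hypothetical distribution (all numbers are assumptions).
        p("A small made-up example for the expected value on our scale (the probabilities are assumptions for
          illustration): suppose a reviewer \\(l\\) only ever uses three values, with \\(p(X_l = 6) = 0.2\\),
          \\(p(X_l = 8) = 0.5\\) and \\(p(X_l = 10) = 0.3\\). Then"),
        p("$$

            \\operatorname{E}[X_l] = 6 \\cdot 0.2 + 8 \\cdot 0.5 + 10 \\cdot 0.3 = 8.2

        $$"),
        p("If the overall expected value were \\(\\operatorname{E}[X] = 6.15\\), dividing it by
          \\(\\operatorname{E}[X_l]\\) would give \\(\\frac{6.15}{8.2} = 0.75\\), so this reviewer would count as lenient.")
      ),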
      column(10, offset = 1,
        h2("Do not reinvent the wheel: Statistics has a better way for us!"),
        p("Statistics already offers a standard tool for this: the z-transform, also known as the 'standard score'."),
        p("Idea: Assume that each reviewer column \\(X_l\\) is normally distributed. Then compute, with \\(X = X_l\\),"),
        p("$$
          Z = \\frac{X - \\operatorname{E}[X]}{\\sigma[X]}
        $$")
      ),
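      column(10, offset = 1,
        # Worked example for the standard score; means and deviations are made up.
        p("A made-up example (all numbers are assumptions for illustration) shows why the standard score makes ratings
          comparable. Suppose reviewer \\(a\\) has \\(\\operatorname{E}[X_a] = 8\\) and \\(\\sigma[X_a] = 1\\), while
          reviewer \\(b\\) has \\(\\operatorname{E}[X_b] = 4\\) and \\(\\sigma[X_b] = 2\\). A rating of \\(9\\) by
          \\(a\\) and a rating of \\(6\\) by \\(b\\) then yield"),
        p("$$
          \\frac{9 - 8}{1} = 1, \\qquad \\frac{6 - 4}{2} = 1
        $$"),
        p("Both ratings lie one standard deviation above their reviewer's own average, so they receive the same
          standard score although the raw points differ.")
      ),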
      column(10, offset = 1,
        h2("Comparison of standard score and strictness factors"),
        p("TODO: Compute some values :)")
      ),
      column(10, offset = 1,
        dataTableOutput("table1")
      )
    )
  )
)
neumanrq/fairreviewers documentation built on May 24, 2019, 5:06 a.m.