percentileSeek returns a set of percentiles to be applied across
subunits (e.g. ZIP codes) of a larger area (e.g. a jurisdiction), so as to
rank items within each subunit (e.g. restaurants) and group these items into
grade categories. percentileSeek allows the user to set the desired
global proportion of items in each grade category.
percentileSeek(scores, z, desired.props, restaurant.tol = 10,
  max.iterations = 20, resolve.ties = FALSE)
scores | Numeric vector of size |
z | Character vector representing ZIP codes. |
desired.props | Numeric vector representing desired global grade proportions across the entire jurisdiction. |
restaurant.tol | Integer value representing the maximum difference in number of restaurants suggested by |
max.iterations | Integer value specifying the maximum number of calls of the |
resolve.ties | Boolean value specifying interpretation of how the function's returned percentiles will be applied across subunits. Should as close to (desired.props[1])% of restaurants in a ZIP code receive an "A" grade, and as close to (desired.props[2])% of restaurants in a ZIP code receive "B" grades ( |
In our documentation, we use the language “ZIP code” and “restaurant”; however, our algorithms and code can be applied much more broadly to other inspected or scored entities, and percentile cutoffs can be sought in subunits (of a larger area) that are not ZIP codes. Where “ZIP code” is referenced, please read “ZIP code or other subunit of a larger area”; where “restaurant” is referenced, please read “restaurant or other entity to be graded”.
percentileSeek was designed for situations in which a significant number of
ties in the scores of items within subunits (e.g. ties in restaurant
inspection scores in ZIP codes) result in the obvious choice of percentiles
(namely those obtained from the desired proportions) not yielding the desired
proportions globally. percentileSeek will iterate over different values
for the first percentile (using the update process described in the
updateGamma documentation) until the proportion of (gradeable)
restaurants scoring “A” grades (when ZIP cutoffs are percentile values) is
within (restaurant.tol / no.gradeable.rests) of the desired proportion
of As, where no.gradeable.rests is the number of gradeable restaurants,
and gradeable restaurants are those that have both ZIP code and inspection
score information. The algorithm will then seek a larger percentile to
match the proportion of gradeable restaurants scoring “B” grades with the
desired proportion of Bs, and so on, until the proportions of restaurants
gaining the top (length(desired.props) - 1) grades are within the required
tolerance of their desired proportions. Note: there is thus no requirement
that the proportion of restaurants gaining the worst grade matches the desired
proportion for the worst grade. These can be quite different (depending on the
number of restaurants being graded and the number of grade categories) and no
error will be reported.
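The effect of ties described above can be illustrated with a small sketch. This is not the package's implementation: the helper global_prop_a is hypothetical, it assumes that lower scores are better, and it assumes a restaurant earns an “A” when its score is at or below its ZIP code's p-th quantile computed with quantile(type = 1), which may differ from how the package applies percentiles.

```r
# Hypothetical helper (not part of the package): share of gradeable
# restaurants that would receive an "A" if the same percentile p were
# applied within every ZIP code. Assumes lower scores are better.
global_prop_a <- function(scores, z, p) {
  gradeable <- !is.na(scores) & !is.na(z)
  s <- scores[gradeable]
  zip <- as.character(z[gradeable])
  # Per-ZIP cutoff: the p-th empirical quantile (type 1 = inverse ECDF).
  cutoffs <- tapply(s, zip, quantile, probs = p, type = 1, names = FALSE)
  mean(s <= cutoffs[zip])
}

scores <- c(0, 0, 0, 10, 20,   0, 5, 10, 15, 20)
z      <- c(rep("98101", 5), rep("98102", 5))

# Applying the 20th percentile in each ZIP code gives 40% As overall,
# because three restaurants in 98101 are tied at the best score.
prop <- global_prop_a(scores, z, 0.2)

# Tolerance on the proportion scale, as described in the text:
# restaurant.tol / no.gradeable.rests
restaurant.tol <- 1
no.gradeable.rests <- sum(!is.na(scores) & !is.na(z))
within_tol <- abs(prop - 0.2) <= restaurant.tol / no.gradeable.rests
```

Here within_tol is FALSE: the naive percentile (0.2) overshoots the desired proportion of As by two restaurants, which is the kind of gap the iteration over percentiles is meant to close when a solution exists.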
Of course, percentileSeek can only find a solution if one exists. It
could be the case that it is simply not possible with a particular set of
scores to match the desired proportions. We have included some failsafes to
catch some of the simplest instances in which no solution will exist. For
instance, one possible reason for failure is selecting a desired proportion of
“A” grades that is below the global minimum proportion of “A”s. Totaling the
number of restaurants with the best inspection scores in their ZIP codes and
dividing by the number of gradeable restaurants provides the global minimum
proportion of “A”s. Running percentileSeek can be a useful way to test
whether a solution is likely to exist. If the results reported by
percentileSeek fall outside the standard [0, 1] interval for percentiles, or
if the number of iterations exceeds the maximum number of iterations, this
could indicate that no solution exists.
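The global minimum proportion of “A”s described above can be computed directly as a quick feasibility check. The sketch below is illustrative only: global_min_prop_a is a hypothetical helper, not a function of the package, and it assumes that lower inspection scores are better, so the best score in a ZIP code is its minimum.

```r
# Hypothetical helper (not part of the package): the smallest achievable
# global proportion of "A" grades. Assumes lower scores are better, so the
# "best" score in a ZIP code is its minimum.
global_min_prop_a <- function(scores, z) {
  # Gradeable restaurants have both a score and a ZIP code.
  gradeable <- !is.na(scores) & !is.na(z)
  s <- scores[gradeable]
  zip <- as.character(z[gradeable])
  # ave() returns, for each restaurant, the best score in its ZIP code.
  best_in_zip <- ave(s, zip, FUN = min)
  sum(s == best_in_zip) / length(s)
}

scores <- c(0, 0, 0, 10, 20,   0, 5, 10, 15, 20, NA)
z      <- c(rep("98101", 5), rep("98102", 5), "98103")

min_prop <- global_min_prop_a(scores, z)  # (3 + 1) / 10 = 0.4

# A desired proportion of As below this minimum cannot be met.
desired.props <- c(0.2, 0.5, 0.3)
feasible <- desired.props[1] >= min_prop
```

In this toy data feasible is FALSE, since at least 40% of gradeable restaurants are tied for the best score in their ZIP code while only 20% As are desired.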
An example of when the percentileSeek function could be used outside
of the restaurant context is if you were tasked with finding the top 3 percent
of students in a state. We know that each school has its own GPA system, so
comparing students by raw GPA does not make sense. We might therefore wish to
perform a percentile adjustment at each school and select the top 3 percent of
students at each school. Unfortunately, some schools do not utilize the full
spectrum of GPA scores available, so it may be the case that the top 5
percent of students at school 1 have the same GPA and cannot be
distinguished from one another. Using percentileSeek with each
restaurant replaced by a student, each restaurant's inspection score replaced
by the student's GPA, and each ZIP code replaced by a school, we could
investigate whether it is possible to satisfy the 3 percent globally desired
proportion. percentileSeek would reduce the percentile applied across
schools (from the initial 3 percent), which would still select the top 5
percent of students at school 1 for nomination, but would try to take advantage
of the fact that some schools do use more of their GPA scale. Of course, issues
of fairness do arise, and one wonders why school 2, which distinguishes its
students better than school 1, should have fewer students represented in the
globally selected 3 percent. We only advocate the use of percentileSeek
for situations in which there is good reason to demand certain global
proportions. In the school selection case, this may be that there are only
finite resources available to be given to the top 3 percent of students and it
is simply not possible to extend these resources to the top 3 percent of
students at each school. In the restaurant case, we desire to select the top
restaurants in each ZIP code to be assigned an “A” grade; however, we also do
not want to design a grading system that is seen to inflate grades compared to
an unadjusted grading system (one based on absolute uniform grade cutoffs
across the whole jurisdiction).
A numeric vector with the percentiles to be applied to each ZIP code so as to achieve the desired proportion of grades.