gradeAllBus: Grade Businesses.


Description

gradeAllBus takes a vector of business inspection scores, a vector of business ZIP codes, and a data frame of ZIP code cutoff scores, and returns a vector of business grades.

Usage

gradeAllBus(scores, z, zip.cutoffs)

Arguments

scores

Numeric vector of length n, where n is the number of restaurants to be graded. Each entry is the inspection score for one business.

z

Character vector of length n, where each entry is the ZIP code (or other geographic area) of a business. The order of businesses in z is the same as the order of businesses in scores.

zip.cutoffs

A data frame whose first column contains all of the ZIP codes in z and whose later columns contain the cutoff scores used to classify grades in each ZIP code. Cutoff scores for each ZIP code should be ordered from the smallest cutoff in column 2 (the cutoff for the best grade) to the largest cutoff in the final column (the cutoff inspection score for the second-worst grade).
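As an illustration of the layout described above, here is a minimal sketch of a zip.cutoffs data frame for three grades (A, B, C). The ZIP codes, column names and cutoff values are hypothetical and chosen only to show the shape; in practice the cutoffs would come from findCutoffs.

```r
# Hypothetical cutoffs for three ZIP codes and three grades (A, B, C).
# Column 1: ZIP code; column 2: cutoff for the best grade (A);
# column 3: cutoff for the second-worst grade (B).
zip.cutoffs.example <- data.frame(
  zip      = c("98101", "98102", "98103"),
  cutoff.A = c(0, 5, 2),
  cutoff.B = c(30, 40, 35),
  stringsAsFactors = FALSE
)

# Within each row, cutoffs should increase from left to right.
ordered.ok <- all(zip.cutoffs.example$cutoff.A <= zip.cutoffs.example$cutoff.B)
```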

Details

In our documentation we use the terms "ZIP code" and "restaurant". However, our grading algorithm and code can be applied to other inspected entities, and percentile cutoffs can be sought in subunits of a jurisdiction that are not ZIP codes. For example, it may make sense to search for percentile cutoffs in an inspector's allocated inspection area or within a census tract. We chose ZIP codes because area assignments for inspectors in King County (WA) tend to be single or multiple ZIP codes, and we wanted to assign grades based on how a restaurant's scores compare to those of other restaurants assessed by the same inspector. We could have calculated percentile cutoffs in an inspector's allocated area, but we also wanted a grading system that was readily explainable, and the process for allocating an area to an inspector is non-trivial. Where "ZIP code" is referenced, please read "ZIP code or other subunit of a jurisdiction"; where "restaurant" is referenced, please read "restaurant or other entity to be graded".

gradeAllBus takes a vector of inspection scores (one score per restaurant, which may be a mean or the result of a single inspection), a vector of ZIP codes and a data frame of ZIP code cutoffs. It compares each restaurant's inspection score to the cutoff scores for the restaurant's ZIP code. It finds the smallest cutoff score in that ZIP code that is greater than or equal to the restaurant's inspection score, say the (letter.index)th cutoff score, and returns the (letter.index)th letter of the alphabet as the grade for the restaurant. The returned vector of grades maintains the order of businesses in the input vectors scores and z.
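The comparison described above can be sketched in a few lines of base R. This is an illustrative re-implementation under stated assumptions, not the package source: grade_sketch and the toy data are hypothetical, and the sketch assumes that a score above every cutoff receives the next letter after the last cutoff (the worst grade), which is what the description of zip.cutoffs implies.

```r
# Hypothetical sketch of the grading rule; not the DineSafeR implementation.
grade_sketch <- function(scores, z, zip.cutoffs) {
  stopifnot(length(scores) == length(z))
  # Find the row of zip.cutoffs that matches each business's ZIP code.
  rows <- match(z, as.character(zip.cutoffs[[1]]))
  # n x k matrix: row i holds the k cutoffs for business i's ZIP code.
  cut.mat <- as.matrix(zip.cutoffs[rows, -1, drop = FALSE])
  # Count the cutoffs strictly below each score; adding 1 gives the index
  # of the smallest cutoff that is >= the score (or k + 1 if there is none).
  letter.index <- rowSums(scores > cut.mat) + 1
  unname(LETTERS[letter.index])
}

# Toy data (hypothetical values).
cutoffs <- data.frame(
  zip = c("98101", "98102"),
  A   = c(0, 5),
  B   = c(30, 40),
  stringsAsFactors = FALSE
)
scores <- c(0, 10, 45, 5)
zips   <- c("98101", "98101", "98102", "98102")

grades <- grade_sketch(scores, zips, cutoffs)
# grades is c("A", "B", "C", "A")
```

In this toy example the same score can earn different grades in different ZIP codes: a score of 5 is an A in "98102" (cutoff 5) but would be a B in "98101" (cutoff 0).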

Value

A character vector of length n, with each entry corresponding to the grade that the restaurant received.

Examples

# Adjusted grading (without tie resolution):
zipcode.cutoffs.df <- findCutoffs(X.kc, zips.kc, gamma = c(0, 30))
mean.scores <- rowMeans(X.kc, na.rm = TRUE)
adj.grades <- gradeAllBus(mean.scores, zips.kc, zipcode.cutoffs.df)

# Unadjusted grading (uses the first column of X.kc as the score):
unadj.cutoffs <- findCutoffs(X.kc, zips.kc, gamma = c(0, 30), type = "unadj")
unadj.grades <- gradeAllBus(scores = X.kc[, 1], z = zips.kc, zip.cutoffs = unadj.cutoffs)

# Proportion A/B/C in each ZIP code
# Unadjusted
foo1 <- round(t(table(unadj.grades, zips.kc)) / apply(table(unadj.grades, zips.kc), 2, sum), 2)
# Adjusted
foo2 <- round(t(table(adj.grades, zips.kc)) / apply(table(adj.grades, zips.kc), 2, sum), 2)

# Correlation plots of unadjusted vs. adjusted grade proportions
# in ZIP codes for different grades.
# Point size scales with the number of graded businesses in the ZIP code.
# Proportions A
plot(foo1[, 1], foo2[, 1], xlim = range(cbind(foo1[, 1], foo2[, 1])),
     ylim = range(cbind(foo2[, 1], foo1[, 1])), pch = 16,
     cex = sqrt(apply(table(adj.grades, zips.kc), 2, sum) / pi) * 0.3,
     main = "Proportion A in ZIP Codes", xlab = "Unadjusted", ylab = "Adjusted")
# Proportions B
plot(foo1[, 2], foo2[, 2], xlim = range(cbind(foo1[, 2], foo2[, 2])),
     ylim = range(cbind(foo2[, 2], foo1[, 2])), pch = 16,
     cex = sqrt(apply(table(adj.grades, zips.kc), 2, sum) / pi) * 0.3,
     main = "Proportion B in ZIP Codes", xlab = "Unadjusted", ylab = "Adjusted")
# Proportions C
plot(foo1[, 3], foo2[, 3], xlim = range(cbind(foo1[, 3], foo2[, 3])),
     ylim = range(cbind(foo2[, 3], foo1[, 3])), pch = 16,
     cex = sqrt(apply(table(adj.grades, zips.kc), 2, sum) / pi) * 0.3,
     main = "Proportion C in ZIP Codes", xlab = "Unadjusted", ylab = "Adjusted")

King-County-Restaurant-Grading/DineSafeR documentation built on May 8, 2019, 4:50 p.m.