createCutoffsDF: Create Cutoffs Dataframe

Description

createCutoffsDF is an internal function that creates a data frame of cutoff values: identical cutoffs for all ZIP codes (if type = "unadj"), or ZIP-code-specific quantile cutoffs (if type = "perc" or type = "perc.resolve.ties"). This function is called extensively by the findCutoffs function.

Usage

createCutoffsDF(X, z, gamma, type)

Arguments

X

Numeric matrix of size n x p, where n is the number of restaurants to be graded and p is the number of inspections to be used in grade assignment. Entry X[i,j] represents the inspection score for the ith restaurant in the jth most recent inspection.

z

Character vector of length n representing ZIP codes (or other subunits within a jurisdiction). z[i] is the ZIP code corresponding to the restaurant with inspection scores in row i of X.

gamma

Numeric vector representing absolute grade cutoffs or quantiles, depending on the value of type. Entries in gamma should be increasing, with gamma[1] <= gamma[2] and so on (this relates to the "Warning" section and to larger scores being associated with higher risk). If type = "perc" or type = "perc.resolve.ties", gamma values represent quantiles and should lie between 0 and 1.

type

Character string that is one of "unadj", "perc", or "perc.resolve.ties", and that indicates the grading algorithm to be implemented.

Details

createCutoffsDF takes in a matrix of restaurants' scores and a vector of the restaurants' ZIP codes, and outputs a data frame of cutoff scores to be used in grade classification. The returned data frame has one row for each unique ZIP code and (length(gamma)+1) columns: one column for the ZIP code name and length(gamma) columns of cutoff scores separating the (length(gamma)+1) grading categories. Across each ZIP code's row, cutoff scores increase, and we assume, as in the King County (WA) case, that greater risk is associated with larger inspection scores. (If scores are decreasing in risk, users should transform inspection scores before using functions in the QuantileGradeR package, with a simple function such as f(score) = - score.)
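As a minimal sketch of the transformation mentioned above, the snippet below negates a made-up score matrix in which higher values mean cleaner inspections. The matrix and the names X_quality and X_risk are hypothetical and not part of the package. Any absolute cutoffs in gamma would presumably need to be expressed on the same transformed scale.

```r
# Hypothetical scores where a HIGHER value means a cleaner inspection
# (3 restaurants, 2 inspections each; values are made up).
X_quality <- matrix(c(95, 80, 60,
                      90, 85, 70), nrow = 3)

# Apply f(score) = -score so that larger values now mean greater risk.
X_risk <- -X_quality

# The restaurant with the best quality score in the most recent inspection
# is now the one with the smallest (least risky) transformed score.
same_restaurant <- which.max(X_quality[, 1]) == which.min(X_risk[, 1])
```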

How cutoff scores are calculated for each ZIP code depends on the value of the type variable, which can take one of three values, described under Modes.

Modes

type = "unadj" creates a ZIP code cutoff data frame with the same cutoff scores (meaningful values in a jurisdiction's inspection system that are contained in the vector gamma) for all ZIP codes. This ZIP code data frame can then be used to carry out "unadjusted" grading, in which a restaurant's most recent routine inspection score is compared to these cutoffs.
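The shape of the "unadj" output can be illustrated with a few lines of base R. This is only a sketch under stated assumptions, not the package's implementation: the helper name unadj_cutoffs_sketch, the column names, and the sorting of ZIP codes are all hypothetical.

```r
# Hypothetical helper (not the package's code): repeat the same absolute
# cutoffs in gamma for every unique ZIP code in z.
unadj_cutoffs_sketch <- function(z, gamma) {
  zips <- sort(unique(z))  # assumption: rows ordered by ZIP code
  # One row per ZIP code, one column per cutoff, every row equal to gamma.
  cutoffs <- matrix(gamma, nrow = length(zips), ncol = length(gamma),
                    byrow = TRUE)
  out <- data.frame(zips, cutoffs, stringsAsFactors = FALSE)
  names(out) <- c("zip", paste0("cutoff", seq_along(gamma)))  # assumed names
  out
}

# Made-up ZIP codes and cutoffs, for illustration only.
z <- c("98101", "98101", "98105", "98105", "98105")
unadj_df <- unadj_cutoffs_sketch(z, gamma = c(0, 30))
```

The result has one row per unique ZIP code and length(gamma) + 1 columns, matching the layout described in Details.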

type = "perc" takes in a vector of quantiles, gamma, and returns a data frame of the scores in each ZIP code corresponding to these quantiles (using the "Nearest Rank" definition of quantile).
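A rough sketch of per-ZIP nearest-rank quantiles is shown below. It relies on base R's quantile() with type = 1 (the inverse of the empirical distribution function), which coincides with the nearest-rank definition for probabilities above 0. The helper name perc_cutoffs_sketch and the column names are hypothetical, and the sketch assumes a single score per restaurant; how the package combines the columns of X into one score is not stated here.

```r
# Hypothetical helper (not the package's code): nearest-rank quantile
# cutoffs computed separately within each ZIP code.
perc_cutoffs_sketch <- function(scores, z, gamma) {
  zips <- sort(unique(z))  # assumption: rows ordered by ZIP code
  # One row of cutoffs per ZIP code; type = 1 never interpolates, so every
  # cutoff is a score actually observed in that ZIP code.
  cutoffs <- do.call(rbind, lapply(zips, function(zip) {
    unname(quantile(scores[z == zip], probs = gamma, type = 1))
  }))
  out <- data.frame(zips, cutoffs, stringsAsFactors = FALSE)
  names(out) <- c("zip", paste0("cutoff", seq_along(gamma)))  # assumed names
  out
}

# Made-up data: five restaurants in each of two ZIP codes.
scores <- c(0, 5, 10, 20, 40,   0, 0, 15, 30, 60)
z <- rep(c("98101", "98105"), each = 5)
perc_df <- perc_cutoffs_sketch(scores, z, gamma = c(0.5, 0.9))
```

With five restaurants per ZIP code, the 0.5 quantile is the third smallest score (rank ceiling(0.5 x 5) = 3) and the 0.9 quantile is the largest (rank ceiling(0.9 x 5) = 5), so the cutoffs are 10 and 40 for the first ZIP code and 15 and 60 for the second.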

type = "perc.resolve.ties" also takes in a vector of quantiles, gamma, but accounts for the fact that tied scores exist within ZIP codes. It does not return (for B/C cutoffs, for example) the scores in each ZIP code at which at least (gamma[2] x 100)% of restaurants in the ZIP code score less than or equal to the cutoff. Instead, the A/B cutoff is the score for which the percentage of restaurants in the ZIP code scoring less than or equal to the cutoff is closest to the desired percentage, (gamma[1] x 100)%. Similarly, the B/C cutoff is the score for which the percentage of restaurants in the ZIP code scoring less than or equal to the B/C cutoff and more than the A/B cutoff is closest to the desired percentage, ((gamma[2] - gamma[1]) x 100)%.
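The tie-resolving idea can be sketched for a single ZIP code as follows. This is an illustration built only from the description above, not the package's code: the helper name resolve_ties_sketch is hypothetical, and several details are assumptions, namely that candidate cutoffs are the observed scores, that each cutoff must be strictly greater than the previous one, and that when two candidates are equally close the smaller one is chosen.

```r
# Hypothetical helper (not the package's code): choose each cutoff so that
# the share of restaurants falling in its grade band is as close as
# possible to the desired share.
resolve_ties_sketch <- function(scores, gamma) {
  candidates <- sort(unique(scores))  # assumption: cutoffs are observed scores
  targets <- diff(c(0, gamma))        # desired share per band, e.g. gamma[2] - gamma[1]
  cutoffs <- numeric(length(gamma))
  lower <- -Inf                       # previous cutoff (none before A/B)
  for (k in seq_along(gamma)) {
    eligible <- candidates[candidates > lower]
    if (length(eligible) == 0) {      # assumption: reuse the previous cutoff
      cutoffs[k] <- lower
      next
    }
    # Share of restaurants scoring above the previous cutoff and at or
    # below each candidate cutoff.
    share <- vapply(eligible,
                    function(cut) mean(scores > lower & scores <= cut),
                    numeric(1))
    # which.min returns the first minimum, i.e. the smaller candidate on ties.
    cutoffs[k] <- eligible[which.min(abs(share - targets[k]))]
    lower <- cutoffs[k]
  }
  cutoffs
}

# Made-up scores for one ZIP code, with six restaurants tied at 5.
scores <- c(0, 0, 5, 5, 5, 5, 5, 5, 10, 20)
tie_cutoffs <- resolve_ties_sketch(scores, gamma = c(0.3, 0.9))
```

In this example the nearest-rank 0.3 quantile is 5 (the third smallest score), which would place 80% of restaurants at or below the A/B cutoff. The sketch instead picks 0 as the A/B cutoff (20% at or below, closer to the desired 30%) and 5 as the B/C cutoff (60% in the band, matching the desired 60%).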


QuantileGradeR documentation built on May 2, 2019, 6:41 a.m.