These data come from publicly available aggregated test scores for a large midwestern state. Each row pairs the scores for a school in grade X (time t) with the scores for the same school in grade X+1 (time t+1). Regression diagnostics and results from a predictive model of test scores in grade X+1 are also included.

A data frame with 19985 observations on the following 16 variables.

`district_id`

a numeric vector

`school_id`

a numeric vector

`subject`

a factor with levels `math` and `read`, representing the subject of the test scores in the row

`grade`

a numeric vector

`n1`

a numeric vector for the count of students in the school and grade in t

`ss1`

a numeric vector for the mean scale score in t

`n2`

a numeric vector for the count of students in the school and grade in t+1

`ss2`

a numeric vector for the mean scale score in t+1

`predicted`

a numeric vector of the predicted ss2 for this observation

`residuals`

a numeric vector of residuals from the predicted ss2

`resid_z`

a numeric vector of standardized residuals

`resid_t`

a numeric vector of studentized residuals

`cooks`

a numeric vector of Cook's D for the residuals

`test_year`

a numeric vector representing the year the test was taken

`tprob`

a numeric vector representing the probability of a residual appearing

`flagged_t95`

a numeric vector

These data were fit with a statistical model by a large newspaper to investigate unusual gains in test scores. Fifty separate models were fit, one for each unique combination of grade, year, and subject.
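The per-group fitting described above can be sketched as follows. This is only an illustration: the model specification used by the newspaper is not stated here, so the sketch assumes a simple least-squares regression of `ss2` on `ss1` within each grade, year, and subject group. The function names `fit_group` and `fit_all` and the toy rows are hypothetical and are not part of the dataset.

```python
from collections import defaultdict
from math import sqrt


def fit_group(rows):
    """Fit ss2 ~ ss1 by ordinary least squares for one group (assumed model)."""
    n = len(rows)
    xs = [r["ss1"] for r in rows]
    ys = [r["ss2"] for r in rows]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    predicted = [intercept + slope * x for x in xs]
    residuals = [y - p for y, p in zip(ys, predicted)]

    n_coef = 2  # intercept and slope
    # Residual standard error of the fitted line.
    s = sqrt(sum(e ** 2 for e in residuals) / (n - n_coef))

    out = []
    for row, x, pred, e in zip(rows, xs, predicted, residuals):
        # Leverage of this observation in a simple linear regression.
        h = 1 / n + (x - x_bar) ** 2 / sxx
        # Standardized residual, then Cook's D derived from it.
        z = e / (s * sqrt(1 - h))
        cooks = z ** 2 * h / (n_coef * (1 - h))
        out.append({**row, "predicted": pred, "residuals": e,
                    "resid_z": z, "cooks": cooks})
    return out


def fit_all(rows):
    """Fit one model per unique (grade, test_year, subject) combination."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["grade"], r["test_year"], r["subject"])].append(r)
    results = []
    for key in sorted(groups):
        results.extend(fit_group(groups[key]))
    return results


# Made-up rows for illustration only; these are not values from the dataset.
toy_rows = [
    {"grade": 3, "test_year": 2005, "subject": "math", "ss1": s1, "ss2": s2}
    for s1, s2 in [(400, 405), (410, 418), (420, 422), (430, 440)]
] + [
    {"grade": 3, "test_year": 2005, "subject": "read", "ss1": s1, "ss2": s2}
    for s1, s2 in [(380, 385), (390, 392), (400, 405), (410, 409)]
]

results = fit_all(toy_rows)
for r in results:
    print(r["subject"], r["ss1"], round(r["residuals"], 2), round(r["cooks"], 3))
```

Fitting each group separately means a school's residual is judged only against other schools in the same grade, year, and subject, which is why the diagnostics are computed inside `fit_group` rather than on the pooled data.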

