The Academic Performance Index is computed for all California schools based on standardised testing of students. The data sets contain information for all schools with at least 100 students and for various probability samples of the data.
The full population data in apipop are a data frame with 6194 observations on the following 37 variables.
School name (15 characters)
School name (40 characters)
reason for missing data
percentage of students tested
API in 2000
API in 1999
target for change in API
Change in API
Met school-wide growth target?
Met Comparable Improvement target
Met both targets
Eligible for awards program
Percentage of students eligible for subsidized meals
‘English Language Learners’ (percent)
percentage of students for whom this is the first year at the school
average class size years K-3
average class size years 4-6
Number of core academic courses
percent where parental education level is known
percent parents not high-school graduates
percent parents who are high-school graduates
percent parents with some college
percent parents with college degree
percent parents with postgraduate education
average parental education level
percent fully qualified teachers
percent teachers with emergency qualifications
number of students enrolled
number of students tested.
The other data sets contain the additional variables pw, for sampling weights, and fpc, to compute finite population corrections to variance.
apipop is the entire population, apisrs is a simple random sample, apiclus1 is a cluster sample of school districts, apistrat is a sample stratified by stype, and apiclus2 is a two-stage cluster sample of schools within districts. The sampling weights in apiclus1 are incorrect (the weight should be 757/15) but are as obtained from UCLA.
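The corrected weight mentioned above can be applied before building a design. The following is only a sketch: it assumes that 757 is the number of districts in the population and 15 the number of districts sampled (which is how the ratio reads), and apiclus1_fixed and dclus1_fixed are hypothetical names introduced here, not objects shipped with the package.

```r
# Corrected sampling weight for the one-stage cluster sample.
# Assumption: 757 districts in the population, 15 districts sampled.
n_districts_pop <- 757
n_districts_sampled <- 15
pw_corrected <- n_districts_pop / n_districts_sampled  # about 50.47

# Only run the survey-specific part if the package is installed.
if (requireNamespace("survey", quietly = TRUE)) {
  library(survey)
  data(api)

  # Work on a copy so the shipped apiclus1 data stay as obtained from UCLA.
  apiclus1_fixed <- apiclus1
  apiclus1_fixed$pw <- pw_corrected

  # Same design specification as dclus1 in the examples, with the new weights.
  dclus1_fixed <- svydesign(id = ~dnum, weights = ~pw,
                            data = apiclus1_fixed, fpc = ~fpc)
  print(svytotal(~enroll, dclus1_fixed, na.rm = TRUE))
}
```

Because every school gets the same weight, the change rescales estimated totals; an estimated mean such as svymean(~api00, ...) is unaffected by a constant rescaling of the weights.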
Data were obtained from the survey sampling help pages of UCLA Academic Technology Services, at http://www.ats.ucla.edu/stat/stata/Library/svy_survey.htm.
The API program and original data files are at http://api.cde.ca.gov/
library(survey)
data(api)
mean(apipop$api00)
sum(apipop$enroll, na.rm=TRUE)

# stratified sample
dstrat<-svydesign(id=~1, strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
summary(dstrat)
svymean(~api00, dstrat)
svytotal(~enroll, dstrat, na.rm=TRUE)

# one-stage cluster sample
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
summary(dclus1)
svymean(~api00, dclus1)
svytotal(~enroll, dclus1, na.rm=TRUE)

# two-stage cluster sample
dclus2<-svydesign(id=~dnum+snum, fpc=~fpc1+fpc2, data=apiclus2)
summary(dclus2)
svymean(~api00, dclus2)
svytotal(~enroll, dclus2, na.rm=TRUE)

# two-stage `with replacement'
dclus2wr<-svydesign(id=~dnum+snum, weights=~pw, data=apiclus2)
summary(dclus2wr)
svymean(~api00, dclus2wr)
svytotal(~enroll, dclus2wr, na.rm=TRUE)

# convert to replicate weights
rclus1<-as.svrepdesign(dclus1)
summary(rclus1)
svymean(~api00, rclus1)
svytotal(~enroll, rclus1, na.rm=TRUE)

# post-stratify on school type
pop.types<-xtabs(~stype, data=apipop)
rclus1p<-postStratify(rclus1, ~stype, pop.types)
dclus1p<-postStratify(dclus1, ~stype, pop.types)
summary(dclus1p)
summary(rclus1p)
svymean(~api00, dclus1p)
svytotal(~enroll, dclus1p, na.rm=TRUE)
svymean(~api00, rclus1p)
svytotal(~enroll, rclus1p, na.rm=TRUE)