case1201: State Average SAT Scores

Description

Data on the average SAT scores for US states in 1982 and possible associated factors.

Usage

case1201

Format

A data frame with 50 observations on the following 8 variables.

State

US state

SAT

state averages of the total SAT (verbal + quantitative) scores

Takers

the percentage of the total eligible students (high school seniors) in the state who took the exam

Income

the median income of families of test-takers (in hundreds of dollars)

Years

the average number of years of formal study that the test-takers had in the social sciences, natural sciences, and humanities

Public

the percentage of the test-takers who attended public secondary schools

Expend

the total state expenditure on secondary schools (in hundreds of dollars per student)

Rank

the median percentile ranking of the test-takers within their secondary school classes
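
As a minimal sketch of how these units play out in practice, the snippet below rebuilds the first ten rows of four columns from the values printed in the example output on this page, then converts Income to dollars and Takers to the log scale used in the examples. The names `satHead`, `incomeDollars` and `logTakers` are illustrative assumptions, not part of the Sleuth3 package.

```r
# First ten rows of SAT, Takers, Income and Rank, copied from the str() output
# shown on this page; 'satHead' is an illustrative name, not part of Sleuth3.
satHead <- data.frame(
  SAT    = c(1088, 1075, 1068, 1045, 1045, 1033, 1028, 1022, 1017, 1011),
  Takers = c(3, 2, 3, 5, 5, 8, 7, 4, 5, 10),
  Income = c(326, 264, 317, 338, 293, 263, 343, 333, 328, 304),
  Rank   = c(89.7, 90.6, 89.8, 86.3, 88.5, 86.4, 83.4, 85.9, 87.5, 84.2)
)

# Income is recorded in hundreds of dollars, so multiply by 100 for dollars.
satHead$incomeDollars <- satHead$Income * 100

# Takers is a percentage; the examples model it on the natural-log scale.
satHead$logTakers <- log(satHead$Takers)

head(satHead, 3)
```

With the full data set loaded from Sleuth3, the same two assignments could be applied to `case1201` directly.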

Source

Ramsey, F.L. and Schafer, D.W. (2013). The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed.). Cengage Learning.

Examples

str(case1201)
attach(case1201)

## EXPLORATION
logTakers  <- log(Takers)
myMatrix   <- cbind(SAT, logTakers,Income, Years, Public, Expend, Rank)
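if(require(car)){   # Use the car library
  # Recent versions of car take diagonal as a list; a bare string such as
  # diagonal="histogram" is ignored with an "unnamed diag arguments" warning
  scatterplotMatrix(myMatrix, diagonal=list(method="histogram"), smooth=FALSE)
}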
State[Public < 50] # Identify state with low Public (Louisiana)
State[Expend > 40] # Alaska
myLm1    <- lm(SAT ~ logTakers + Income+ Years + Public + Expend + Rank)
plot(myLm1,which=1)         
plot(myLm1,which=4)  # Cook's Distance       
State[29] # Identify State number 29?  ([1] Alaska) 
plot(myLm1,which=5)        
if(require(car)){   # Use the car library   
  crPlots(myLm1)  # Partial residual plot
}
myLm2 <- update(myLm1, ~ . ,subset=(State != "Alaska"))  
plot(myLm2,which=1)
plot(myLm2,which=4)
if(require(car)){   # Use the car library   
  crPlots(myLm2) # Partial residual plot
}
## RANK STATES ON SAT SCORES, ADJUSTED FOR Takers AND Rank
myLm3        <- lm(SAT ~ logTakers + Rank) 
myResiduals  <- residuals(myLm3)  # Extractor avoids partial matching of $res
myOrder      <- order(myResiduals)  
State[myOrder] 

## DISPLAY FOR PRESENTATION
dotchart(myResiduals[myOrder], labels=State[myOrder],
  xlab="SAT Scores, Adjusted for Percent Takers and HS Ranks (Deviation From Average)",
  main="States Ranked by Adjusted SAT Scores",
  bg="green", cex=.8)
abline(v=0, col="gray")

## VARIABLE SELECTION (FOR RANKING STATES AFTER ACCOUNTING FOR ALL VARIABLES)
expendSquared <- Expend^2   
if(require(leaps)){   # Use the leaps library   
  mySubsets   <- regsubsets(SAT ~ logTakers + Income+ Years + Public + Expend + 
    Rank + expendSquared, nvmax=8, data=case1201, subset=(State != "Alaska")) 
  mySummary <- summary(mySubsets) 
  p <- apply(mySummary$which, 1, sum) 
  plot(p, mySummary$bic, ylab = "BIC")  
  cbind(p,mySummary$bic) 
  mySummary$which[4,]  
  myLm4 <- lm(SAT ~ logTakers + Years + Expend + Rank, subset=(State != "Alaska"))
  summary(myLm4)

## DISPLAY FOR PRESENTATION
  myResiduals2 <- residuals(myLm4)
  myOrder2 <- order(myResiduals2)
  newState <- State[State != "Alaska"]  # Labels aligned with the 49 states in myLm4
  newState[myOrder2] 
  dotchart(myResiduals2[myOrder2], labels=newState[myOrder2],
    xlab="Adjusted SAT Scores (Deviation From Average Adjusted Value)",
    main=paste("States Ranked by SAT Scores Adjusted for Demographics",
               "of Takers and Education Expenditure", sep = " "),
    bg="green", cex = .8)
  abline(v=0, col="gray")
}

detach(case1201)
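
The ranking step above relies on attach(), which leaves the columns on the search path until detach() is called. A hedged alternative sketch passes the data frame explicitly through the `data` argument and takes the log inside the formula. To stay self-contained it uses only the first ten rows printed in the example output on this page, so the resulting order is illustrative and does not reproduce the 50-state ranking. The names `satHead`, `fit`, `adjusted` and `ranking` are assumptions made for this sketch.

```r
# Toy data: first ten rows of SAT, Takers and Rank from the str() output on
# this page. 'satHead' is an illustrative name, not part of Sleuth3.
satHead <- data.frame(
  SAT    = c(1088, 1075, 1068, 1045, 1045, 1033, 1028, 1022, 1017, 1011),
  Takers = c(3, 2, 3, 5, 5, 8, 7, 4, 5, 10),
  Rank   = c(89.7, 90.6, 89.8, 86.3, 88.5, 86.4, 83.4, 85.9, 87.5, 84.2)
)

# Regress SAT on log(Takers) and Rank without attach(); the log transform
# is written in the formula, so no separate logTakers variable is needed.
fit <- lm(SAT ~ log(Takers) + Rank, data = satHead)

# Residuals are the SAT averages adjusted for the two covariates.
adjusted <- residuals(fit)

# Row indices ordered from lowest to highest adjusted score.
ranking <- order(adjusted)

data.frame(row = ranking, adjusted = round(adjusted[ranking], 1))
```

With Sleuth3 installed, replacing `satHead` with `case1201` should give the same fit as `myLm3`, since the model terms are identical.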

Example output

'data.frame':	50 obs. of  8 variables:
 $ State : Factor w/ 50 levels "Alabama","Alaska",..: 15 41 34 16 27 26 23 44 50 49 ...
 $ SAT   : int  1088 1075 1068 1045 1045 1033 1028 1022 1017 1011 ...
 $ Takers: int  3 2 3 5 5 8 7 4 5 10 ...
 $ Income: int  326 264 317 338 293 263 343 333 328 304 ...
 $ Years : num  16.8 16.1 16.6 16.3 17.2 ...
 $ Public: num  87.8 86.2 88.3 83.9 83.6 93.7 78.3 75.2 97 77.3 ...
 $ Expend: num  25.6 19.9 20.6 27.1 21.1 ...
 $ Rank  : num  89.7 90.6 89.8 86.3 88.5 86.4 83.4 85.9 87.5 84.2 ...
Loading required package: car
Loading required package: carData
Warning message:
In applyDefaults(diagonal, defaults = list(method = "adaptiveDensity"),  :
  unnamed diag arguments, will be ignored
[1] Louisiana
50 Levels: Alabama Alaska Arizona Arkansas California Colorado ... Wyoming
[1] Alaska
50 Levels: Alabama Alaska Arizona Arkansas California Colorado ... Wyoming
[1] Alaska
50 Levels: Alabama Alaska Arizona Arkansas California Colorado ... Wyoming
 [1] SouthCarolina NorthCarolina Mississippi   Georgia       Texas        
 [6] Alabama       Arkansas      Louisiana     WestVirginia  Nevada       
[11] Indiana       Kentucky      Hawaii        Oklahoma      California   
[16] Florida       Idaho         Utah          Wyoming       Michigan     
[21] Maine         Oregon        Pennsylvania  Missouri      Arizona      
[26] Alaska        SouthDakota   NewJersey     RhodeIsland   NewMexico    
[31] Ohio          Virginia      Maryland      Delaware      Tennessee    
[36] NorthDakota   Vermont       Nebraska      Illinois      Massachusetts
[41] Kansas        NewYork       Colorado      Wisconsin     Washington   
[46] Minnesota     Montana       Connecticut   Iowa          NewHampshire 
50 Levels: Alabama Alaska Arizona Arkansas California Colorado ... Wyoming
Loading required package: leaps

Sleuth3 documentation built on May 2, 2019, 6:41 a.m.