Data on employees from one job category (skilled, entry–level clerical) of a bank that was sued for sex discrimination. The data are on 32 male and 61 female employees, hired between 1965 and 1975.
A data frame with 93 observations on the following 7 variables.
Bsal: Annual salary at time of hire
Sal77: Salary as of March 1975
Sex: Sex of employee
Senior: Seniority (months since first hired)
Age: Age of employee (in months)
Educ: Education (in years)
Exper: Work experience prior to employment with the bank (months)
Ramsey, F.L. and Schafer, D.W. (2013). The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed), Cengage Learning.
Roberts, H.V. (1979). Harris Trust and Savings Bank: An Analysis of Employee Compensation, Report 7946, Center for Mathematical Studies in Business and Economics, University of Chicago Graduate School of Business.
str(case1202)
attach(case1202)
## EXPLORATION
logSal <- log(Bsal)
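myMatrix <- cbind(logSal, Senior, Age, Educ, Exper)
if(require(car)){ # Use the car library
  # Note: recent versions of car expect diagonal=list(method="histogram");
  # the bare string used here is ignored with a warning.
  scatterplotMatrix(myMatrix, smooth=FALSE, diagonal="histogram",
    groups=Sex, col=c("red","blue") )
}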
myLm1 <- lm(logSal ~ Senior + Age + Educ + Exper + Sex)
plot(myLm1, which=1)
plot(myLm1, which=4) # Cook's Distance
if(require(car)){ # Use the car library
  crPlots(myLm1) # Partial residual plots
}
ageSquared <- Age^2
ageCubed <- Age^3
experSquared <- Exper^2
experCubed <- Exper^3
myLm2 <- lm(logSal ~ Senior + Age + ageSquared + ageCubed +
  Educ + Exper + experSquared + experCubed + Sex)
plot(myLm2, which=1) # Residual plot
plot(myLm2, which=4) # Cook's distance
if(require(leaps)){ # Use the leaps library
  mySubsets <- regsubsets(logSal ~ (Senior + Age + Educ + Exper +
    ageSquared + experSquared)^2, nvmax=25, data=case1202)
  mySummary <- summary(mySubsets)
  p <- apply(mySummary$which, 1, sum)
  plot(mySummary$bic ~ p, ylab = "BIC")
  cbind(p,mySummary$bic)
  mySummary$which[8,] # Note that Age:ageSquared = ageCubed
}
myLm3 <- lm(logSal ~ Age + Educ + ageSquared + Senior:Educ +
  Age:Exper + ageCubed + Educ:Exper + Exper:ageSquared)
summary(myLm3)
myLm4 <- update(myLm3, ~ . + Sex)
summary(myLm4)
myLm5 <- update(myLm4, ~ . + Sex:Age + Sex:Educ + Sex:Senior +
  Sex:Exper + Sex:ageSquared)
anova(myLm4, myLm5)
## INFERENCE AND INTERPRETATION
summary(myLm4)
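beta <- coef(myLm4)
exp(beta["SexMale"]) # Ratio of male to female median beginning salary
exp(confint(myLm4, "SexMale")) # 95% confidence interval for that ratio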
# Conclusion: The median beginning salary for males was estimated to be 12%
# higher than the median salary for females with similar values of the available
# qualification variables (95% confidence interval: 7% to 17% higher).
## DISPLAY FOR PRESENTATION
years <- Exper/12 # Change months to years
plot(Bsal ~ years, log="y", xlab="Previous Work Experience (Years)",
  ylab="Beginning Salary (Dollars); Log Scale",
  main="Beginning Salaries and Experience for 61 Female and 32 Male Employees",
  pch= ifelse(Sex=="Male",24,21), bg = "gray",
  col= ifelse(Sex=="Male","blue","red"), lwd=2, cex=1.8 )
myLm6 <- lm(logSal ~ Exper + experSquared + experCubed + Sex)
beta <- coef(myLm6)
dummyExper <- seq(min(Exper), max(Exper), length.out=50)
curveF <- beta[1] + beta[2]*dummyExper + beta[3]*dummyExper^2 +
  beta[4]*dummyExper^3
curveM <- curveF + beta[5]
dummyYears <- dummyExper/12
lines(exp(curveF) ~ dummyYears, lty=1, lwd=2,col="red")
lines(exp(curveM) ~ dummyYears, lty = 2, lwd=2, col="blue")
legend(28,8150, c("Male","Female"),pch=c(24,21), pt.cex=1.8, pt.lwd=2,
  pt.bg=c("gray","gray"), col=c("blue","red"), lty=c(2,1), lwd=2)
detach(case1202)
'data.frame': 93 obs. of 7 variables:
$ Bsal : int 5040 6300 6000 6000 6000 6840 8100 6000 6000 6900 ...
$ Sal77 : int 12420 12060 15120 16320 12300 10380 13980 10140 12360 10920 ...
$ Sex : Factor w/ 2 levels "Female","Male": 2 2 2 2 2 2 2 2 2 2 ...
$ Senior: int 96 82 67 97 66 92 66 82 88 75 ...
$ Age : int 329 357 315 354 351 374 369 363 555 416 ...
$ Educ : int 15 15 15 12 12 15 16 12 12 15 ...
$ Exper : num 14 72 35.5 24 56 41.5 54.5 32 252 132 ...
Loading required package: car
Loading required package: carData
Warning message:
In applyDefaults(diagonal, defaults = list(method = "adaptiveDensity"), :
unnamed diag arguments, will be ignored
Loading required package: leaps
(Intercept) Senior Age
TRUE FALSE TRUE
Educ Exper ageSquared
TRUE FALSE TRUE
experSquared Senior:Age Senior:Educ
FALSE FALSE TRUE
Senior:Exper Senior:ageSquared Senior:experSquared
FALSE FALSE FALSE
Age:Educ Age:Exper Age:ageSquared
FALSE TRUE TRUE
Age:experSquared Educ:Exper Educ:ageSquared
FALSE TRUE FALSE
Educ:experSquared Exper:ageSquared Exper:experSquared
FALSE TRUE FALSE
ageSquared:experSquared
FALSE
Call:
lm(formula = logSal ~ Age + Educ + ageSquared + Senior:Educ +
Age:Exper + ageCubed + Educ:Exper + Exper:ageSquared)
Residuals:
Min 1Q Median 3Q Max
-0.230710 -0.050695 0.004412 0.051503 0.195887
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.478e+00 5.798e-01 9.448 7.45e-15 ***
Age 1.767e-02 3.633e-03 4.864 5.31e-06 ***
Educ 5.875e-02 9.296e-03 6.320 1.20e-08 ***
ageSquared -3.799e-05 7.299e-06 -5.204 1.36e-06 ***
ageCubed 2.614e-08 4.826e-09 5.416 5.68e-07 ***
Educ:Senior -3.110e-04 7.697e-05 -4.040 0.000118 ***
Age:Exper 1.358e-05 2.880e-06 4.716 9.46e-06 ***
Educ:Exper -1.086e-04 4.617e-05 -2.352 0.020996 *
ageSquared:Exper -1.697e-08 3.658e-09 -4.639 1.27e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.09101 on 84 degrees of freedom
Multiple R-squared: 0.547, Adjusted R-squared: 0.5039
F-statistic: 12.68 on 8 and 84 DF, p-value: 8.856e-12
Call:
lm(formula = logSal ~ Age + Educ + ageSquared + ageCubed + Sex +
Educ:Senior + Age:Exper + Educ:Exper + ageSquared:Exper)
Residuals:
Min 1Q Median 3Q Max
-0.173459 -0.037584 0.004244 0.047305 0.192259
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.929e+00 5.105e-01 11.614 < 2e-16 ***
Age 1.480e-02 3.200e-03 4.626 1.36e-05 ***
Educ 4.957e-02 8.253e-03 6.006 4.83e-08 ***
ageSquared -3.097e-05 6.473e-06 -4.784 7.36e-06 ***
ageCubed 2.108e-08 4.296e-09 4.907 4.56e-06 ***
SexMale 1.115e-01 2.092e-02 5.330 8.29e-07 ***
Educ:Senior -3.206e-04 6.686e-05 -4.795 7.06e-06 ***
Age:Exper 9.231e-06 2.631e-06 3.509 0.000729 ***
Educ:Exper -6.919e-05 4.077e-05 -1.697 0.093388 .
ageSquared:Exper -1.213e-08 3.304e-09 -3.670 0.000427 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.07903 on 83 degrees of freedom
Multiple R-squared: 0.6625, Adjusted R-squared: 0.6259
F-statistic: 18.1 on 9 and 83 DF, p-value: 3.109e-16
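The summary statistics printed for the model that includes Sex can be cross-checked by hand. The following sketch is not part of the original example. It assumes only the rounded values shown in the output above (R-squared of 0.6625, 93 observations, 9 predictors), and the variable names are illustrative.

```r
# Values copied (rounded) from the summary output above
r2 <- 0.6625   # Multiple R-squared
n  <- 93       # number of observations
k  <- 9        # number of predictors, excluding the intercept

# Adjusted R-squared penalizes R-squared for the number of predictors
adjR2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Overall F-statistic on k and n - k - 1 degrees of freedom
fStat <- (r2 / k) / ((1 - r2) / (n - k - 1))

round(adjR2, 4)  # 0.6259
round(fStat, 1)  # 18.1
```

Both values agree with the printed summary, which is a quick way to confirm that the residual degrees of freedom (83) correspond to 93 observations minus 10 estimated coefficients.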
Analysis of Variance Table
Model 1: logSal ~ Age + Educ + ageSquared + ageCubed + Sex + Educ:Senior +
Age:Exper + Educ:Exper + ageSquared:Exper
Model 2: logSal ~ Age + Educ + ageSquared + ageCubed + Sex + Educ:Senior +
Age:Exper + Educ:Exper + ageSquared:Exper + Age:Sex + Educ:Sex +
Sex:Senior + Sex:Exper + ageSquared:Sex
Res.Df RSS Df Sum of Sq F Pr(>F)
1 83 0.51839
2 78 0.51429 5 0.0041066 0.1246 0.9865
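The F-statistic in this table is an extra-sum-of-squares comparison of the model with a Sex main effect against the model that adds five Sex interactions. As a rough check, and assuming only the rounded figures printed in the table, it might be recomputed as follows. The variable names are illustrative and not part of the original example.

```r
# Values copied (rounded) from the analysis of variance table above
rssReduced <- 0.51839    # residual sum of squares, Model 1
rssFull    <- 0.51429    # residual sum of squares, Model 2
dfReduced  <- 83         # residual df, Model 1
dfFull     <- 78         # residual df, Model 2
extraSS    <- 0.0041066  # "Sum of Sq" column

# Extra-sum-of-squares F-statistic for the five added interaction terms
dfExtra <- dfReduced - dfFull
fStat <- (extraSS / dfExtra) / (rssFull / dfFull)
pValue <- pf(fStat, dfExtra, dfFull, lower.tail = FALSE)

round(fStat, 4)  # 0.1246
pValue           # compare with Pr(>F) = 0.9865 in the table

# The residual standard error of Model 1 follows from the same table
sqrt(rssReduced / dfReduced)  # about 0.079
```

The large p-value gives no evidence that the Sex effect depends on the other variables, which is why the later inference uses the model with the Sex main effect only.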
Call:
lm(formula = logSal ~ Age + Educ + ageSquared + ageCubed + Sex +
Educ:Senior + Age:Exper + Educ:Exper + ageSquared:Exper)
Residuals:
Min 1Q Median 3Q Max
-0.173459 -0.037584 0.004244 0.047305 0.192259
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.929e+00 5.105e-01 11.614 < 2e-16 ***
Age 1.480e-02 3.200e-03 4.626 1.36e-05 ***
Educ 4.957e-02 8.253e-03 6.006 4.83e-08 ***
ageSquared -3.097e-05 6.473e-06 -4.784 7.36e-06 ***
ageCubed 2.108e-08 4.296e-09 4.907 4.56e-06 ***
SexMale 1.115e-01 2.092e-02 5.330 8.29e-07 ***
Educ:Senior -3.206e-04 6.686e-05 -4.795 7.06e-06 ***
Age:Exper 9.231e-06 2.631e-06 3.509 0.000729 ***
Educ:Exper -6.919e-05 4.077e-05 -1.697 0.093388 .
ageSquared:Exper -1.213e-08 3.304e-09 -3.670 0.000427 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.07903 on 83 degrees of freedom
Multiple R-squared: 0.6625, Adjusted R-squared: 0.6259
F-statistic: 18.1 on 9 and 83 DF, p-value: 3.109e-16
SexMale
1.117974
2.5 % 97.5 %
SexMale 1.072404 1.165481
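Because the response is the logarithm of beginning salary, the SexMale coefficient and its confidence limits are exponentiated to give a ratio of medians. The sketch below reproduces that arithmetic from the rounded estimate and standard error printed in the summary output. It is an illustration under those rounding assumptions, not part of the original example, and the variable names are hypothetical.

```r
# Values copied (rounded) from the summary(myLm4) output above
est <- 0.1115   # SexMale estimate on the log scale
se  <- 0.02092  # its standard error
df  <- 83       # residual degrees of freedom

# 95% confidence interval on the log scale, using the t distribution
tCrit <- qt(0.975, df)
ciLog <- est + c(-1, 1) * tCrit * se

# Back-transform to a multiplicative effect on the median salary
ratio   <- exp(est)
ciRatio <- exp(ciLog)

round(ratio, 2)           # 1.12
round(ciRatio, 2)         # 1.07 1.17
round(100 * (ratio - 1))  # 12, i.e. about 12% higher
```

These figures match the stated conclusion of a 12% higher median beginning salary for males, with a 95% confidence interval of 7% to 17%.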