Data relating air pollution and mortality, frequently used for illustrations in ridge regression and related tasks.
data("AirPollution")
A data frame containing 60 observations on 16 variables.
Average annual precipitation in inches.
Average January temperature in degrees Fahrenheit.
Average July temperature in degrees Fahrenheit.
Percentage of 1960 SMSA population aged 65 or older.
Average household size.
Median school years completed by those over 22.
Percentage of housing units which are sound and with all facilities.
Population per square mile in urbanized areas, 1960.
Percentage of non-Caucasian population in urbanized areas, 1960.
Percentage employed in white collar occupations.
Percentage of families with income < USD 3000.
Relative hydrocarbon pollution potential.
Relative nitric oxides potential.
Relative sulphur dioxide potential.
Annual average percentage of relative humidity at 13:00.
Total age-adjusted mortality rate per 100,000.
http://lib.stat.cmu.edu/datasets/pollution
McDonald, G.C. and Schwing, R.C. (1973). Instabilities of Regression Estimates Relating Air Pollution to Mortality. Technometrics, 15, 463–482.
Miller, A.J. (2002). Subset Selection in Regression. New York: Chapman and Hall. Related software can be found online at http://users.bigpond.net.au/amiller/.
## load package and data (with logs for relative potentials)
library("mcsSubset")
data("AirPollution", package = "mcsSubset")
## columns 12 to 14 hold the relative hydrocarbon, nitric oxides,
## and sulphur dioxide potentials
for (i in 12:14) AirPollution[[i]] <- log(AirPollution[[i]])
## fit subsets
xs <- mcsSubset(mortality ~ ., data = AirPollution)
plot(xs)
## summary with BIC (penalty of log(n) per parameter)
sx <- summary(xs, penalty = log(nrow(AirPollution)))
print(sx)
## refit best model of size 6
lm6 <- refit(xs, size = 6)
summary(lm6)
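The example above passes `penalty = log(nrow(AirPollution))` to obtain BIC. The following is a minimal base R sketch of why a per-parameter penalty of log(n) corresponds to BIC while a penalty of 2 corresponds to AIC. It uses the built-in `mtcars` data as a stand-in, so it runs without `mcsSubset`. The helper `ic()` is hypothetical and written only for this illustration; it is not part of `mcsSubset`, and how that package counts parameters internally may differ.

```r
## Stand-in data: mtcars ships with base R (this is NOT the AirPollution data)
fit <- lm(mpg ~ wt + hp, data = mtcars)
n <- nrow(mtcars)

## Hypothetical helper: generic information criterion,
## -2 * log-likelihood + penalty * (number of estimated parameters)
ic <- function(model, penalty) {
  ll <- logLik(model)
  -2 * as.numeric(ll) + penalty * attr(ll, "df")
}

aic <- ic(fit, penalty = 2)       # AIC: penalty of 2 per parameter
bic <- ic(fit, penalty = log(n))  # BIC: penalty of log(n) per parameter

## Both values agree with the built-in helpers from the stats package
stopifnot(isTRUE(all.equal(aic, AIC(fit))),
          isTRUE(all.equal(bic, BIC(fit))))
```

With n = 60 observations, as in `AirPollution`, log(n) is about 4.09, so the BIC penalty is roughly twice the AIC penalty and tends to favour smaller subsets.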