BostonHousing: Boston Housing Data

Description

Housing data for 506 census tracts of Boston from the 1970 census. The data frame BostonHousing contains the original data of Harrison and Rubinfeld (1978); the data frame BostonHousing2 contains the corrected version with additional spatial information (see References below).

Usage

data(BostonHousing)
data(BostonHousing2)
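
A minimal sketch of loading both data frames and checking them against the Format section. It assumes the mlbench package is installed and is skipped otherwise; the printed dimensions are expected to agree with the 506 observations and the column lists described below.

```r
# Assumption: the mlbench package is installed; the block is a no-op otherwise.
if (requireNamespace("mlbench", quietly = TRUE)) {
  data("BostonHousing", package = "mlbench")
  data("BostonHousing2", package = "mlbench")

  # Rows and columns of the original and the corrected data frame
  print(dim(BostonHousing))
  print(dim(BostonHousing2))

  # chas is stored as a factor with levels "0" and "1", not as a number
  str(BostonHousing$chas)
}
```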

Format

The original data are 506 observations on 14 variables, medv being the target variable:

crim     per capita crime rate by town
zn       proportion of residential land zoned for lots over 25,000 sq.ft
indus    proportion of non-retail business acres per town
chas     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
nox      nitric oxides concentration (parts per 10 million)
rm       average number of rooms per dwelling
age      proportion of owner-occupied units built prior to 1940
dis      weighted distances to five Boston employment centres
rad      index of accessibility to radial highways
tax      full-value property-tax rate per USD 10,000
ptratio  pupil-teacher ratio by town
b        1000(B - 0.63)^2 where B is the proportion of blacks by town
lstat    percentage of lower status of the population
medv     median value of owner-occupied homes in USD 1000's
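
The b column is a transformed quantity rather than a raw proportion. The following sketch evaluates the formula from the list above for a few values of B; the function name b_from_proportion is hypothetical and not part of mlbench.

```r
# Transformation documented for the b column: 1000 * (B - 0.63)^2,
# where B is the proportion defined in the variable list.
# b_from_proportion is an illustrative helper, not an mlbench function.
b_from_proportion <- function(B) {
  stopifnot(all(B >= 0 & B <= 1))
  1000 * (B - 0.63)^2
}

# Values at the two ends of the range and at the minimum of the parabola
b_from_proportion(c(0, 0.63, 1))

# The transform is not one-to-one: proportions equally far from 0.63
# on either side map to the same b value.
all.equal(b_from_proportion(0.53), b_from_proportion(0.73))
```

B = 0 gives 396.9, which matches the maximum of b in the example output. Because the mapping is symmetric around 0.63, B cannot in general be recovered from b alone.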

The corrected data set has the following additional columns:

cmedv    corrected median value of owner-occupied homes in USD 1000's
town     name of town
tract    census tract
lon      longitude of census tract
lat      latitude of census tract
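
Since BostonHousing2 carries both medv and cmedv, the corrected tracts can be located by comparing the two columns. This is a sketch that assumes the mlbench package is installed; the variable name corrected is illustrative.

```r
# Assumption: the mlbench package is installed; the block is a no-op otherwise.
if (requireNamespace("mlbench", quietly = TRUE)) {
  data("BostonHousing2", package = "mlbench")

  # TRUE for tracts whose median value differs between the two versions
  corrected <- BostonHousing2$medv != BostonHousing2$cmedv

  # Number of corrected tracts, then the affected rows side by side
  print(sum(corrected))
  print(BostonHousing2[corrected, c("town", "tract", "medv", "cmedv")])
}
```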

Source

The original data were taken from the UCI Repository of Machine Learning Databases (Newman et al., 1998; URL in the References below); the corrected data were taken from Statlib.

See Statlib and references there for details on the corrections. Both were converted to R format by Friedrich Leisch.

References

Harrison, D. and Rubinfeld, D.L. (1978). Hedonic prices and the demand for clean air. Journal of Environmental Economics and Management, 5, 81–102.

Gilley, O.W., and R. Kelley Pace (1996). On the Harrison and Rubinfeld Data. Journal of Environmental Economics and Management, 31, 403–405. [Provided corrections and examined censoring.]

Newman, D.J., Hettich, S., Blake, C.L., and Merz, C.J. (1998). UCI Repository of machine learning databases [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of California, Department of Information and Computer Science.

Pace, R. Kelley, and O.W. Gilley (1997). Using the Spatial Configuration of the Data to Improve Estimation. Journal of Real Estate Finance and Economics, 14, 333–340. [Added georeferencing and spatial estimation.]

Examples

data(BostonHousing)
summary(BostonHousing)

data(BostonHousing2)
summary(BostonHousing2)

Example output

      crim                zn             indus       chas         nox        
 Min.   : 0.00632   Min.   :  0.00   Min.   : 0.46   0:471   Min.   :0.3850  
 1st Qu.: 0.08204   1st Qu.:  0.00   1st Qu.: 5.19   1: 35   1st Qu.:0.4490  
 Median : 0.25651   Median :  0.00   Median : 9.69           Median :0.5380  
 Mean   : 3.61352   Mean   : 11.36   Mean   :11.14           Mean   :0.5547  
 3rd Qu.: 3.67708   3rd Qu.: 12.50   3rd Qu.:18.10           3rd Qu.:0.6240  
 Max.   :88.97620   Max.   :100.00   Max.   :27.74           Max.   :0.8710  
       rm             age              dis              rad        
 Min.   :3.561   Min.   :  2.90   Min.   : 1.130   Min.   : 1.000  
 1st Qu.:5.886   1st Qu.: 45.02   1st Qu.: 2.100   1st Qu.: 4.000  
 Median :6.208   Median : 77.50   Median : 3.207   Median : 5.000  
 Mean   :6.285   Mean   : 68.57   Mean   : 3.795   Mean   : 9.549  
 3rd Qu.:6.623   3rd Qu.: 94.08   3rd Qu.: 5.188   3rd Qu.:24.000  
 Max.   :8.780   Max.   :100.00   Max.   :12.127   Max.   :24.000  
      tax           ptratio            b              lstat      
 Min.   :187.0   Min.   :12.60   Min.   :  0.32   Min.   : 1.73  
 1st Qu.:279.0   1st Qu.:17.40   1st Qu.:375.38   1st Qu.: 6.95  
 Median :330.0   Median :19.05   Median :391.44   Median :11.36  
 Mean   :408.2   Mean   :18.46   Mean   :356.67   Mean   :12.65  
 3rd Qu.:666.0   3rd Qu.:20.20   3rd Qu.:396.23   3rd Qu.:16.95  
 Max.   :711.0   Max.   :22.00   Max.   :396.90   Max.   :37.97  
      medv      
 Min.   : 5.00  
 1st Qu.:17.02  
 Median :21.20  
 Mean   :22.53  
 3rd Qu.:25.00  
 Max.   :50.00  
                town         tract           lon              lat       
 Cambridge        : 30   Min.   :   1   Min.   :-71.29   Min.   :42.03  
 Boston Savin Hill: 23   1st Qu.:1303   1st Qu.:-71.09   1st Qu.:42.18  
 Lynn             : 22   Median :3394   Median :-71.05   Median :42.22  
 Boston Roxbury   : 19   Mean   :2700   Mean   :-71.06   Mean   :42.22  
 Newton           : 18   3rd Qu.:3740   3rd Qu.:-71.02   3rd Qu.:42.25  
 Somerville       : 15   Max.   :5082   Max.   :-70.81   Max.   :42.38  
 (Other)          :379                                                  
      medv           cmedv            crim                zn        
 Min.   : 5.00   Min.   : 5.00   Min.   : 0.00632   Min.   :  0.00  
 1st Qu.:17.02   1st Qu.:17.02   1st Qu.: 0.08204   1st Qu.:  0.00  
 Median :21.20   Median :21.20   Median : 0.25651   Median :  0.00  
 Mean   :22.53   Mean   :22.53   Mean   : 3.61352   Mean   : 11.36  
 3rd Qu.:25.00   3rd Qu.:25.00   3rd Qu.: 3.67708   3rd Qu.: 12.50  
 Max.   :50.00   Max.   :50.00   Max.   :88.97620   Max.   :100.00  
                                                                    
     indus       chas         nox               rm             age        
 Min.   : 0.46   0:471   Min.   :0.3850   Min.   :3.561   Min.   :  2.90  
 1st Qu.: 5.19   1: 35   1st Qu.:0.4490   1st Qu.:5.886   1st Qu.: 45.02  
 Median : 9.69           Median :0.5380   Median :6.208   Median : 77.50  
 Mean   :11.14           Mean   :0.5547   Mean   :6.285   Mean   : 68.57  
 3rd Qu.:18.10           3rd Qu.:0.6240   3rd Qu.:6.623   3rd Qu.: 94.08  
 Max.   :27.74           Max.   :0.8710   Max.   :8.780   Max.   :100.00  
                                                                          
      dis              rad              tax           ptratio     
 Min.   : 1.130   Min.   : 1.000   Min.   :187.0   Min.   :12.60  
 1st Qu.: 2.100   1st Qu.: 4.000   1st Qu.:279.0   1st Qu.:17.40  
 Median : 3.207   Median : 5.000   Median :330.0   Median :19.05  
 Mean   : 3.795   Mean   : 9.549   Mean   :408.2   Mean   :18.46  
 3rd Qu.: 5.188   3rd Qu.:24.000   3rd Qu.:666.0   3rd Qu.:20.20  
 Max.   :12.127   Max.   :24.000   Max.   :711.0   Max.   :22.00  
                                                                  
       b              lstat      
 Min.   :  0.32   Min.   : 1.73  
 1st Qu.:375.38   1st Qu.: 6.95  
 Median :391.44   Median :11.36  
 Mean   :356.67   Mean   :12.65  
 3rd Qu.:396.23   3rd Qu.:16.95  
 Max.   :396.90   Max.   :37.97  
                                 
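
With medv as the target variable, a simple baseline is to regress it on all other columns of the original data. The sketch below uses ordinary least squares via lm purely for illustration; it is not a method prescribed by the data set's authors, and it assumes the mlbench package is installed.

```r
# Assumption: the mlbench package is installed; the block is a no-op otherwise.
if (requireNamespace("mlbench", quietly = TRUE)) {
  data("BostonHousing", package = "mlbench")

  # Ordinary least squares with medv as response and the other 13 columns
  # as predictors; the factor chas enters as a single dummy term (chas1).
  fit <- lm(medv ~ ., data = BostonHousing)

  print(round(coef(fit), 3))
}
```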

mlbench documentation built on Jan. 29, 2021, 5:05 p.m.