sanfrancisco.home.sales: San Francisco Home Sales Data


Description

This data set contains information on homes sold in San Francisco between 2/13/2008 and 7/14/2009.

Usage

data(sanfrancisco.home.sales)

Format

A data frame with 3281 observations on the following 15 variables.

line

a numeric vector representing the line number of the observation in the data set

county

a factor with levels San Francisco County

street

a factor representing the street address of the property

city

a factor with levels San Francisco

zip

a numeric vector representing the zip code of the property

date

a Date representing the sale date

price

a numeric vector representing the sales price

bedrooms

a numeric vector representing the number of bedrooms

squarefeet

a numeric vector representing the interior area of the property, in square feet

lotsize

a numeric vector representing the lot size of the property, in square feet

year

a numeric vector representing the year in which the property was built

latitude

a numeric vector representing the latitude coordinate of the property

longitude

a numeric vector representing the longitude coordinate of the property

month

a factor representing the month in which the property was sold

neighborhood

a factor representing neighborhood names

Details

This data set was assembled from a variety of sources, including two Bay Area newspapers (the San Jose Mercury News and the San Francisco Chronicle), Yahoo Maps, and Zillow Neighborhood Boundaries.

This data set is used as an example in the book "R in a Nutshell" from O'Reilly Media. In the book, we took separate samples for training and testing. Indices for observations in each sample are included in sanfrancisco.home.sales.testing.indices and sanfrancisco.home.sales.training.indices.
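As a rough illustration of how the two index objects might be used, the sketch below splits a data frame by row selectors. The helper split_by_indices and the toy data frame are hypothetical and made up for this example. The commented-out call at the end assumes, without confirmation from this page, that both index objects can be used directly as row selectors for sanfrancisco.home.sales.

```r
# Hypothetical helper: split a data frame into training and testing parts
# using two vectors of row selectors (row positions or row names).
split_by_indices <- function(df, training.indices, testing.indices) {
  list(training = df[training.indices, , drop = FALSE],
       testing  = df[testing.indices, , drop = FALSE])
}

# Toy stand-in data (invented for illustration; not the real data set)
toy <- data.frame(price      = c(500000, 750000, 620000, 910000, 430000),
                  squarefeet = c(900, 1400, 1100, 2000, 800))

toy.split <- split_by_indices(toy,
                              training.indices = c(1, 3, 5),
                              testing.indices  = c(2, 4))
nrow(toy.split$training)
nrow(toy.split$testing)

# With the nutshell package loaded, the same call would presumably be
# (assumption: the index objects select rows of sanfrancisco.home.sales):
# sf.split <- split_by_indices(sanfrancisco.home.sales,
#                              sanfrancisco.home.sales.training.indices,
#                              sanfrancisco.home.sales.testing.indices)
```

Keeping the split in a small function means the same row selectors can be reused to rebuild identical training and testing samples later.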

Source

Data was assembled from a variety of sources, including http://www.sfgate.com, http://www.mercurynews.com, and http://www.zillow.com/howto/api/neighborhood-boundaries.htm

Examples

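# Load the package that provides the data set, then the data and lattice
library(nutshell)
data(sanfrancisco.home.sales)
library(lattice)
trellis.par.set(fontsize=list(text=7))
# Average price per square foot across all sales, ignoring missing values
dollars.per.squarefoot <- mean(
  sanfrancisco.home.sales$price / sanfrancisco.home.sales$squarefeet,
  na.rm=TRUE)
# Price against interior area, one panel per neighborhood
xyplot(price~squarefeet|neighborhood,
       data=sanfrancisco.home.sales,
       pch=19,
       cex=.2,
       # Exclude selected zip codes, sales of $4,000,000 or more, and
       # homes with missing or very large (6000+ sq ft) interior area
       subset=(zip!=94100 & zip!=94104 & zip!=94108 &
               zip!=94111 & zip!=94133 & zip!=94158 &
               price<4000000 &
               ifelse(is.na(squarefeet),FALSE,squarefeet<6000)),
       strip=strip.custom(strip.levels=TRUE,
          horizontal=TRUE,
          par.strip.text=list(cex=.8)),
       # Draw the average price-per-square-foot line, then the points
       panel=function(...) {
          panel.abline(a=0,b=dollars.per.squarefoot)
          panel.xyplot(...)
       }
)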
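The subset expression in the example wraps the squarefeet comparison in ifelse(is.na(...), FALSE, ...), so that a missing interior area yields FALSE rather than NA. The short sketch below, which uses a made-up vector rather than the real data, checks that behavior in isolation and compares it with an equivalent form built from logical operators.

```r
# Made-up interior areas, including one missing value
squarefeet <- c(1200, NA, 6500, 5999)

# Form used in the example: a missing value becomes FALSE
keep.ifelse <- ifelse(is.na(squarefeet), FALSE, squarefeet < 6000)

# Equivalent form using logical operators (FALSE & NA is FALSE)
keep.logical <- !is.na(squarefeet) & squarefeet < 6000

# A plain comparison leaves NA in the result
keep.plain <- squarefeet < 6000

identical(keep.ifelse, keep.logical)
anyNA(keep.plain)
```

Either NA-safe form should select the same rows; the ifelse version spells out the intent, while the logical version is shorter.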

Example output

Loading required package: nutshell.bbdb
Loading required package: nutshell.audioscrobbler

nutshell documentation built on May 1, 2019, 10:08 p.m.