Schools: Schools table

Description

Information on the schools that players attended, with one row per school.

Usage

data(Schools)

Format

A data frame with 749 observations on the following 5 variables.

schoolID

school ID code

schoolName

school name

schoolCity

city where school is located

schoolState

state where school's city is located

schoolNick

nickname for school's baseball team

Source

Lahman, S. (2014) Lahman's Baseball Database, 1871-2013, 2014 version, http://baseball1.com/statistics/

Examples

library(Lahman)  # provides Schools and SchoolsPlayers (lazy-loaded)
library(plyr)    # ddply(), arrange(), desc()

# how many different schools are listed in each state?
table(Schools$schoolState)

# top 20 schools by number of players who attended
schoolInfo <- Schools[, c("schoolID", "schoolName", "schoolCity", "schoolState")]

schoolCount <- ddply(SchoolsPlayers, .(schoolID), summarise,
                       players = length(schoolID))
schoolCount <- merge(schoolCount, schoolInfo, by="schoolID", all.x=TRUE)

# Arrange in decreasing order:
schoolCount <- arrange(schoolCount, desc(players))
head(schoolCount, 20)

# sum counts by state
schoolStates <- ddply(schoolCount, .(schoolState), summarise,
                       players = sum(players),
                       schools = length(schoolState))
str(schoolStates)
summary(schoolStates)

## Not run: 
if(require(zipcode)) {
  # in lieu of more precise geocoding via schoolName, 
  # find lat/long of Schools from zipcode file
  zips <- ddply(zipcode, .(city, state), summarize,
                latitude=mean(latitude), longitude=mean(longitude))
  colnames(zips)[1:2] <- c("schoolCity", "schoolState")
  str(zips)

  # merge lat/long from zips
  schoolsXY <- merge(Schools, zips, by=c("schoolCity", "schoolState"), all.x=TRUE)
  str(schoolsXY)

  # plot school locations
  with(subset(schoolsXY, schoolState != 'HI'),
       plot(jitter(longitude), jitter(latitude)))
}

## End(Not run)
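
The count-and-merge steps above can also be sketched in base R, without plyr. This is a minimal illustration, not part of the package examples: the two small data frames below are made-up stand-ins that only borrow the column names of Schools and SchoolsPlayers, and the names `schools`, `attended`, `school_count` and `state_players` are hypothetical.

```r
# Toy stand-ins (made-up rows) with the same column names as the real tables
schools <- data.frame(
  schoolID    = c("aaa", "bbb", "ccc"),
  schoolName  = c("School A", "School B", "School C"),
  schoolCity  = c("City A", "City B", "City C"),
  schoolState = c("CA", "CA", "TX"),
  stringsAsFactors = FALSE
)
attended <- data.frame(
  playerID = c("p1", "p2", "p3", "p4", "p5", "p6"),
  schoolID = c("aaa", "aaa", "aaa", "bbb", "ccc", "ccc"),
  stringsAsFactors = FALSE
)

# players per school: table() counts the rows for each schoolID
counts <- as.data.frame(table(schoolID = attended$schoolID),
                        responseName = "players",
                        stringsAsFactors = FALSE)

# attach school details, then sort by decreasing player count
school_count <- merge(counts, schools, by = "schoolID", all.x = TRUE)
school_count <- school_count[order(-school_count$players), ]
print(school_count)

# total players per state
state_players <- aggregate(players ~ schoolState, data = school_count, FUN = sum)
print(state_players)
```

With the real data, `schools` and `attended` would be replaced by `Schools` and `SchoolsPlayers`; the plyr version in the examples additionally reports the number of schools per state.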

Lahman documentation built on May 2, 2019, 5:25 p.m.