Vignette Introduction


These vignettes serve a dual purpose:


Vignettes completed to date:

  1. Relationship Between Strikeouts and Home Runs -- This vignette looks at the relationship between strikeout rates and home run rates from 1950 onward. The question was inspired by Marchi and Albert (2014), Analyzing Baseball Data with R.

  2. Run Scoring Trends -- This vignette looks at Major League Baseball's average per-game run scoring for each season since 1901.

  3. Team Payroll and the World Series -- This vignette examines whether there is a relationship between total team salaries (payroll) and World Series success.
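
As a rough illustration of the kind of calculation behind the run scoring vignette, the sketch below computes average runs per team per game for each season. It is not the vignette's own code. It assumes the `Teams` table in the Lahman package with its `yearID`, `R` (runs) and `G` (games) columns; the helper name `runs_per_game` is hypothetical and introduced only for this example.

```r
# Hypothetical helper: average runs per team-game for each season.
# Expects a data frame with columns yearID, R (runs scored) and G (games played).
runs_per_game <- function(teams) {
  # Sum runs and games over all teams within each season
  totals <- aggregate(cbind(R, G) ~ yearID, data = teams, FUN = sum)
  # Season-level average: total runs divided by total team-games
  totals$RPG <- totals$R / totals$G
  totals
}

# Apply to the Lahman Teams table when the package is available
if (requireNamespace("Lahman", quietly = TRUE)) {
  rpg <- runs_per_game(subset(Lahman::Teams, yearID >= 1901))
  print(head(rpg))
}
```

Because the helper only needs the three named columns, it can be tried on any small data frame of the same shape before running it on the full table.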

Further reading

A number of books and online resources draw on the Lahman package for their examples. These include:


Michael Friendly and David Meyer (2016) Discrete Data Analysis with R: Visualization and Modeling Techniques for Categorical and Count Data (CRC Press). DDAR Web Site

Max Marchi and Jim Albert (2014) Analyzing Baseball Data with R (CRC Press)

David Robinson (2017) Introduction to Empirical Bayes (published at [])

Hadley Wickham and Garrett Grolemund (2017) R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (O'Reilly)

Articles, blog entries, and course materials

Steven Buechler (2014-2015) Analysis of career performance in top home run hitters

Kris Eberwein (2015-09-30) "Hacking The New Lahman Package 4.0-1 with R-Studio" (via [])

Michael Lopez (2016) Lab materials for Skidmore College MA 276, “Sports and Statistics”

Bill Petti (2015-09-21) A Short(-ish) Introduction to Using R Packages for Baseball Research

Exploring Baseball Data with R blog

Jim Albert (2018-12-24) The Vanishing 300 Batting Average

Jim Albert (2015-01-05) A Graph of a Batting Average

Brian Mills (2014-09-30) Using ggmap and Lahman to Find the Hometown College Rosters


Lahman documentation built on April 9, 2021, 9:07 a.m.