adjust_park_factors: Adjust statistics for park effects


View source: R/clean-data.R

Description

Takes a data.frame returned by clean_PIA and adjusts its statistics for park effects.

Usage

adjust_park_factors(stats, pfs, type = c("bat", "pit"), b_bio = NULL)

Arguments

stats

data.frame of player statistics, as returned by clean_PIA.

pfs

data.frame of multipliers for park factors. Must be in proper form.

type

character. Whether these are batting ("bat") or pitching ("pit") data. Defaults to "bat".

b_bio

optional data.frame. Contains the handedness of batters. Used because park factors differ for left-handed and right-handed batters.

Details

If type = "bat", stats is left-joined with b_bio on the MLBID column, which adds the Bats column to the stats data. For pitchers, the Bats column is set to "Both" (i.e. they are treated as switch hitters).
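The join step can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's own code: the data are made up, the HR column is hypothetical, and base merge with all.x = TRUE stands in for whichever left-join function the package actually uses.

```r
# Hypothetical inputs; only the MLBID and Bats column names come from the docs.
stats <- data.frame(
  MLBID = c(101, 102, 103),
  HR    = c(20, 35, 12),
  stringsAsFactors = FALSE
)
b_bio <- data.frame(
  MLBID = c(101, 102),
  Bats  = c("L", "R"),
  stringsAsFactors = FALSE
)

# Left join: keep every row of stats and attach Bats where MLBID matches.
# Players missing from b_bio (here 103) get NA in the Bats column.
batters <- merge(stats, b_bio, by = "MLBID", all.x = TRUE)

# Pitching data: no join is needed, every row is labelled "Both".
pitchers <- stats
pitchers$Bats <- "Both"
```

Because the join is a left join, a player without a b_bio entry stays in the result instead of being dropped.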

Next, the statistics are converted to long format using gather from the tidyr package. Any missing park factors are set to the average value of 100. Statistics are then adjusted using the formula:

adjusted = count / (PF / 200 + .5)
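As a small worked example of the formula, the sketch below applies it to made-up values. The vectors pf and count are hypothetical and simply stand in for one park factor and one raw count per row.

```r
# Hypothetical park factors; NA marks a missing value.
pf    <- c(110, 90, NA)
count <- c(30, 30, 30)

# Missing park factors default to the average value of 100.
pf[is.na(pf)] <- 100

# adjusted = count / (PF / 200 + .5)
adjusted <- count / (pf / 200 + 0.5)
```

A park factor of 110 gives a divisor of 1.05, so the count is scaled down; a factor of 90 gives 0.95, so it is scaled up; and a factor of 100 gives a divisor of 1, leaving the count unchanged.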

Extraneous columns are discarded, and the data are then returned to wide format using spread from the tidyr package.
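The wide-to-long-to-wide round trip might look roughly like the sketch below. It assumes tidyr is installed. The column names HR, BB, stat, count and PF, and the pf_lookup vector, are hypothetical stand-ins; in particular, pf_lookup replaces the pfs data.frame, whose real layout is not described here.

```r
library(tidyr)

# Hypothetical wide data: one row per player, one column per statistic.
wide <- data.frame(
  MLBID = c(101, 102),
  HR    = c(20, 35),
  BB    = c(50, 40)
)

# Wide -> long: one row per player-statistic pair.
long <- gather(wide, key = "stat", value = "count", HR, BB)

# Hypothetical park factor per statistic (a stand-in for the pfs argument).
pf_lookup <- c(HR = 110, BB = 100)
long$PF <- unname(pf_lookup[long$stat])

# Apply the documented formula to every row.
long$adjusted <- long$count / (long$PF / 200 + 0.5)

# Drop the helper columns, then long -> wide again.
long <- long[, c("MLBID", "stat", "adjusted")]
adjusted_wide <- spread(long, key = stat, value = adjusted)
```

Working in long format means the adjustment is written once and applied to every statistic, rather than repeated per column.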

Value

tbl_df of statistics that have been adjusted for park factors.

Examples

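# Remember the working directory so it can be restored afterwards.
curr_wd <- getwd()
setwd("N:/Apps/simScoresApp/data")
# Read the cleaned batting statistics, park factors, and batter handedness.
stats <- read.csv("1-cleaned/batters/pia.csv", header = TRUE, stringsAsFactors = FALSE)
pfs <- read.csv("manual-info/Park_Factors.csv", header = TRUE, stringsAsFactors = FALSE)
b_bio <- read.csv("manual-info/bio_bat.csv", header = TRUE, stringsAsFactors = FALSE)
# Adjust the batting statistics for park effects.
x <- adjust_park_factors(stats, pfs, "bat", b_bio)
# Restore the original working directory.
setwd(curr_wd)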

guytuori/simScores documentation built on May 17, 2019, 9:29 a.m.