identify_missing: Find Missing Data


View source: R/clean-data.R

Description

Identifies players with missing data, records them, and saves files containing their MLBID so that the user can fill in the missing data later.

Usage

identify_missing(dat, ignore)

Arguments

dat

data.frame containing both statistics and bio data, so that missing values can be identified. Obtained from combine.

ignore

data.frame of the MLBID values to skip when identifying missing data. Must have a column named MLBID.
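
As an illustration of the shape ignore is expected to have, here is a minimal sketch. The IDs are made up, and storing them as character is an assumption taken from the Examples on this page, which convert MLBID with as.character.

```r
# Hypothetical MLBID values, for illustration only
ignore <- data.frame(MLBID = c("100001", "100002"), stringsAsFactors = FALSE)

# The one documented requirement: a column named MLBID
stopifnot("MLBID" %in% names(ignore))
```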

Value

A data.frame containing bio info for every player with missing data. In most cases only bio data is missing. If a batting or pitching statistic is missing, however, the affected player is identified as well.
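
The idea described above can be sketched in base R as follows. This is not the package's implementation: find_missing_sketch, the toy columns (Name, Height, HR), and the IDs are all hypothetical, and the real function also saves files, which this sketch does not.

```r
# Toy data standing in for the output of combine(); columns are hypothetical
dat <- data.frame(
  MLBID  = c("1", "2", "3", "4"),
  Name   = c("A", "B", "C", "D"),
  Height = c(72, NA, 74, NA),   # bio field with gaps
  HR     = c(10, 5, NA, 2),     # batting statistic with a gap
  stringsAsFactors = FALSE
)
ignore <- data.frame(MLBID = "4", stringsAsFactors = FALSE)

# Assumed logic: keep rows with at least one NA, then drop ignored IDs
find_missing_sketch <- function(dat, ignore) {
  has_na <- !complete.cases(dat)
  dat[has_na & !(dat$MLBID %in% ignore$MLBID), , drop = FALSE]
}

missing_players <- find_missing_sketch(dat, ignore)
missing_players$MLBID
```

In this toy input, players 2 and 3 are returned because each has one missing value, while player 4 is skipped because its MLBID appears in ignore.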

Examples

library(dplyr)  # provides %>%, mutate() and as_tibble()

# Switch to the data directory, remembering the current one so it can be restored
curr_wd <- getwd()
setwd("N:/Apps/simScoresApp/data")

# Read each input file, storing MLBID as character
stats <- read.csv("4-mles/bat-mles.csv", header = TRUE, stringsAsFactors = FALSE) %>% as_tibble() %>% mutate(MLBID = as.character(MLBID))
bio <- read.csv("manual-info/bio_bat.csv", header = TRUE, stringsAsFactors = FALSE) %>% as_tibble() %>% mutate(MLBID = as.character(MLBID))
pos <- read.csv("manual-info/positions.csv", header = TRUE, stringsAsFactors = FALSE) %>% as_tibble() %>% mutate(MLBID = as.character(MLBID))
dl <- read.csv("manual-info/injuries.csv", header = TRUE, stringsAsFactors = FALSE) %>% as_tibble() %>% mutate(MLBID = as.character(MLBID))
ignore <- read.csv("manual-info/players_to_ignore.csv", header = TRUE, stringsAsFactors = FALSE) %>% as_tibble() %>% mutate(MLBID = as.character(MLBID))

# Combine statistics with bio data, then list the players with missing data
x <- combine(stats, bio, pos, dl)
m <- identify_missing(x, ignore)

# Restore the original working directory
setwd(curr_wd)

guytuori/simScores documentation built on May 17, 2019, 9:29 a.m.