mega_clean: Clean Williams Faculty Data Sets
In chenfr/williamsfacultyage: An Analysis of the Ages of Williams College Faculty

The "mega_clean" function cleans the Williams faculty data set for any year. In particular, it cleans the "department" column of the data frame and compiles the result in a new data frame called "mega_clean_df". Note that the data set must have the professor's title or degree earned in the second column of the data frame. We clean this column because it holds the full title of the professor (e.g. Professor of Philosophy), and the extra words confuse the filter() and group_by() functions in dplyr, since faculty in the same department can carry different titles. Our goal with this function is to clean this column so that only the department name (e.g. Philosophy) remains. I recommend using read.csv("C:/Users/Frankie/Desktop/facultyData2013.txt", header=FALSE, fill=TRUE, na.strings="NA") to read in the raw data.
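To make the goal of the cleaning step concrete, here is a minimal sketch in base R of reducing a title such as "Professor of Philosophy" to "Philosophy". It is not the package's actual implementation: the toy data frame, its column names V1 and V2, and the helper strip_title() are all assumptions made for illustration, and the real data may contain titles this pattern does not cover.

```r
# Hypothetical sketch, NOT the code inside mega_clean().
# Toy data in the assumed layout: name in column 1, title in column 2.
toy <- data.frame(
  V1 = c("A. Smith", "B. Jones", "C. Lee"),
  V2 = c("Professor of Philosophy",
         "Assistant Professor of Mathematics",
         "Associate Professor of Art"),
  stringsAsFactors = FALSE
)

# strip_title() is an illustrative helper: it drops everything up to and
# including the first "Professor of " so that only the department remains.
# Titles that do not contain "Professor of " are returned unchanged.
strip_title <- function(title) {
  trimws(sub("^.*?Professor of\\s+", "", title, perl = TRUE))
}

toy$department <- strip_title(toy$V2)
print(toy$department)
```

With a column like this, faculty of different ranks in one department share a single value, which is what filter() and group_by() need in order to treat them as one group.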

mega_clean(y)

y

The data set of Williams faculty to be cleaned by the function. Note that the data set must have the department name or professor title in the second column. See "facultyData" for a model of the type of parameter mega_clean() expects. Examples include facultyData, facultyData2013, etc. The data sets facultyData2011 through facultyData2014 have been preloaded for the user.

"mega_clean_df", the cleaned version of the data frame passed in as y.

chenfr/williamsfacultyage documentation built on May 13, 2019, 3:40 p.m.