stuatt | R Documentation |
A synthetic dataset of student attributes from the Strategic Data Project. It deliberately includes records with errors, so that users can practice cleaning data and implementing business rules that enforce consistency.
stuatt
A data frame with 87534 observations on the following 9 variables.
sid
a numeric vector of the unique student ID
school_year
a numeric vector of the school year
male
a numeric vector indicating 1 = male
race_ethnicity
a factor with levels A, B, H, M/O, W
birth_date
a numeric vector of the student birthdate
first_9th_school_year_reported
a numeric vector of the first year a student is reported in 9th grade
hs_diploma
a numeric vector
hs_diploma_type
a factor with levels Alternative Diploma, College Prep Diploma, Standard Diploma
hs_diploma_date
a factor with levels 12/2/2008, 12/21/2008, 4/14/2008, 4/18/2008, ...
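Because hs_diploma_date is stored as a factor whose levels are month/day/year strings, it usually needs converting before dates can be compared or sorted. The sketch below shows one possible way to do that in base R. The vector is a hypothetical stand-in built only from the four levels listed above, not the real column.

```r
# Hypothetical stand-in: a factor built from the four levels shown above.
hs_diploma_date <- factor(c("12/2/2008", "12/21/2008", "4/14/2008", "4/18/2008"))

# Going through as.character() makes it explicit that the level labels,
# not the underlying integer codes, are what gets parsed.
diploma_date <- as.Date(as.character(hs_diploma_date), format = "%m/%d/%Y")

# The result is a Date vector, so it sorts chronologically.
sort(diploma_date)
```

With the real data, the same call would be applied to stuatt$hs_diploma_date. Any value that does not match the assumed format becomes NA, which is itself a useful signal when cleaning.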
This is the uncleaned version of the data, provided so that users can implement business rules to clean it.
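As an illustration of what a business rule might look like, the sketch below flags students whose male value differs across school years. The rule itself ("one value of male per student") and the toy data frame are assumptions made for this example. The toy data reuses column names from stuatt, but its values are invented.

```r
# Toy stand-in for stuatt: same column names, invented values.
toy <- data.frame(
  sid         = c(1, 1, 2, 2, 3),
  school_year = c(2005, 2006, 2005, 2006, 2005),
  male        = c(1, 1, 0, 1, 0)
)

# Count the distinct values of `male` recorded for each student ID.
n_male_values <- tapply(toy$male, toy$sid, function(x) length(unique(x)))

# Student IDs that violate the assumed rule of one value per student.
inconsistent_sids <- as.numeric(names(n_male_values)[n_male_values > 1])

# Rows for those students, for inspection before choosing how to resolve them.
flagged <- toy[toy$sid %in% inconsistent_sids, ]
flagged
```

How a conflict is resolved (for example, keeping the most frequent or the most recent value) is a separate decision that this sketch does not make.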
Available from the Strategic Data Project online at https://sdp.cepr.harvard.edu/toolkit-effective-data-use
Visit the Strategic Data Project online at: https://sdp.cepr.harvard.edu/
data(stuatt)
head(stuatt)