students: 10,000 Simulated Graduation Records


Description

students is a simulated dataset of 10,000 fabricated records of undergraduate students whose degrees were conferred by Georgia State University between Summer 2007 and Summer 2019. It was designed to mimic a 10,000-record SQL query export and consists of randomly generated GPAs, dates of birth, degrees, colleges, departments, and graduation terms that redistribute the unique values found in 10,000 real graduation records.

Usage

1

Format

A data frame with 10,000 rows and 10 variables.

Details

Variables SEX, RACE_CODES, ETHNIC_CODES, and GRAD_GPA were simulated using a combination of random sampling of real, unique values with replacement and random number generation. GRAD_GPA consists of random values between 1.8 and 3.3. DOB consists of randomly generated dates ranging from 1970 to 2003.
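The approach described above can be sketched in a few lines of base R. This is only an illustration, not the code used to build the dataset: the object names (sex_values, sim), the seed, the two-decimal rounding, and the exact date endpoints (January 1, 1970 through December 31, 2003) are assumptions made for the example.

```r
set.seed(2020)  # arbitrary seed, chosen only for this sketch
n <- 10000

# Hypothetical unique codes; the real values come from the source records
sex_values <- c("F", "M")

# Every calendar day in the assumed date-of-birth range
dob_range <- seq(as.Date("1970-01-01"), as.Date("2003-12-31"), by = "day")

sim <- data.frame(
  # Draw from the unique values with replacement
  SEX      = sample(sex_values, size = n, replace = TRUE),
  # Uniform random numbers between 1.8 and 3.3 (rounding is an assumption)
  GRAD_GPA = round(runif(n, min = 1.8, max = 3.3), 2),
  # Random dates drawn from the assumed range
  DOB      = sample(dob_range, size = n, replace = TRUE),
  stringsAsFactors = FALSE
)

str(sim)
```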

Unique combinations of COLLEGE, DEPARTMENT, MAJOR, and DEGREE found in the real records are preserved for realism and randomly sampled with replacement.

students$variable <- sample(students$variable, replace = FALSE)
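One way to keep COLLEGE, DEPARTMENT, MAJOR, and DEGREE consistent with each other is to sample whole rows from a lookup table of valid combinations rather than sampling each column separately. The sketch below assumes a small, made-up lookup table (programs) and is not taken from the package source. It also shows how the two settings of replace differ: sample(x, replace = FALSE) only shuffles the existing values, while replace = TRUE draws each value independently.

```r
# Hypothetical lookup of valid combinations; the real ones come from the source data
programs <- data.frame(
  COLLEGE    = c("Arts & Sciences", "Arts & Sciences", "Business"),
  DEPARTMENT = c("Biology", "English", "Finance"),
  MAJOR      = c("Biology", "English", "Finance"),
  DEGREE     = c("BS", "BA", "BBA"),
  stringsAsFactors = FALSE
)

set.seed(1)  # arbitrary seed for the example

# Sample row indices with replacement so each simulated record
# receives one complete, valid combination
idx <- sample(nrow(programs), size = 10, replace = TRUE)
sim_programs <- programs[idx, ]
rownames(sim_programs) <- NULL

# replace = FALSE permutes a vector: same values, same counts, new order
shuffled <- sample(sim_programs$DEGREE, replace = FALSE)
```

Because rows are drawn as a unit, every simulated record carries a combination that exists in the lookup table.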

Sampling and random number generation can be made reproducible by calling set.seed() beforehand.
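As a brief illustration of that reproducibility, resetting the seed to the same value before each call makes sample() and runif() return the same results. The seed value 42 and the variable names are arbitrary choices for this example.

```r
# Same seed, same draws from sample()
set.seed(42)
first_draw <- sample(1:100, size = 5)

set.seed(42)
second_draw <- sample(1:100, size = 5)

identical(first_draw, second_draw)  # TRUE

# The same holds for random number generation
set.seed(42)
first_gpa <- runif(3, min = 1.8, max = 3.3)

set.seed(42)
second_gpa <- runif(3, min = 1.8, max = 3.3)

identical(first_gpa, second_gpa)  # TRUE
```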

Source

GSU Data Warehouse: edwprd.sdmcohortfr_us

See Also

unique, seq, sample, set.seed


jamisoncrawford/panthr documentation built on March 9, 2020, 6:18 p.m.