This package contains scripts for administering the holistic review process in graduate admissions. Holistic review entails evaluating the "total applicant" - that is, the experiences, abilities, and contributions of the person applying, rather than a test score or two. Ultimately this package will include functions for reviewer assignments and applicant ranking.
Two populations underlie any admissions scenario: Reviewers are the people who read through application materials and score each application. Applicants comprise the full body of students who applied to join the program in a particular calendar year. In the Graduate Group in Ecology at UC Davis (GGE), faculty and current graduate students can be reviewers.
ggeAdmit
is the product of years of work by the Admissions and Awards subcommittee of the GGE's Diversity Committee. The current version of the workflow would not have been possible without the help of generations of both students and faculty in numerous departments affiliated with the GGE.
To install, use the devtools
package and clone from github. This may be possible using devtools:
devtools::install_github("ggeDCAA/ggeAdmit")
You may run into an error about package dependencies; you can install dependencies using install.packages("AlgDesign")
and install.packages("dplyr")
if you don't already have them.
If the devtools approach fails, installation can also be done using a command line approach. There are different protocols for Mac and Windows.
$ git clone https://github.com/ggeDCAA/ggeAdmit
$ R CMD Build ggeAdmit
$ R CMD INSTALL ggeAdmit_0.0.1.9000.tar.gz
>git clone https://github.com/ggeDCAA/ggeAdmit
>R CMD INSTALL --build ggeAdmit
If properly installed, the package can then be imported in R using:
library(ggeAdmit)
Reviewers assignment uses an incomplete block design to statistically "spread" reviewers across applicants. Because the GGE allows both faculty and current graduate students to serve as reviewers, we suggest performing two reviewer assignment steps: one series of matching faculty reviewers with applicants, and one series matching graduate student reviewers with applicants. This spreads the influence of faculty vs student reviews approximately evenly across the applicant population.
Reviewer assignment can follow a relatively straightforward workflow. First, import a reviewers and applicants dataset. The reviewers dataset should columns with Name ("Lastname, Firstname"), Email, and type (type indicates "Student" or "Faculty"). The applicants dataset should contain columns Name
("Lastname, Firstname"), Email
, and then 6 columns for Faculty Members with whom applicants are interested in working (each called Faculty.Member.i
where i takes a value from 1 to 6). Note: The names for Faculty.Member.i
should be in the format "Lastname, Firstname" and those names MUST match the names in the reviewers dataset. For example, a link between Reviewer Charles Darwin and Applicant's Faculty.Member.1 interest "Darwin, Charlie" cannot be inferred by ggeAdmit. For your convenience, we include a sample dataset of reviewers and applicants (all names and email addresses are random combinations of letters).
## Import data
# Access built-in sample data. All installations of ggeAdmit should
# put the sample data in the same location, which can be identified
# using the `system.file()` function here.
reviewersFP = system.file("extdata",
"sampleReviewers.csv",
package = "ggeAdmit")
applicantsFP = system.file("extdata",
"sampleApplicants.csv",
package = "ggeAdmit")
reviewers = read.csv(reviewersFP, stringsAsFactors = FALSE)
applicants = read.csv(applicantsFP, stringsAsFactors = FALSE)
A demographic summary is required to inform the algorithmic incomplete block design. Specifically, we need to know how many reviewers in each type there are, and how many applicants comprise the applicant pool. Every application is reviewed!
## Demographic summary - necessary for incomplete block design
# How many reviewers are there?
n.sr <- sum(reviewers$type == "Student")
n.fr <- sum(reviewers$type == "Faculty")
# How many applicants are there?
n.applics <- length(unique(applicants$Name))
# Applicant names (we will need this down the line)
applicantNames = applicants$Name
Before getting to reviewer assignments, it is important to verify that name conventions used by faculty who signed up to review follow the same approach as the names indicated by applicants as potential faculty of interest. Use the checkPI()
function to verify that faculty of interest listed by applicants are present in the reviewers dataset. If checkPI()
returns NULL
, all faculty of interest are present in the faculty sign-up list. If some faculty are "missing" (or misspelled, etc.) a vector of missing persons is returned.
checkPI(reviewersDF = reviewers, applicantsDF = applicants)
Some faculty of interest are not present in reviewers dataset
[1] "Eilers, Mike" "Ali, Matthwe"
Here, we can see that two names listed as faculty of interest by applicants are not present in the reviewer sign-up dataset. This reveals two potential issues: First, the GGE requires that faculty who are accepting students must sign up to review applications during that year's admissions cycle. If a listed faculty of interest is not a reviewer (but plans on accepting students), they are breaking rules! Second, there may be spelling or name convention issues, which is what appears to be the case here. You will have to sift through the applicants
dataset and manually correct mismatched names. After taking a closer look, we find that Mike Eilers used "Michael" when he signed up to review, and one applicant used the more common "Mike" when listing that faculty member on their application. We can also see that in one case, an applicant misspelled the name of a potential faculty of interest, Matthew Ali.
Use assignReviewers()
to pair reviewers with applicants. Recall that we suggest doing this separately for Student reviewers and Faculty reviewers. For GGE reviewer assignments, it is likely that assignReviewers()
will return a warning message:
"In assignReviewers(number.alternatives = n.applics, number.blocks = n.sr, :It is recommended that number.blocks >= 3 * number.alternatives / alternatives.per.block"
Because of how many applications need to be reviewed, and how many people typically sign up to review, and how many times applicants will be reviewed, and how many applications we can reasonably expect a reviewer to review, it's usually impossible to achieve this recommendation. Optimal designs can be found more quickly when that condition is satisfied, but we have to get by with the resources that are available to us.
The incomplete block design takes a long time to complete! For an actual admissions scenario we recommend at least 200-300 replicates based on an analysis of assignment optimality across different replicate sizes. The default value for nReps
is 300. However, to simplify and expedite the example code, we set nReps = 10
here.
## Run the incomplete block design for grad student reviewers
student.design <- assignReviewers(number.alternatives = n.applics,
number.blocks = n.sr,
nReps = 10)
## Run the incomplete block design for faculty reviewers
faculty.design <- assignReviewers(number.alternatives = n.applics,
number.blocks = n.fr,
nReps = 10)
assignReviewers()
returns a list of 6 elements. For the simple purpose of assigning reviewers to applicants, the element $design
is of the most immediate importance. Use the function getAssignments()
to extract identities of applicants ("Alternatives") and reviewers ("Blocks") from the incomplete block design.
## Extract identifying information from the incomplete block design reviewer assignment
s.des.names = getAssignments(x = student.design,
appNames = applicantNames)
f.des.names = getAssignments(x = faculty.design,
appNames = applicantNames)
Compile the reviewer assignments to a single cohesive datset.
## Combine Student and Faculty outputs to generate the complete reviewer assignments design
complete.design = combineOutputs(reviewers = reviewers,
assignments1 = s.des.names,
assignments2 = f.des.names)
Checking for conflicts of interest is post-hoc and can be useful for identifying cases where manual shifting of reviewer assignments may be required. To check for conflicts of interest:
## Compile all conflicts of interest
library(dplyr)
conflictsOfInterest = applicants %>%
select(Name,
`Faculty.Member.1`, `Faculty.Member.2`,
`Faculty.Member.3`, `Faculty.Member.4`,
`Faculty.Member.5`, `Faculty.Member.6`)
## Identify cases where COI pairs were assigned by the incomplete block design
conflictsSummary = summarizeConflicts(COIs = conflictsOfInterest,
ASSIGNs = complete.design)
conflictsSummary
Other summaries of the data can be useful to inspect. For example, tabulate the number of student and faculty reviewers assigned to each applicant:
## Summarize the reviewer count per application
n.revs.per.app <- data.frame(appID = 1:n.applics,
appName = applicantNames,
student.revs = apply(student.design$binary.design,1,sum),
faculty.revs = apply(faculty.design$binary.design,1,sum))
Don't forget to save outputs! assignReviewers()
takes a long time and it would be a shame to lose all of your hard work.
## Save outputs
write.table(complete.design, "Assignments_GGE.csv",
row.names = FALSE, col.names = TRUE, sep = ",")
write.table(conflictsOfInterest, "applicant_COIs.csv",
row.names = FALSE, col.names = TRUE, sep = ",")
write.table(conflictsSummary, "Assignments_COIs.csv",
row.names = FALSE, col.names = TRUE, sep = ",")
write.table(n.revs.per.app, "n_revs_per_app.csv",
row.names = FALSE, col.names = TRUE, sep = ",")
Applicants are ranked using the holistic scores generated by faculty and student reviewers. This component of the ggeAdmit
library is still in development.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.