parseSchools: Function to Match Error Prone Free-text to Standard School...
In brewdata: Extracting Usable Data from the Grad Cafe Results Search

Description Usage Arguments Value See Also Examples

parseSchools finds the best matching school name among several possible spellings and abbreviations. Matches are based on three stages of parsing: stage (1) standardizes the text by removing common typos and spelling errors, stage (2) manually searches for common name variations for the same school, and stage (3) uses an automated text processing algorithm to match the closest school name from a standardized list.
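The first two stages can be pictured with a minimal sketch. This is not brewdata's actual code: the helpers `normalize_name` and `lookup_alias` and the `aliases` table are hypothetical names introduced here for illustration only, and the cleaning rules are assumptions about what "standardizing" might involve.

```r
# Hypothetical illustration of stages 1 and 2; not brewdata's implementation.

# Stage 1 (assumed rules): lower-case, drop punctuation, tidy whitespace.
normalize_name <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9 ]+", " ", x)  # replace punctuation such as "--" with a space
  x <- gsub("\\s+", " ", x)         # collapse runs of whitespace
  trimws(x)
}

# Stage 2 (assumed table): known variation -> standard name.
aliases <- c(
  "uc berkeley" = "university of california berkeley",
  "berkeley"    = "university of california berkeley"
)

lookup_alias <- function(x) {
  hit <- x %in% names(aliases)       # which entries are known variations
  x[hit] <- unname(aliases[x[hit]])  # swap them for the standard spelling
  x
}

x <- c("University of California--Berkeley", "UC  Berkeley", "berkeley")
lookup_alias(normalize_name(x))
```

In this sketch all three inputs end up with the same spelling. A misspelling such as "berkly" would pass through both steps unchanged, which is the case the third stage is meant to catch.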

parseSchools( original_name, resolution = 10, map=FALSE )

`original_name`	`original_name` denotes an Nx1 vector of university names read from the Grad Cafe.
`resolution`	`resolution` controls the precision required before an original name is replaced with the best standardized equivalent. Very low values (0-5) are cautious selections that lead to fewer mismatches but sparser results. Medium-range values (8-12) lead to surprisingly accurate replacements when the other processing stages fail; expect a few mismatched name replacements, but the number of errors should be fairly low. Finally, large values (more than 20) practically guarantee that a school name which is not in the standard dictionary will be replaced with something. Be wary of such large selections; the potential for many mismatched replacements is high. For the test set, the bulk of the nearest matches were within 10 units of the original value, and almost none were larger than 30. The default value is 10.
`map`	`map` is a variable controlling whether or not the original school names are included in the data frame returned by brewdata(). If map=TRUE, then the returned data includes the parsed names as well as the original. The default value is map=FALSE.
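The effect of `resolution` can be illustrated with a small sketch. The documentation does not say which distance the package uses, so this example substitutes Levenshtein edit distance from base R's `utils::adist` as a stand-in; `match_closest` and `standard_names` are hypothetical names, and the numbers are not comparable to brewdata's own units.

```r
# Hypothetical sketch: edit distance stands in for the package's matching score.
standard_names <- c("university of california berkeley",
                    "stanford university",
                    "university of michigan")

match_closest <- function(name, standard, resolution = 10) {
  d <- utils::adist(name, standard)[1, ]  # edit distance to each standard name
  best <- which.min(d)                    # index of the closest candidate
  # Replace only when the closest candidate is within the threshold;
  # otherwise keep the original name untouched.
  if (d[best] <= resolution) standard[best] else name
}

match_closest("university of california berkly", standard_names, resolution = 10)
match_closest("hogwarts school of witchcraft", standard_names, resolution = 5)
match_closest("hogwarts school of witchcraft", standard_names, resolution = 50)
```

Under these assumptions the misspelled Berkeley entry is replaced, the unknown school is left alone at a low threshold, and the same unknown school is forced onto some dictionary entry at a very large threshold, which is the mismatch risk described for large values.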

`school_name`	`school_name` is the name of the university corresponding to the row of data. parseSchools normalizes the names reported on the website.

findScorePercentile, parseResults, parseSchools, translateScore, getGradCafeData, getMaxPages

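# Four spellings of the same school, including a typo and two short forms
x <- c( "university of california--berkeley", "university of california--berkly",
        "uc berkeley", "berkeley" )
# Match each entry to a standard school name using the default resolution of 10
parseSchools( x )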

brewdata documentation built on May 29, 2017, 4:28 p.m.
