merge_surnames: Surname probability merging function.
In wru: Who are You? Bayesian Prediction of Racial Category Using Surname, First Name, Middle Name, and Geolocation

merge_surnames

R Documentation

Surname probability merging function.

Description

merge_surnames merges surnames in a user-supplied dataset with the corresponding race/ethnicity probabilities from the U.S. Census Surname List and the Spanish Surname List.

Usage

merge_surnames(
  voter.file,
  surname.year = 2020,
  name.data,
  clean.surname = TRUE,
  impute.missing = TRUE
)

Arguments

`voter.file`	An object of class `data.frame`. Must contain a field named 'surname' holding the surnames to be merged with the Census lists.
`surname.year`	An object of class `numeric` indicating which year the Census Surname List is from. Accepted values include `2010` and `2000`; the default is `2020`.
`name.data`	An object of class `data.frame`. Must contain a leading column of surnames, and 5 subsequent columns, with Pr(Race | Surname) for each of the five major racial categories.
`clean.surname`	A `TRUE`/`FALSE` object. If `TRUE`, any surnames in `voter.file` that cannot initially be matched to surname lists will be cleaned, according to U.S. Census specifications, in order to increase the chance of finding a match. Default is `TRUE`.
`impute.missing`	A `TRUE`/`FALSE` object. If `TRUE`, race/ethnicity probabilities will be imputed for unmatched names using race/ethnicity distribution for all other names (i.e., not on Census List). Default is `TRUE`.
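
As an illustration of the input shapes described above, the following sketch builds a toy `voter.file` and `name.data` in base R. The surnames and probabilities are made up for demonstration and do not come from any Census list. The object names `voter_file` and `name_data` and the `voter_id` column are hypothetical, and the probability column names simply reuse the `p_whi`, ..., `p_oth` naming from the Value section; whether the function requires those exact names is an assumption.

```r
# Toy voter file: only the 'surname' field is required by the documentation;
# 'voter_id' is a hypothetical extra column.
voter_file <- data.frame(
  voter_id = 1:3,
  surname = c("Smith", "Garcia", "Nguyen"),
  stringsAsFactors = FALSE
)

# Toy name data: a leading surname column followed by five probability
# columns. All values are invented for illustration.
name_data <- data.frame(
  surname = c("SMITH", "GARCIA", "NGUYEN"),
  p_whi = c(0.70, 0.05, 0.01),
  p_bla = c(0.23, 0.01, 0.00),
  p_his = c(0.02, 0.92, 0.01),
  p_asi = c(0.01, 0.01, 0.96),
  p_oth = c(0.04, 0.01, 0.02),
  stringsAsFactors = FALSE
)

# Sanity checks on the shapes described in the Arguments section.
has_surname_field <- "surname" %in% names(voter_file)
has_six_columns <- ncol(name_data) == 6
rows_sum_to_one <- all(abs(rowSums(name_data[, -1]) - 1) < 1e-8)
```

Checking that each row of probabilities sums to one before merging is a cheap way to catch a mis-ordered or truncated `name.data` table.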

Details

This function allows users to match surnames in their dataset with the U.S. Census Surname List (from 2000 or 2010) and Spanish Surname List to obtain Pr(Race | Surname) for each of the five major racial groups.

By default, the function matches surnames to the Census list as follows:

1. Search raw surnames in Census surname list;
2. Remove any punctuation and search again;
3. Remove any spaces and search again;
4. Remove suffixes (e.g., Jr) and search again;
5. Split double-barreled surnames into two parts and search first part of name;
6. Split double-barreled surnames into two parts and search second part of name;
7. For any remaining names, impute probabilities using distribution for all names not appearing on Census list.

Each step only applies to surnames not matched in a previous step. Steps 2 through 7 are not applied if clean.surname is FALSE.
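
The first six matching steps above can be sketched in base R as a cascade that stops at the first variant found in a lookup list. This is a minimal illustration, not the package's implementation: `census_names` is a made-up stand-in for the Census list, `match_surname` is a hypothetical helper, and the upper-casing, the choice to keep hyphens during punctuation removal, and the suffix set (JR, SR, II, III, IV) are all assumptions.

```r
# Hypothetical toy lookup standing in for the Census surname list.
census_names <- c("SMITH", "GARCIA", "OBRIEN", "DELACRUZ", "LOPEZ")

# Hypothetical helper: return the first cleaned variant of `surname`
# that appears in `lookup`, or NA if no variant matches.
match_surname <- function(surname, lookup = census_names) {
  s1 <- toupper(trimws(surname))            # step 1: raw surname (upper-cased)
  s2 <- gsub("[^A-Z -]", "", s1)            # step 2: drop punctuation, keep hyphens
  s3 <- gsub(" ", "", s2, fixed = TRUE)     # step 3: drop spaces
  s4 <- sub("(JR|SR|III|II|IV)$", "", s3)   # step 4: drop a trailing suffix
  parts <- strsplit(s4, "-", fixed = TRUE)[[1]]
  s5 <- if (length(parts) >= 2) parts[1] else NA_character_  # step 5: first part
  s6 <- if (length(parts) >= 2) parts[2] else NA_character_  # step 6: second part
  candidates <- c(s1, s2, s3, s4, s5, s6)
  hit <- candidates[!is.na(candidates) & candidates %in% lookup]
  if (length(hit) > 0) hit[1] else NA_character_
}

inputs <- c("O'Brien", "De La Cruz", "Smith Jr.", "Garcia-Lopez",
            "Nowak-Lopez", "Zzyzx")
matched <- vapply(inputs, match_surname, character(1), USE.NAMES = FALSE)
print(data.frame(surname = inputs, surname.match = matched))
```

Because the candidates are tried in order, a name that matches in its raw form is never altered by the later, more aggressive cleaning steps. Names that remain unmatched (here "Zzyzx") are the ones step 7 would handle by imputation.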

Note: Any name appearing only on the Spanish Surname List is assigned a probability of 1 for Hispanics/Latinos and 0 for all other racial groups.

Value

Output will be an object of class data.frame. It will consist of the original user-input data with additional columns that specify the part of the name matched with Census data (surname.match), and the probabilities Pr(Race | Surname) for each racial group (p_whi for White, p_bla for Black, p_his for Hispanic/Latino, p_asi for Asian and Pacific Islander, and p_oth for Other/Mixed).
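
To make the output shape concrete, here is a rough base R sketch that mimics it with a left join on toy data and a fallback distribution for unmatched names. It does not call the package. The probabilities, the `fallback` vector, and the names `voters_toy` and `lookup` are invented for illustration, and using `NA` in surname.match for unmatched rows is an assumption rather than documented behaviour.

```r
# Toy inputs (all values made up).
voters_toy <- data.frame(surname = c("SMITH", "ZZYZX"),
                         stringsAsFactors = FALSE)
lookup <- data.frame(surname = "SMITH",
                     p_whi = 0.70, p_bla = 0.23, p_his = 0.02,
                     p_asi = 0.01, p_oth = 0.04,
                     stringsAsFactors = FALSE)

# Hypothetical distribution used for names not found in the lookup.
fallback <- c(p_whi = 0.60, p_bla = 0.10, p_his = 0.15,
              p_asi = 0.10, p_oth = 0.05)

# Left join keeps every voter; unmatched rows get NA probabilities.
out <- merge(voters_toy, lookup, by = "surname", all.x = TRUE, sort = FALSE)

# Record which name was matched, then impute the unmatched rows.
unmatched <- is.na(out$p_whi)
out$surname.match <- ifelse(unmatched, NA_character_, out$surname)
for (col in names(fallback)) {
  out[unmatched, col] <- fallback[[col]]
}

print(out)
```

Each row of the result carries the original data plus surname.match and the five probability columns, and every row's probabilities sum to one after imputation.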

Examples

data(voters)
## Not run: try(merge_surnames(voters))

wru documentation built on May 29, 2024, 9:46 a.m.
