merge_surnames: Surname probability merging function.
In kosukeimai/wru: Who are You? Bayesian Prediction of Racial Category Using Surname, First Name, Middle Name, and Geolocation

merge_surnames

R Documentation

Surname probability merging function.

Description

merge_surnames merges the surnames in a user-supplied dataset with the corresponding race/ethnicity probabilities from the U.S. Census Surname List and the Spanish Surname List.

Usage

merge_surnames(
  voter.file,
  surname.year = 2020,
  name.data,
  clean.surname = TRUE,
  impute.missing = TRUE
)

Arguments

`voter.file`	An object of class `data.frame`. Must contain a field named 'surname' holding the surnames to be merged with the Census lists.
`surname.year`	An object of class `numeric` indicating which year the Census Surname List is from. Default is `2020`; `2010` and `2000` are also accepted.
`name.data`	An object of class `data.frame`. Must contain a leading column of surnames, followed by 5 columns with Pr(Race | Surname) for each of the five major racial categories.
`clean.surname`	A `TRUE`/`FALSE` object. If `TRUE`, any surnames in `voter.file` that cannot initially be matched to surname lists will be cleaned, according to U.S. Census specifications, in order to increase the chance of finding a match. Default is `TRUE`.
`impute.missing`	A `TRUE`/`FALSE` object. If `TRUE`, race/ethnicity probabilities will be imputed for unmatched names using the race/ethnicity distribution for all other names (i.e., names not on the Census List). Default is `TRUE`.
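
The shape of the two data-frame arguments can be sketched in base R. This is only an illustration: the surnames, the probability values, the column names of `toy_names`, and the upper-case spelling are assumptions. Only the `surname` field name and the layout of one surname column plus five probability columns come from the argument descriptions above.

```r
# Toy voter file: the only field required by the description is 'surname'.
toy_voters <- data.frame(
  id = 1:3,
  surname = c("Smith", "Garcia", "Nguyen"),
  stringsAsFactors = FALSE
)

# Toy name.data: a leading surname column followed by five columns of
# Pr(Race | Surname). Names and values are made up for illustration.
toy_names <- data.frame(
  surname = c("SMITH", "GARCIA", "NGUYEN"),
  p_whi = c(0.70, 0.05, 0.02),
  p_bla = c(0.23, 0.01, 0.01),
  p_his = c(0.02, 0.92, 0.01),
  p_asi = c(0.01, 0.01, 0.95),
  p_oth = c(0.04, 0.01, 0.01),
  stringsAsFactors = FALSE
)

# Basic shape checks. That each row sums to 1 is an assumption that
# follows from treating the five categories as exhaustive.
stopifnot(
  is.data.frame(toy_voters),
  "surname" %in% names(toy_voters),
  is.data.frame(toy_names),
  ncol(toy_names) == 6,
  all(abs(rowSums(toy_names[, -1]) - 1) < 1e-8)
)
```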

Details

This function allows users to match surnames in their dataset with the U.S. Census Surname List (for the year selected by surname.year) and the Spanish Surname List to obtain Pr(Race | Surname) for each of the five major racial groups.

By default, the function matches surnames to the Census list as follows:

1. Search raw surnames in Census surname list;
2. Remove any punctuation and search again;
3. Remove any spaces and search again;
4. Remove suffixes (e.g., Jr) and search again;
5. Split double-barreled surnames into two parts and search first part of name;
6. Split double-barreled surnames into two parts and search second part of name;
7. For any remaining names, impute probabilities using distribution for all names not appearing on Census list.

Each step applies only to surnames not matched in a previous step. Steps 2 through 7 are not applied if clean.surname is FALSE.
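
The matching steps listed above, leaving out the final imputation, can be sketched as a cascade in which each stage sees only the surnames that are still unmatched. This is a minimal base R sketch, not the package's implementation: `match_in_stages`, the toy list `known`, the suffix pattern, and the choice to split double-barreled names on hyphens or spaces are all assumptions made for illustration.

```r
# Hypothetical helper: try successively more aggressive cleaning rules and
# record the first cleaned form of each surname that appears in 'known'.
match_in_stages <- function(surnames, known, clean = TRUE) {
  raw <- toupper(surnames)
  no_punct <- function(x) gsub("[[:punct:]]", "", x)
  no_space <- function(x) gsub(" ", "", x, fixed = TRUE)
  # i-th part of a name split on hyphens or spaces (NA if there is none)
  part <- function(x, i) {
    vapply(strsplit(x, "[- ]+"), function(p) p[i], character(1))
  }
  stages <- list(
    function(x) x,                                      # raw surname
    function(x) no_punct(x),                            # drop punctuation
    function(x) no_space(no_punct(x)),                  # drop spaces
    function(x) no_space(sub(" +(JR|SR|II|III)$", "",   # drop a suffix
                             no_punct(x))),
    function(x) part(x, 1),                             # first part
    function(x) part(x, 2)                              # second part
  )
  # Without cleaning, only the raw search is performed.
  if (!clean) stages <- stages[1]

  matched <- rep(NA_character_, length(raw))
  for (stage in stages) {
    todo <- is.na(matched)            # only names not matched so far
    candidate <- stage(raw[todo])
    hit <- candidate %in% known
    matched[todo][hit] <- candidate[hit]
  }
  matched
}

# Toy surname list (made up) and a few inputs exercising each stage.
known <- c("SMITH", "OBRIEN", "DELACRUZ", "GARCIA")
input <- c("Smith", "O'Brien", "De La Cruz", "Smith Jr.",
           "Garcia-Lopez", "Lopez-Garcia", "Zzyzx")
result <- match_in_stages(input, known)
# result: SMITH, OBRIEN, DELACRUZ, SMITH, GARCIA, GARCIA, NA
```

Names that are still `NA` after the last stage are the ones that would be passed on to the imputation step.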

Note: Any name appearing only on the Spanish Surname List is assigned a probability of 1 for Hispanics/Latinos and 0 for all other racial groups.
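
How probabilities might be assigned once matching is finished can be sketched as follows. The sketch assumes that a name found on the Census list keeps its Census probabilities even if it is also on the Spanish list, that a name found only on the Spanish list gets 1 for Hispanic/Latino, and that everything else is filled from a single fallback distribution. The helper `assign_probs` and all of the toy lists and values are hypothetical.

```r
race_cols <- c("p_whi", "p_bla", "p_his", "p_asi", "p_oth")

# Made-up stand-ins for the Census list, the Spanish list, and the
# distribution used for names that appear on neither list.
toy_census <- data.frame(
  surname = c("SMITH", "GARCIA"),
  p_whi = c(0.70, 0.05), p_bla = c(0.23, 0.01), p_his = c(0.02, 0.92),
  p_asi = c(0.01, 0.01), p_oth = c(0.04, 0.01),
  stringsAsFactors = FALSE
)
toy_spanish <- c("GARCIA", "ZUBIZARRETA")
toy_fallback <- c(p_whi = 0.60, p_bla = 0.12, p_his = 0.18,
                  p_asi = 0.06, p_oth = 0.04)

# Hypothetical helper illustrating the three cases.
assign_probs <- function(surnames, census, spanish, fallback, impute = TRUE) {
  idx <- match(surnames, census$surname)
  probs <- as.matrix(census[idx, race_cols])  # NA rows for unmatched names
  rownames(probs) <- NULL
  unmatched <- is.na(idx)

  # Names only on the Spanish list: 1 for Hispanic/Latino, 0 elsewhere.
  spanish_only <- unmatched & surnames %in% spanish
  probs[spanish_only, ] <- 0
  probs[spanish_only, "p_his"] <- 1

  # Remaining names: fill from the fallback distribution if requested.
  rest <- unmatched & !spanish_only
  if (impute) {
    probs[rest, ] <- rep(fallback[race_cols], each = sum(rest))
  }
  data.frame(surname = surnames, probs, stringsAsFactors = FALSE)
}

res <- assign_probs(c("SMITH", "GARCIA", "ZUBIZARRETA", "ZZYZX"),
                    toy_census, toy_spanish, toy_fallback)
```

In this toy run, GARCIA keeps its list probabilities, ZUBIZARRETA is assigned `p_his = 1`, and ZZYZX receives the fallback values.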

Value

Output will be an object of class data.frame. It will consist of the original user-input data with additional columns that specify the part of the name matched with Census data (surname.match), and the probabilities Pr(Race | Surname) for each racial group (p_whi for White, p_bla for Black, p_his for Hispanic/Latino, p_asi for Asian and Pacific Islander, and p_oth for Other/Mixed).
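
One way to work with the returned columns is sketched below. The data frame `toy_out` only mimics the documented column names with made-up values, and the `top_race` column is an illustrative addition, not something the function returns.

```r
# Toy data frame mimicking the documented output columns (values made up).
toy_out <- data.frame(
  surname = c("Smith", "Garcia-Lopez"),
  surname.match = c("SMITH", "GARCIA"),
  p_whi = c(0.70, 0.05), p_bla = c(0.23, 0.01), p_his = c(0.02, 0.92),
  p_asi = c(0.01, 0.01), p_oth = c(0.04, 0.01),
  stringsAsFactors = FALSE
)

race_cols <- c("p_whi", "p_bla", "p_his", "p_asi", "p_oth")

# For each row, find the column holding the largest probability.
top <- max.col(as.matrix(toy_out[, race_cols]), ties.method = "first")
toy_out$top_race <- race_cols[top]
```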

Examples

data(voters)
## Not run: try(merge_surnames(voters))

kosukeimai/wru documentation built on June 18, 2024, 8:48 a.m.
