predict_race | R Documentation |
predict_race makes probabilistic estimates of individual-level race/ethnicity.
predict_race(
  voter.file,
  census.surname = TRUE,
  surname.only = FALSE,
  surname.year = 2010,
  census.geo,
  census.key = NULL,
  census.data = NULL,
  age = FALSE,
  sex = FALSE,
  year = "2010",
  party = NULL,
  retry = 3,
  impute.missing = TRUE,
  use.counties = FALSE,
  model = "BISG",
  race.init = NULL,
  name.dictionaries = NULL,
  names.to.use = "surname",
  control = NULL
)
voter.file |
An object of class |
census.surname |
A |
surname.only |
A |
surname.year |
A number to specify the year of the census surname statistics.
These surname statistics are stored in the data and will be loaded automatically.
The default value is |
census.geo |
An optional character vector specifying what level of
geography to use to merge in U.S. Census geographic data. Currently
|
census.key |
A character object specifying the user's Census API
key. Required if |
census.data |
A list indexed by two-letter state abbreviations,
which contains pre-saved Census geographic data.
Can be generated using |
age |
An optional |
sex |
optional |
year |
An optional character vector specifying the year of U.S. Census geographic
data to be downloaded. Use |
party |
An optional character object specifying party registration field
in |
retry |
The number of times to retry the Census website if a network interruption occurs. |
impute.missing |
Logical, defaults to TRUE. Should missing values be imputed? |
use.counties |
A logical, defaulting to FALSE. Should census data be filtered by counties available in census.data? |
model |
Character string, either "BISG" (default) or "fBISG" (the error-correcting, fully Bayesian model). |
race.init |
Vector of initial race for each observation in voter.file.
Must be an integer vector, with 1=white, 2=black, 3=hispanic, 4=asian, and
5=other. Defaults to values obtained using |
name.dictionaries |
Optional named list of |
names.to.use |
One of 'surname', 'surname, first', or 'surname, first, middle'. Defaults to 'surname'. |
control |
List of control arguments only used when
|
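The integer coding described for race.init (1=white, 2=black, 3=hispanic, 4=asian, 5=other) can be produced from text labels with a lookup vector. The following is a minimal base-R sketch; the `labels` vector and the names `codes` and `race_init` are hypothetical and are not part of the package.

```r
# Hypothetical text labels, one per row of a voter file (invented for illustration).
labels <- c("white", "hispanic", "asian", "black", "other", "white")

# Lookup table following the coding given for race.init.
codes <- c(white = 1L, black = 2L, hispanic = 3L, asian = 4L, other = 5L)

# Map each label to its integer code and drop the names.
race_init <- unname(codes[labels])

# Guard against labels that are missing from the lookup table.
stopifnot(is.integer(race_init), !anyNA(race_init))
race_init
```

A vector built this way has one integer per observation, which is the shape the race.init description asks for.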
This function implements the Bayesian race prediction methods outlined in Imai and Khanna (2015). The function produces probabilistic estimates of individual-level race/ethnicity, based on surname, geolocation, and party.
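To make the Bayesian update concrete, here is a toy sketch of the kind of calculation that combines surname and geolocation: a surname-based prior P(race | surname) is multiplied by P(geography | race) and renormalized. All numbers are invented for illustration, the variable names are hypothetical, and this is not the package's internal code.

```r
# Hypothetical prior P(race | surname); values are made up, not Census figures.
p_race_given_surname <- c(whi = 0.05, bla = 0.01, his = 0.90, asi = 0.02, oth = 0.02)

# Hypothetical P(geography | race): share of each group living in the area.
p_geo_given_race <- c(whi = 0.002, bla = 0.001, his = 0.004, asi = 0.001, oth = 0.002)

# Bayes' rule, assuming surname and geography are independent given race.
unnormalized <- p_race_given_surname * p_geo_given_race
posterior <- unnormalized / sum(unnormalized)

round(posterior, 3)
```

Adding party, age, or sex would multiply in further conditional terms in the same way before normalizing.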
Output will be an object of class data.frame. It will consist of the original user-input voter.file with additional columns with predicted probabilities for each of the five major racial categories: pred.whi for White, pred.bla for Black, pred.his for Hispanic/Latino, pred.asi for Asian/Pacific Islander, and pred.oth for Other/Mixed.
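As an illustration of working with the five probability columns, the sketch below builds a small stand-in data frame with invented values and derives the most likely category per row. The `id` column and the names `pred`, `prob_cols`, `row_totals`, and `most_likely` are hypothetical; only the pred.* column names come from the description above.

```r
# Hypothetical stand-in for predict_race() output; all values are invented.
pred <- data.frame(
  id       = 1:3,
  pred.whi = c(0.05, 0.03, 0.10),
  pred.bla = c(0.02, 0.01, 0.03),
  pred.his = c(0.03, 0.02, 0.80),
  pred.asi = c(0.85, 0.90, 0.03),
  pred.oth = c(0.05, 0.04, 0.04)
)
prob_cols <- c("pred.whi", "pred.bla", "pred.his", "pred.asi", "pred.oth")

# In this toy data each row of probabilities sums to 1.
row_totals <- rowSums(pred[, prob_cols])

# Column with the highest probability in each row, with the "pred." prefix removed.
best <- prob_cols[max.col(as.matrix(pred[, prob_cols]), ties.method = "first")]
pred$most_likely <- sub("^pred\\.", "", best)
pred
```

Collapsing to a single label discards the uncertainty in the estimates, so keeping the full probability columns is usually preferable for downstream analysis.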
data(voters)
try(predict_race(voter.file = voters, surname.only = TRUE))

## Not run: 
try(predict_race(voter.file = voters, census.geo = "tract", census.key = "..."))
## End(Not run)

## Not run: 
try(predict_race(
  voter.file = voters, census.geo = "place", census.key = "...", year = "2020"))
## End(Not run)

## Not run: 
# Download tract-level Census data once, then reuse it.
CensusObj <- try(get_census_data("...", state = c("NY", "DC", "NJ")))
try(predict_race(
  voter.file = voters, census.geo = "tract", census.data = CensusObj, party = "PID"))
## End(Not run)

## Not run: 
# Condition on age and sex as well.
CensusObj2 <- try(get_census_data(
  key = "...", state = c("NY", "DC", "NJ"), age = TRUE, sex = TRUE))
try(predict_race(
  voter.file = voters, census.geo = "tract", census.data = CensusObj2,
  age = TRUE, sex = TRUE))
## End(Not run)

## Not run: 
# Use place-level geography.
CensusObj3 <- try(get_census_data(
  key = "...", state = c("NY", "DC", "NJ"), census.geo = "place"))
try(predict_race(voter.file = voters, census.geo = "place", census.data = CensusObj3))
## End(Not run)