firstnames: Ethnorace distribution over first names.

Description

A data set containing columns for (1) the probability of a first name given ethnorace and (2) the probability of ethnorace given a first name. First names are stored as uppercase character strings. Laplace smoothing has been applied to this data set: 1 has been added to the count in each ethnorace category for every name, so every cell has a non-zero probability.
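
The smoothing step and the two conditional probabilities can be illustrated with a minimal sketch. The counts, names, and category labels below are hypothetical and are not taken from the data set; the sketch only assumes that smoothing is applied to name-by-ethnorace counts before normalising, as the description suggests.

```r
# Hypothetical counts of people per first name (rows) and ethnorace (columns).
# These values are invented for illustration only.
counts <- matrix(
  c(8, 0, 2,
    0, 5, 5),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("ANNA", "JOSE"), c("white", "hispanic", "other"))
)

# Laplace smoothing: add 1 to every name-by-ethnorace cell,
# so no cell is left with a zero count.
smoothed <- counts + 1

# P(ethnorace | first name): normalise each row to sum to 1.
pr_race_given_name <- prop.table(smoothed, margin = 1)

# P(first name | ethnorace): normalise each column to sum to 1.
pr_name_given_race <- prop.table(smoothed, margin = 2)

# ANNA had zero Hispanic observations, yet the smoothed probability
# is 1/13 rather than 0.
pr_race_given_name["ANNA", "hispanic"]
```

In this toy example the column normalisation runs over only two names; in the actual data set it would run over every first name in the table.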

Usage

firstnames

Format

A data frame with 4251 rows and 13 variables:

birth_year: numeric

pr_hispanic_f: Probability Hispanic given first name

...

pr_f_hispanic: Probability first name given Hispanic

...

Source

Tzioumis, Konstantinos (2018). Demographic aspects of first names. Scientific Data, 5:180025. dx.doi.org/10.1038/sdata.2018.25. https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/TYJKEZ


bwilden/bperdata documentation built on Jan. 28, 2021, 1:41 p.m.