
nameage

Description

This package uses the U.S. Social Security Administration's baby names dataset and actuarial tables to estimate the age of an American based on their first name. It uses the datasets collected in the babynames package and follows the same general format as the gender package.

The most famous example of using names to estimate age probably comes from FiveThirtyEight.

Installation

To install from GitHub, use the following commands.

# install.packages("devtools")
devtools::install_github("andland/nameage")

Using the package

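The main function is nameage(), which takes a vector of names as its first argument. There are two additional arguments, both used in the examples below: base_year, the reference year at which ages are computed, and age_range, which restricts the summary to people within a range of ages.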

The function returns a data frame with one row for each name it can find. Each row summarizes the age distribution for that name: the mean, standard deviation, first quartile, median, and third quartile. It also includes the number of people born with the name and an estimate of the number still alive in the reference year.
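The summary described above can be pictured as weighted statistics over birth cohorts, where each cohort is weighted by its expected number of survivors. The following is a minimal sketch in base R under stated assumptions, not the package's actual implementation: the births and survival probabilities are made-up toy numbers, and age_summary() and its column names are hypothetical.

```r
# Toy inputs (made up for illustration, not SSA data): births per birth year
# for one name, and the assumed probability that someone born in that year
# is still alive in the reference year.
cohorts <- data.frame(
  year      = c(1930, 1950, 1970, 1990, 2010),
  births    = c(1000, 800, 400, 200, 600),
  surv_prob = c(0.20, 0.70, 0.90, 0.97, 0.99)
)

# Hypothetical helper: summarise the age distribution at `base_year`.
age_summary <- function(cohorts, base_year) {
  age   <- base_year - cohorts$year
  alive <- cohorts$births * cohorts$surv_prob  # expected survivors per cohort
  w     <- alive / sum(alive)                  # weight of each age

  # Weighted quantile: smallest age whose cumulative weight reaches p.
  wquantile <- function(p) {
    ord <- order(age)
    age[ord][which(cumsum(w[ord]) >= p)[1]]
  }

  mean_age <- sum(w * age)
  data.frame(
    mean    = mean_age,
    sd      = sqrt(sum(w * (age - mean_age)^2)),
    q25     = wquantile(0.25),
    median  = wquantile(0.50),
    q75     = wquantile(0.75),
    n_born  = sum(cohorts$births),
    n_alive = sum(alive)
  )
}

res <- age_summary(cohorts, base_year = 2015)
print(res)
```

Weighting by expected survivors rather than by births is what makes the estimate depend on the reference year: older cohorts contribute less as their survival probability falls.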

To start off, we will get the age of some names as of 2015. The names argument is not case sensitive.

library(nameage)
names <- c("Ava", "liam", "Jack", "ELLA", "gertrude", "elmer", "Violet")

nameage(names, base_year = 2015)

The average age of people with a given name depends on the reference year. People named Violet were, in general, much older in 1990 than they are today.

nameage(names, base_year = 1990)

To look at just working-age adults, restrict the age range.

nameage(names, base_year = 2015, age_range = c(18, 65))
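One way to picture what restricting the age range does: keep only the ages inside the range and recompute the statistics over the people who remain. The sketch below uses made-up counts; restrict_ages() is a hypothetical helper, and treating the range endpoints as inclusive is an assumption rather than something confirmed about the package.

```r
# Toy estimate of people alive by age for one name (made-up numbers).
alive_by_age <- data.frame(
  age     = c(5, 25, 45, 65, 85),
  n_alive = c(594, 194, 360, 560, 200)
)

# Hypothetical helper: keep ages within [lo, hi] (inclusive, by assumption).
restrict_ages <- function(d, age_range) {
  d[d$age >= age_range[1] & d$age <= age_range[2], , drop = FALSE]
}

working <- restrict_ages(alive_by_age, c(18, 65))

# Mean age among the remaining people, weighted by the estimated count alive.
mean_working <- weighted.mean(working$age, working$n_alive)
print(mean_working)
```

Because the youngest and oldest groups are dropped before averaging, the restricted mean can differ noticeably from the unrestricted one for names popular among children or the elderly.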

The package also includes a function, plot_nameage(), that plots the age distribution for each name. Besides a few arguments that control the plotting, it takes a type parameter that determines whether to plot by age...

plot_nameage(c("Joseph", "Anna"), type = "age")

or by year.

plot_nameage(c("Joseph", "Anna"), type = "year")

To do



andland/nameage documentation built on May 7, 2019, 8:57 p.m.