knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "##",
    dpi = 250,
    fig.path = "man/images/"
)

iebabynames: Full baby name data for Ireland

library(iebabynames)

Description

Full baby name data (1964--2023) for Ireland, gathered from the Central Statistics Office.

The package contains the dataset iebabynames with r format(nrow(iebabynames), big.mark = ",") observations on six variables: year, sex, name, n, rank, and prop, and prop_sex. Due to confidentiality reasons, only names with 3 or more instances in the relevant year are included.

The package can be used to explore patterns of baby names in Ireland over time. The dataset iebabynames is also very suitable for filtering, summarising and plotting variables in workshops or lectures.

The structure of the package follows the babynames package by Hadley Wickham and the ukbabynames package by Mine Çetinkaya-Rundel, Thomas Leeper, and Nicholas Goguen-Compagnoni.

Installation

The package is hosted on GitHub and not available at CRAN. To install the latest development version:

if (!require("remotes")) {
    install.packages("remotes")
}
remotes::install_github("stefan-mueller/iebabynames") 

Demonstration

# load packages
library(iebabynames)
library(dplyr)
library(ggplot2)
library(scales)

# set ggplot2 theme
theme_set(theme_bw())

head(iebabynames)

Get most popular names in 2023

dat_top_2023 <- iebabynames %>% 
    filter(year == "2023") %>% 
    group_by(sex) %>% 
    top_n(n = 10, wt = -rank) # get top 10

dat_top_2023

Plotting the 10 most frequent male and female names across the entire period

iebabynames_top <- iebabynames %>% 
    group_by(sex, name) %>% 
    summarise(n_total = sum(n)) %>% 
    top_n(n = 10, wt = n_total)

ggplot(iebabynames_top, aes(x = n_total,
                            y = reorder(name, n_total), 
                            fill = sex)) +
    geom_bar(stat = "identity") +
    geom_text(aes(label = name), nudge_x = -2000, 
              hjust = "right",
              colour = "white") +
    scale_x_continuous(labels = scales::comma_format()) +
    scale_fill_manual(values = c("#83D0F5", "#71B294")) +
    labs(title = "Frequency of Baby Names in Ireland (1964–2023)", 
         x = "Number of Babies",
         y = NULL) +
    theme(axis.text.y = element_blank(),
          axis.ticks.y = element_blank(),
          legend.title = element_blank(),
          legend.position = "bottom") 

Inspecting the development of selected names

ggplot(data = filter(iebabynames, name %in% c("John", "Mary")),
       aes(x = year, y = prop)) +
    geom_smooth(se = FALSE) +
    scale_x_continuous(breaks = c(seq(1963, 2023, 10))) +
    scale_y_continuous(labels = scales::percent,
                       breaks = c(seq(0, 0.06, 0.01))) +
    geom_point(alpha = 0.4) +
    facet_wrap(~name) +
    labs(x = NULL, y = "Percentage of Babies") 

Explore different variants of names

iebabynames_variants <- iebabynames %>% 
    filter(name %in% c("Aoife", "Aoibhe", "Eva",
                       "Eve"))

ggplot(data = iebabynames_variants,
       aes(x = year, y = n)) +
    geom_smooth(se = FALSE) +
    scale_x_continuous(breaks = c(seq(1963, 2023, 20))) +
    geom_point(alpha = 0.4) +
    facet_wrap(~name, nrow = 1) +
    labs(x = NULL, y = "Frequency") +
    theme(axis.text.x = element_text(angle = 90, vjust = 0.5))

The website of the Central Statistics Office includes an interactive interface that allows you to plot the frequency and rank of custom names.

How to cite

Stefan Müller (2024). iebabynames: Ireland Baby Names, 1964--2023. R package version 0.2.4. URL: http://github.com/stefan-mueller/iebabynames.

If you use the data, please also cite the CSO website: https://www.cso.ie/en/interactivezone/visualisationtools/babynamesofireland/.

Issues

Please file an issue (with a bug, wish list, etc.) via GitHub.



stefan-mueller/iebabynames documentation built on April 10, 2024, 7:18 a.m.