knitr::opts_chunk$set( collapse = TRUE, comment = "##", dpi = 250, fig.path = "man/images/" )
library(iebabynames)
Full baby name data (1964--2023) for Ireland, gathered from the Central Statistics Office.
The package contains the dataset iebabynames
with r format(nrow(iebabynames), big.mark = ",")
observations on six variables: year
, sex
, name
, n
, rank
, and prop
, and prop_sex
. Due to confidentiality reasons, only names with 3 or more instances in the relevant year are included.
The package can be used to explore patterns of baby names in Ireland over time. The dataset iebabynames
is also very suitable for filtering, summarising and plotting variables in workshops or lectures.
The structure of the package follows the babynames package by Hadley Wickham and the ukbabynames package by Mine Çetinkaya-Rundel, Thomas Leeper, and Nicholas Goguen-Compagnoni.
The package is hosted on GitHub and not available at CRAN. To install the latest development version:
if (!require("remotes")) { install.packages("remotes") } remotes::install_github("stefan-mueller/iebabynames")
# load packages library(iebabynames) library(dplyr) library(ggplot2) library(scales) # set ggplot2 theme theme_set(theme_bw()) head(iebabynames)
dat_top_2023 <- iebabynames %>% filter(year == "2023") %>% group_by(sex) %>% top_n(n = 10, wt = -rank) # get top 10 dat_top_2023
iebabynames_top <- iebabynames %>% group_by(sex, name) %>% summarise(n_total = sum(n)) %>% top_n(n = 10, wt = n_total) ggplot(iebabynames_top, aes(x = n_total, y = reorder(name, n_total), fill = sex)) + geom_bar(stat = "identity") + geom_text(aes(label = name), nudge_x = -2000, hjust = "right", colour = "white") + scale_x_continuous(labels = scales::comma_format()) + scale_fill_manual(values = c("#83D0F5", "#71B294")) + labs(title = "Frequency of Baby Names in Ireland (1964–2023)", x = "Number of Babies", y = NULL) + theme(axis.text.y = element_blank(), axis.ticks.y = element_blank(), legend.title = element_blank(), legend.position = "bottom")
ggplot(data = filter(iebabynames, name %in% c("John", "Mary")), aes(x = year, y = prop)) + geom_smooth(se = FALSE) + scale_x_continuous(breaks = c(seq(1963, 2023, 10))) + scale_y_continuous(labels = scales::percent, breaks = c(seq(0, 0.06, 0.01))) + geom_point(alpha = 0.4) + facet_wrap(~name) + labs(x = NULL, y = "Percentage of Babies")
iebabynames_variants <- iebabynames %>% filter(name %in% c("Aoife", "Aoibhe", "Eva", "Eve")) ggplot(data = iebabynames_variants, aes(x = year, y = n)) + geom_smooth(se = FALSE) + scale_x_continuous(breaks = c(seq(1963, 2023, 20))) + geom_point(alpha = 0.4) + facet_wrap(~name, nrow = 1) + labs(x = NULL, y = "Frequency") + theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
The website of the Central Statistics Office includes an interactive interface that allows you to plot the frequency and rank of custom names.
Stefan Müller (2024). iebabynames: Ireland Baby Names, 1964--2023. R package version 0.2.4. URL: http://github.com/stefan-mueller/iebabynames.
If you use the data, please also cite the CSO website: https://www.cso.ie/en/interactivezone/visualisationtools/babynamesofireland/.
Please file an issue (with a bug, wish list, etc.) via GitHub.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.