canlang

The goal of {canlang} is to make it easy to share language data collected in the 2016 Canadian census. The data was retrieved from the 2016 Canadian census data set using the {cancensus} R package.

This package contains three data sets:

  1. can_lang: For each language, contains the number of Canadians who report it as their mother tongue, as the language they speak most often at home, as the language they use most often at work, and as a language they have knowledge of.

  2. region_lang: For each census metropolitan area (CMA), contains the number of Canadians who report each language as their mother tongue, as the language they speak most often at home, as the language they use most often at work, and as a language they have knowledge of.

  3. region_data: For each census metropolitan area (CMA), contains the number of households, land area, population, and number of dwellings.

Installation

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("ttimbers/canlang")

Example usage of can_lang

The data set can_lang contains, for each language, the number of Canadians who report it as their mother tongue, as the language they speak most often at home, as the language they use most often at work, and as a language they have knowledge of. This data was recorded in the 2016 Census:

library(canlang)
library(ggplot2)

# Preview the first rows of the data set
head(can_lang)

# Plot mother-tongue counts against home-use counts, one point per language,
# with both axes on a log10 scale
ggplot(data = can_lang,
       aes(x = most_at_home, y = mother_tongue,
           colour = category, shape = category)) +
  geom_point(alpha = 0.7) +
  scale_color_manual(values = c("blue3", "red3", "black")) +
  scale_y_log10(name = "Number of Canadians reporting the\nlanguage as their mother tongue",
                labels = scales::comma) +
  scale_x_log10(name = "Number of Canadians speaking the language\nas their primary language at home",
                labels = scales::comma) +
  annotation_logticks() +
  theme_bw()

Example usage of region_lang

For each census metropolitan area (CMA), the data set region_lang contains the number of Canadians who report each language as their mother tongue, as the language they speak most often at home, as the language they use most often at work, and as a language they have knowledge of.

library(canlang)
library(dplyr)
region_lang %>% 
    filter(region == "Vancouver") %>% 
    arrange(desc(mother_tongue)) %>% 
    head()

Example usage of region_data

For each census metropolitan area (CMA), the data set region_data contains the number of households, land area, population, and number of dwellings.

library(canlang)
library(dplyr)
region_data %>% 
    arrange(desc(population)) %>% 
    head()
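
The two region-level data sets can also be combined. The following is a minimal sketch rather than part of the package: it assumes that both tables share a region column, that region_lang has language and mother_tongue columns, and that region_data has a population column. The helper name mother_tongue_share and the example language "Mandarin" are illustrative assumptions.

```r
library(dplyr)

# Hypothetical helper (not part of canlang): for one language, compute the
# share of each region's population reporting it as their mother tongue.
# Assumes lang_df has columns region, language, mother_tongue and
# data_df has columns region, population.
mother_tongue_share <- function(lang_df, data_df, target_language) {
  lang_df %>%
    filter(language == target_language) %>%
    left_join(data_df, by = "region") %>%
    mutate(share = mother_tongue / population) %>%
    select(region, language, mother_tongue, population, share) %>%
    arrange(desc(share))
}

# Run against the packaged data only when canlang is installed
if (requireNamespace("canlang", quietly = TRUE)) {
  mother_tongue_share(canlang::region_lang, canlang::region_data, "Mandarin") %>%
    head() %>%
    print()
}
```

Joining on region keeps the language counts and the regional totals in separate tables until they are needed together, which is why the sketch takes both data frames as arguments.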

Plain text, Excel and SQLite database files

We have included several plain text files, an Excel file and a SQLite database file in this repo to be used for practice importing from these file types. Specifically, they are:

Canada-level

Census metropolitan area (CMA)-level
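
As a rough illustration of the kind of import practice these files are meant for, here is a minimal sketch. Because the bundled file names are not listed here, it does not read them: it writes a small toy table (made-up values) to temporary files and reads it back. It assumes the DBI and RSQLite packages are installed. An Excel file would typically be read with readxl::read_excel(), which is not exercised below.

```r
library(DBI)

# Toy table with made-up values, used only to keep the sketch self-contained
toy <- data.frame(language = c("Language A", "Language B"),
                  mother_tongue = c(10, 20))

# Plain text: write a CSV to a temporary file, then read it back
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)
from_csv <- read.csv(csv_path)

# SQLite: create a temporary database, store the table, then read it back
db_path <- tempfile(fileext = ".db")
con <- dbConnect(RSQLite::SQLite(), db_path)
dbWriteTable(con, "toy", toy)
tables <- dbListTables(con)         # names of the tables in the database
from_db <- dbReadTable(con, "toy")  # the whole table as a data frame
dbDisconnect(con)
```

To practice on the files in this repo, replace the temporary paths with the path to the file of interest and skip the writing steps.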

How this was made

The data-raw directory contains the scripts necessary to create everything in this package, including the R data objects and the plain text, Excel and SQLite database files.

References

Data originally published in:

Package development resources:



ttimbers/canlang documentation built on Sept. 8, 2020, 11:46 a.m.