knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introduction

This vignette illustrates the use of the data set and utility functions included in the package packr. I initially collected this data set for use in my course GEOG 3LT3: Transportation Geography. As part of this course, students examine trends in transportation, including energy use and emissions. The objective of the practice is twofold:

  1. On the side of technology, the students are learning to work with R Notebooks and R. For this reason, all code is documented so that the students can see how things are done.

  2. On the side of transportation geography, the students are learning to discern trends in transportation.

Preliminaries

Load the packages used in this vignette:

library(packr)

Loading the data

To load the data, use the function data():

data("energy_and_emissions")

To inspect the data frame, use the function summary():

summary(energy_and_emissions)

The data frame consists of 10 variables. The variable definitions can be consulted in the help file:

?energy_and_emissions

Are population and oil consumption related?

The data frame includes information on population, GDP per capita, energy consumption, and emissions for the countries of the world. Energy consumption (bblpd, in barrels of oil per day) is the total for each country. We can plot population against energy consumption to see if there is a trend. We create a scatterplot with x = Population and y = bblpd, so that the values of population are mapped to the x-axis, and the values of energy consumption are mapped to the y-axis:

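# Simple scatterplot of population (x-axis) against oil consumption (y-axis)
plot(energy_and_emissions$Population,
     energy_and_emissions$bblpd,
     main = "Oil consumption vs. population",
     xlab = "Population",
     ylab = "Barrels of oil per day",
     pch = 19)  # pch = 19 draws solid circles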

Not surprisingly, there is a strong association between these two variables, since countries with large populations tend to consume more energy than countries with small populations. This is not very informative, because the underlying relationship is simply size.

What is the per capita consumption of oil by country?

Instead of exploring energy consumption by population, we will look at energy consumption per capita. This is a more informative variable, because it normalizes by size, and potentially can tell us something about the intensity and/or efficiency of energy use. However, energy consumption per capita is not one of the variables in the dataset. We need to divide the variable bblpd by Population to add this variable to the dataframe:

energy_and_emissions$EPC <- energy_and_emissions$bblpd/energy_and_emissions$Population
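The division above is element-wise, so each row receives its own per capita value. The following is a minimal sketch on a small toy data frame; the values are made up for illustration and only the column names (Country, Population, bblpd, EPC) follow the text. It also shows transform() as an equivalent form that avoids repeating the data frame name.

```r
# Toy data with made-up values; this is NOT the packr data set
toy <- data.frame(
  Country = c("A", "B", "C"),
  Population = c(1e6, 5e6, 2e7),
  bblpd = c(2e4, 5e4, 1e5),
  stringsAsFactors = FALSE
)

# Element-wise division: one per capita value per row
toy_a <- toy
toy_a$EPC <- toy_a$bblpd / toy_a$Population

# Equivalent form using transform(), which evaluates the
# expression with the columns of the data frame in scope
toy_b <- transform(toy, EPC = bblpd / Population)

toy_a$EPC
```

Both forms add the same column; which one to use is mostly a matter of taste.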

Check the descriptive statistics of EPC (energy consumption in barrels per day per person):

summary(energy_and_emissions$EPC)

The maximum consumption, in barrels per person per day, is given by round(max(energy_and_emissions$EPC), 2). Which country is that?

energy_and_emissions[energy_and_emissions$EPC == max(energy_and_emissions$EPC), "Country"]
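Subsetting with == max() works when the column has no missing values. Whether the packr data contain missing values in EPC is not stated here, so the sketch below is only a precaution: it uses a toy data frame with made-up values (including one NA) to show how max() behaves with missing values, and how which.max() and order() can be used to find the top country or rank all of them.

```r
# Toy data with made-up values; this is NOT the packr data set
toy <- data.frame(
  Country = c("A", "B", "C", "D"),
  EPC = c(0.02, NA, 0.005, 0.03),
  stringsAsFactors = FALSE
)

# max() returns NA when any value is missing, unless na.rm = TRUE
max(toy$EPC)
max(toy$EPC, na.rm = TRUE)

# which.max() discards missing values and returns the index
# of the first maximum
top <- toy$Country[which.max(toy$EPC)]

# Rank countries from highest to lowest EPC; missing values go last
ranked <- toy[order(toy$EPC, decreasing = TRUE), ]
head(ranked, 3)
```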

According to the data, the country with the highest per capita oil consumption in the world is Singapore.

Are GDP per capita and energy consumption per capita related?

To answer this question, we can create a scatterplot of the two variables:

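# Scatterplot of GDP per capita (x-axis) against energy consumption per capita (y-axis)
plot(energy_and_emissions$GDPPC,
     energy_and_emissions$EPC,
     main = "Energy consumption per capita vs. GDP per capita",
     xlab = "GDP per capita",
     ylab = "Energy consumption per capita (bblpd/population)",
     pch = 19)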

Calculate the correlation between these two variables:

cor(energy_and_emissions$GDPPC, energy_and_emissions$EPC)
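By default cor() computes Pearson's correlation and returns NA if either variable has a missing value. As a hedged sketch, the toy example below (made-up values, with column names GDPPC and EPC borrowed from the text) shows the use argument for dropping incomplete pairs, the rank-based Spearman alternative, and cor.test() for a confidence interval. None of this is part of the original analysis; it is one possible way to extend it.

```r
# Toy data with made-up values; this is NOT the packr data set
toy <- data.frame(
  GDPPC = c(1, 2, 3, 4, 5, NA),
  EPC = c(2.1, 3.9, 6.2, 7.8, 10.1, 12)
)

# With a missing value, the default call returns NA
r_default <- cor(toy$GDPPC, toy$EPC)

# Use only the rows where both variables are observed
r <- cor(toy$GDPPC, toy$EPC, use = "complete.obs")

# Spearman's rank correlation is less sensitive to outliers
# and to monotonic but non-linear relationships
rho <- cor(toy$GDPPC, toy$EPC, use = "complete.obs", method = "spearman")

# cor.test() drops incomplete pairs and reports a confidence interval
ct <- cor.test(toy$GDPPC, toy$EPC)
ct$conf.int
```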

There is a moderately strong correlation between these two variables.

What do we learn from this analysis? And how would you extend this analysis?



paezha/packr documentation built on Oct. 25, 2024, 8:16 p.m.