knitr::opts_chunk$set(fig.width = 6, fig.height = 5,
                      fig.align = 'center')

The earthquake package is a small R package for cleaning, timelining, and mapping NOAA Significant Earthquake data. It was built to satisfy the requirements of the Capstone project for the five-course Mastering Software Development in R specialization on Coursera.

Installing earthquake

The earthquake package is installed from GitHub, so you must first install the devtools package. Then download and install the earthquake package with the following commands:

devtools::install_github('ZYDI/Earthquake')
library(earthquake)

Necessary Packages

To work through these examples, the following packages need to be installed and loaded:

library(earthquake) 
library(dplyr)
library(ggplot2)
library(readr)
library(lubridate)
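
Before attaching the packages above, it can help to check which of them are actually available. The snippet below is a minimal sketch using only base R; the package list is copied from this section (plus devtools for installation), and nothing is installed automatically.

```r
# Packages the examples in this vignette rely on, plus devtools for installation.
needed <- c("devtools", "dplyr", "ggplot2", "readr", "lubridate")

# requireNamespace() returns TRUE when a package can be loaded; it does not attach it.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
not_installed <- needed[!installed]

if (length(not_installed) > 0) {
  message("Install these first: ", paste(not_installed, collapse = ", "))
} else {
  message("All required packages are available.")
}
```

Any package reported as missing can be installed from CRAN with install.packages() before continuing.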

Loading Data

The NOAA Significant Earthquake dataset, as it existed in April 2019, is provided with this package under the data set name quakes. If you wish to use data updated after this date, please see the dataset documentation (?earthquake::quakes) for the source.

Load and Clean Package Data

To load the quakes data, simply use the earthquake::quakes command (the earthquake:: invocation is necessary because R comes loaded with another data set called quakes).

Then call the eq_clean_data and eq_location_clean commands to "clean" some of the variables in the data set for use with the visualization tools in this package. The usage of these commands, plus the "tail" of the cleaned data set, is shown below.

quakes <- earthquake::quakes # load the quakes data set bundled with the package
quakes <- quakes %>%
  eq_clean_data() %>%
  eq_location_clean()
tail(quakes)
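
The reason for the earthquake:: prefix can be seen with base R alone. The sketch below inspects the built-in quakes data set from the datasets package; it does not touch the earthquake package, and the check for COUNTRY and DATE columns is only there to illustrate that the two data sets are unrelated.

```r
# Base R's built-in `quakes` (datasets package) is a different data set entirely.
base_quakes <- datasets::quakes
names(base_quakes)

# The columns used throughout this vignette are absent from the built-in data,
# so an unqualified `quakes` would silently pick up the wrong object.
has_cols <- c("COUNTRY", "DATE") %in% names(base_quakes)
has_cols
```

Assigning earthquake::quakes to a local variable, as done above, makes every later reference to quakes unambiguous.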

Single Step, Using Package Data

To save some typing, the eq_load_clean_data command loads the quakes data and performs both cleaning steps in a single call:

quakes <- eq_load_clean_data()
tail(quakes)

Load and Clean Downloaded Data

If you wish to use NOAA Significant Earthquake data updated after April 2019, please visit the NOAA Significant Earthquakes site at https://www.ngdc.noaa.gov/nndc/struts/form?t=101650&s=1&d=1 and download the data to your working directory. You can then load and clean the data for analysis using the following sequence of commands (replace filename with the location of your file):

filename <- system.file('extdata', 'earthquakes.txt', package = 'earthquake')
quakes_from_raw <- readr::read_delim(filename, delim = '\t')
quakes_from_raw_clean <- quakes_from_raw %>%
  eq_clean_data() %>%
  eq_location_clean()
tail(quakes_from_raw_clean)
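
To give a feel for what "cleaning" could involve, here is a rough base R sketch. It is not the package's implementation: the functions sketch_clean_data and sketch_location_clean are hypothetical, and the assumptions that the raw data has YEAR, MONTH, and DAY columns and that LOCATION_NAME looks like "COUNTRY: PLACE" are ours, not confirmed by the package documentation. The sample rows are made up.

```r
# Made-up rows in the assumed raw layout.
raw <- data.frame(
  YEAR = c(2010, 2011),
  MONTH = c(1, 2),
  DAY = c(12, 21),
  LOCATION_NAME = c("HAITI: PORT-AU-PRINCE", "NEW ZEALAND: CHRISTCHURCH"),
  stringsAsFactors = FALSE
)

# Hypothetical: combine YEAR, MONTH, DAY into a single Date column.
sketch_clean_data <- function(df) {
  df$DATE <- as.Date(sprintf("%04d-%02d-%02d",
                             as.integer(df$YEAR),
                             as.integer(df$MONTH),
                             as.integer(df$DAY)))
  df
}

# Hypothetical: drop the "COUNTRY: " prefix and convert the rest to title case.
sketch_location_clean <- function(df) {
  place <- tolower(sub("^[^:]*:\\s*", "", df$LOCATION_NAME))
  df$LOCATION_NAME <- gsub("\\b([a-z])", "\\U\\1", place, perl = TRUE)
  df
}

cleaned <- sketch_location_clean(sketch_clean_data(raw))
cleaned$LOCATION_NAME  # "Port-Au-Prince" "Christchurch"
```

The real eq_clean_data and eq_location_clean functions may handle more cases (for example missing months or days), so treat this only as an illustration of the idea.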

Plot Timeline of Data

This package includes a "timeline" capability to visualize countries' significant earthquakes. The timeline is drawn by a ggplot2 geom called geom_timeline. It shows a country's significant earthquakes over time, with point color mapped to the number of deaths and point size mapped to Richter scale magnitude. The timeline plots dates on the x-axis and any number of countries stacked on the y-axis.

There is a second ggplot2 geom in this package, called geom_timeline_label, that will label the strongest earthquakes on the timeline for each country.

Additionally, a ggplot2 theme, theme_eq, is provided with this package, which will make your charts much more attractive (in our humble opinion!).

Timeline Geoms

Load clean data to be used for all charts below:

quakes <- eq_load_clean_data()

The following example shows how to make a simple timeline for a single country. Notice that the quakes data is first filtered on the COUNTRY and DATE variables. For best results, use the aesthetic mappings shown here:

quakes %>%
  dplyr::filter(COUNTRY == 'USA') %>%
  dplyr::filter(DATE > '2000-01-01') %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS,
                    size = EQ_PRIMARY)) +
  scale_size_continuous(name = 'Richter scale value') +
  scale_color_continuous(name = '# of Deaths')

To create a basic timeline with two countries, simply change how you filter the COUNTRY value:

quakes %>%
  dplyr::filter(COUNTRY %in% c('USA', 'UK')) %>%
  dplyr::filter(DATE > '2000-01-01') %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS,
                    size = EQ_PRIMARY)) +
  scale_size_continuous(name = 'Richter scale value') +
  scale_color_continuous(name = '# of Deaths')
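
For readers who want to see the aesthetic mapping without the package at hand, the same mapping can be approximated with stock ggplot2. This is only a stand-in: geom_point replaces geom_timeline, and the toy data frame contains made-up values with placeholder country names.

```r
library(ggplot2)

# Made-up values; "A" and "B" are placeholder countries.
toy <- data.frame(
  DATE = as.Date(c("2001-03-01", "2006-07-15", "2012-11-30", "2009-05-20")),
  COUNTRY = c("A", "A", "A", "B"),
  EQ_PRIMARY = c(5.2, 6.8, 6.1, 7.0),
  TOTAL_DEATHS = c(0, 12, 3, 40)
)

# Same aesthetics as the timeline examples, drawn with a standard point geom.
p <- ggplot(toy, aes(x = DATE, y = COUNTRY,
                     colour = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  geom_point(alpha = 0.6) +
  scale_size_continuous(name = "Richter scale value") +
  scale_colour_continuous(name = "# of Deaths")
```

Calling print(p) draws the chart. The package's geom_timeline presumably adds timeline-specific styling on top of this basic idea.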

Timeline and Label Geoms

To add labels to the timelines, add the geom_timeline_label geom with the aesthetic mappings shown here:

quakes %>%
  dplyr::filter(COUNTRY %in% c('NEW ZEALAND', 'SOUTH AFRICA')) %>%
  dplyr::filter(DATE > '2000-01-01', DATE < '2015-01-01') %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS,
                    size = EQ_PRIMARY)) +
  geom_timeline_label(aes(x = DATE, y = COUNTRY, magnitude = EQ_PRIMARY,
                         label = LOCATION_NAME, n_max = 5)) +
  scale_size_continuous(name = 'Richter scale value') +
  scale_color_continuous(name = '# of Deaths')
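
Labelling "the strongest earthquakes for each country" amounts to a top-n selection within groups. The base R sketch below shows one way to do that selection; top_n_by_country is a hypothetical helper, not a function from this package, and the data is made up.

```r
# Made-up values; "A" and "B" are placeholder countries.
toy <- data.frame(
  COUNTRY = c("A", "A", "A", "B", "B"),
  LOCATION_NAME = c("a1", "a2", "a3", "b1", "b2"),
  EQ_PRIMARY = c(5.1, 7.2, 6.3, 4.8, 6.9),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the n_max largest-magnitude rows within each country.
top_n_by_country <- function(df, n_max) {
  groups <- split(df, df$COUNTRY)
  picked <- lapply(groups, function(g) {
    head(g[order(g$EQ_PRIMARY, decreasing = TRUE), ], n_max)
  })
  out <- do.call(rbind, picked)
  rownames(out) <- NULL
  out
}

labelled <- top_n_by_country(toy, n_max = 2)
labelled$LOCATION_NAME  # "a2" "a3" "b2" "b1"
```

How geom_timeline_label performs this selection internally is not documented here, so the sketch only illustrates the expected result of the n_max setting.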

Timeline and Label Geoms with theme_eq

To make the charts look slightly more attractive, use the included ggplot2 theme: theme_eq.

quakes %>%
  dplyr::filter(COUNTRY %in% c('NEW ZEALAND', 'SOUTH AFRICA')) %>%
  dplyr::filter(DATE > '2000-01-01', DATE < '2015-01-01') %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS,
                    size = EQ_PRIMARY)) +
  geom_timeline_label(aes(x = DATE, y = COUNTRY, magnitude = EQ_PRIMARY,
                         label = LOCATION_NAME, n_max = 5)) +
  scale_size_continuous(name = 'Richter scale value') +
  scale_color_continuous(name = '# of Deaths') +
  theme_eq()

Use Wrapper Function for Ease of Use

The examples above take a lot of typing to produce a chart. To shorten this, the package includes an eq_timeline wrapper function that builds a chart, with or without labels, using the default aes values and theme.

One Country, No Labels

Here is an example with one country and no labels on the earthquakes.

quakes %>% eq_timeline(countries = 'NEW ZEALAND', 
                       date_min = as.Date('1995-01-01'),
                       date_max = as.Date('2015-01-01'),
                       label_n = 0)

Multiple Countries, Labels

Here is an example of multiple countries and up to 5 labels per country.

quakes %>% eq_timeline(countries = c('NEW ZEALAND', 'HAITI'),
                       date_min = '2000-01-01', 
                       date_max = '2015-01-01',
                       label_n = 5)
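
The countries, date_min, and date_max arguments correspond to the filtering done by hand in the earlier examples. The sketch below shows what such a filtering step might look like in base R. The function filter_quakes is hypothetical, the inclusive date bounds are an assumption, and the data is made up; the real eq_timeline may behave differently.

```r
# Made-up values; "A" and "B" are placeholder countries.
toy <- data.frame(
  COUNTRY = c("A", "A", "B", "A"),
  DATE = as.Date(c("1999-06-01", "2005-03-15", "2010-09-04", "2016-11-13")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: accept dates as character or Date and coerce them once.
filter_quakes <- function(df, countries, date_min, date_max) {
  date_min <- as.Date(date_min)
  date_max <- as.Date(date_max)
  keep <- df$COUNTRY %in% countries &
    df$DATE >= date_min &
    df$DATE <= date_max
  df[keep, , drop = FALSE]
}

one_country <- filter_quakes(toy, "A", "2000-01-01", as.Date("2015-01-01"))
two_countries <- filter_quakes(toy, c("A", "B"), "2000-01-01", "2015-01-01")
nrow(one_country)    # 1
nrow(two_countries)  # 2
```

Coercing the bounds with as.Date in one place lets callers pass either strings or Date objects, which matches the two calling styles shown above.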

Plot Map

Finally, this package includes functions for creating an interactive map of the earthquakes using the leaflet package.

Basic Annotations

Call the eq_map function with cleaned and filtered data, and specify an annotation column with the argument annot_col:

quakes %>%
  dplyr::filter(COUNTRY == 'JAPAN') %>%
  dplyr::filter(lubridate::year(DATE) >= 2000) %>%
  eq_map(annot_col = 'DATE')
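
For orientation, a map of this kind can be sketched directly with the leaflet package. This is not the eq_map implementation: the column names LONGITUDE and LATITUDE are assumptions about the cleaned data, the coordinates are made up, and the marker styling is arbitrary.

```r
library(leaflet)

# Made-up coordinates and values.
toy <- data.frame(
  LONGITUDE = c(139.7, 141.0),
  LATITUDE = c(35.7, 38.3),
  EQ_PRIMARY = c(6.2, 7.1),
  DATE = as.Date(c("2004-10-23", "2011-04-07"))
)

# One circle per earthquake, radius scaled by magnitude, popup showing the date.
m <- leaflet(data = toy) %>%
  addTiles() %>%
  addCircleMarkers(lng = ~LONGITUDE, lat = ~LATITUDE,
                   radius = ~EQ_PRIMARY,
                   popup = ~as.character(DATE))
```

Printing m in an interactive session opens the map; each circle shows its popup text when clicked.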

More Useful Annotations

If you'd like more informative annotations, first call quakes <- quakes %>% dplyr::mutate(popup_text = eq_create_label(.)) to create a data column containing formatted HTML that describes each earthquake, and then pass that column to the eq_map function:

quakes %>%
  dplyr::filter(COUNTRY == 'MEXICO') %>%
  dplyr::filter(lubridate::year(DATE) >= 2000) %>%
  dplyr::mutate(popup_text = eq_create_label(.)) %>%
  eq_map(annot_col = 'popup_text')
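
To illustrate what "formatted HTML" for a popup might look like, here is a small base R sketch. The function sketch_label is hypothetical, and the choice of fields and markup is an assumption; the actual output of eq_create_label may differ.

```r
# Hypothetical helper: build one HTML string per row from three columns.
sketch_label <- function(df) {
  paste0(
    "<b>Location:</b> ", df$LOCATION_NAME, "<br/>",
    "<b>Magnitude:</b> ", df$EQ_PRIMARY, "<br/>",
    "<b>Total deaths:</b> ", df$TOTAL_DEATHS
  )
}

# Made-up row.
toy <- data.frame(
  LOCATION_NAME = "Sample Town",
  EQ_PRIMARY = 6.5,
  TOTAL_DEATHS = 3,
  stringsAsFactors = FALSE
)

label <- sketch_label(toy)
label
# "<b>Location:</b> Sample Town<br/><b>Magnitude:</b> 6.5<br/><b>Total deaths:</b> 3"
```

Because paste0 is vectorized, the same function yields one label per row when used inside dplyr::mutate.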


DYZI/Earthquake_doc documentation built on May 27, 2019, 2:05 p.m.