knitr::opts_chunk$set(fig.width = 6, fig.height = 5, fig.align = 'center')
The earthquake package is a small R package for cleaning, timelining, and mapping NOAA Significant Earthquake data. This R package was built to satisfy the requirements of the Capstone project for the Coursera Mastering Software Development in R 5-course specialization.
To install the earthquake package, first install the devtools package. Then download and install earthquake with the following commands:
devtools::install_github('ZYDI/Earthquake')
library(earthquake)
To work through these examples, the following packages need to be installed and loaded:
library(earthquake)
library(dplyr)
library(ggplot2)
library(readr)
library(lubridate)
The NOAA Significant Earthquake dataset, as it existed in April 2019, is provided with this package under the data set name quakes. If you wish to use data updated after this date, please see the dataset documentation (?earthquake::quakes) for the source.
To load the quakes data, simply use the earthquake::quakes command (the earthquake:: invocation is necessary because R comes loaded with another data set called quakes).
Then call the eq_clean_data and eq_location_clean commands to "clean" some of the variables in the data set for use with the visualization tools in this package. The usage of these commands, plus the "tail" of the data set, is shown below.
quakes <- earthquake::quakes  # load the quakes data set provided with the package
quakes <- quakes %>%
  eq_clean_data() %>%
  eq_location_clean()
tail(quakes)
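As a quick check of the name clash mentioned above, the following sketch inspects the data set of the same name that ships with base R in the datasets package. It uses only base R; the variable name builtin_quakes is introduced here for illustration.

```r
# The built-in data set lives in the 'datasets' package and is unrelated
# to the NOAA Significant Earthquake data.
builtin_quakes <- datasets::quakes

# Its columns differ from the NOAA columns used throughout this vignette
# (for example, there is no COUNTRY or TOTAL_DEATHS column).
names(builtin_quakes)
```

Writing the package prefix explicitly (earthquake::quakes or datasets::quakes) removes any ambiguity about which object is being used.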
To save a couple steps of command typing, the eq_load_clean_data command will load the quakes data and do the cleaning steps all with a single command call:
quakes <- eq_load_clean_data()
tail(quakes)
If you wish to use NOAA Significant Earthquake data updated after April 2019, please visit the NOAA Significant Earthquakes site at https://www.ngdc.noaa.gov/nndc/struts/form?t=101650&s=1&d=1 and download the data to your working directory. You may then load and clean the data for analysis using the following sequence of commands (replace filename with the location of your file):
filename <- system.file('extdata', 'earthquakes.txt', package = 'earthquake')
quakes_from_raw <- readr::read_delim(filename, delim = '\t')
quakes_from_raw_clean <- quakes_from_raw %>%
  eq_clean_data() %>%
  eq_location_clean()
tail(quakes_from_raw_clean)
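Before running the cleaning functions on a freshly downloaded file, it can help to confirm that the file was parsed as tab-delimited and that the expected columns are present. The sketch below does this on a tiny made-up file. The three-column layout, the YEAR column name, and the required vector are assumptions for illustration only; the real NOAA file has many more columns.

```r
library(readr)

# Write a tiny tab-delimited file with made-up rows (illustration only)
tmp <- tempfile(fileext = ".txt")
writeLines(c("YEAR\tCOUNTRY\tEQ_PRIMARY",
             "2001\tCOUNTRY_A\t6.2",
             "2004\tCOUNTRY_B\t7.4"), tmp)

# Same reader call as above; show_col_types = FALSE only silences the message
raw <- readr::read_delim(tmp, delim = "\t", show_col_types = FALSE)

# Hypothetical sanity check: stop early if columns used later are missing
required <- c("COUNTRY", "EQ_PRIMARY")
stopifnot(all(required %in% names(raw)))
```

If the delimiter were wrong, the file would typically be read as a single column and the check would fail immediately rather than inside a later cleaning step.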
This package includes a "timeline" capability to visualize countries' significant earthquakes. The timeline, which is a ggplot2 geom called geom_timeline, when used correctly shows the timeline of a country's significant earthquakes, with points colored and sized by number of deaths and Richter scale strength, respectively. The timeline plots years on the x-axis and any number of countries stacked on the y-axis.
There is a second ggplot2 geom in this package, called geom_timeline_label, that will label the strongest earthquakes on the timeline for each country.
Additionally, a ggplot2 theme, theme_eq, is provided with this package, which will make your charts much more attractive (in our humble opinion!).
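To make the idea of the timeline concrete, here is a rough approximation built from plain ggplot2 pieces on made-up data. It is not how geom_timeline is implemented; it only illustrates the layout described above (dates on the x-axis, countries on the y-axis, point size for magnitude, point color for deaths). The toy data frame and its values are invented.

```r
library(ggplot2)

# Made-up values for illustration only
toy <- data.frame(
  DATE = as.Date(c("2001-05-01", "2006-09-12", "2011-02-22")),
  COUNTRY = c("A", "A", "B"),
  EQ_PRIMARY = c(5.5, 6.8, 6.1),
  TOTAL_DEATHS = c(0, 12, 185)
)

# geom_point stands in for geom_timeline in this sketch
p <- ggplot(toy, aes(x = DATE, y = COUNTRY)) +
  geom_point(aes(size = EQ_PRIMARY, color = TOTAL_DEATHS), alpha = 0.6) +
  scale_size_continuous(name = "Richter scale value") +
  scale_color_continuous(name = "# of Deaths")
p
```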
Load clean data to be used for all charts below:
quakes <- eq_load_clean_data()
The following example shows how to make a simple timeline geom for a single country. Notice that the quakes data must be filtered by COUNTRY and DATE variables. For best results, you should use the following aes = variable combinations:
x = DATE
y = COUNTRY
color = TOTAL_DEATHS
size = EQ_PRIMARY

quakes %>%
  dplyr::filter(COUNTRY == 'USA') %>%
  dplyr::filter(DATE > '2000-01-01') %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  scale_size_continuous(name = 'Richter scale value') +
  scale_color_continuous(name = '# of Deaths')
To create a basic timeline with two countries, simply change how you filter the COUNTRY value:
quakes %>%
  dplyr::filter(COUNTRY %in% c('USA', 'UK')) %>%
  dplyr::filter(DATE > '2000-01-01') %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  scale_size_continuous(name = 'Richter scale value') +
  scale_color_continuous(name = '# of Deaths')
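The filtering step used in the two examples above can be tried in isolation. The sketch below runs it on a small made-up data frame and assumes that DATE is stored as a Date after cleaning. The cutoff is converted explicitly with as.Date here, which makes the comparison type obvious.

```r
library(dplyr)

# Made-up rows for illustration only
toy <- data.frame(
  COUNTRY = c("USA", "USA", "UK", "JAPAN"),
  DATE = as.Date(c("1999-06-01", "2005-03-15", "2008-02-27", "2011-03-11")),
  stringsAsFactors = FALSE
)

# Single country: == compares against one value
one_country <- toy %>%
  dplyr::filter(COUNTRY == "USA", DATE > as.Date("2000-01-01"))

# Several countries: %in% tests membership in a vector
two_countries <- toy %>%
  dplyr::filter(COUNTRY %in% c("USA", "UK"), DATE > as.Date("2000-01-01"))
```

Here one_country keeps only the 2005 USA row, while two_countries keeps the 2005 USA row and the 2008 UK row.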
To add labels to the timelines, use the geom_timeline_label geom with the following aes = variable combinations:
x = DATE
y = COUNTRY
magnitude = EQ_PRIMARY
label = LOCATION_NAME
n_max = <integer, suggest 5>

quakes %>%
  dplyr::filter(COUNTRY %in% c('NEW ZEALAND', 'SOUTH AFRICA')) %>%
  dplyr::filter(DATE > '2000-01-01', DATE < '2015-01-01') %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  geom_timeline_label(aes(x = DATE, y = COUNTRY, magnitude = EQ_PRIMARY, label = LOCATION_NAME, n_max = 5)) +
  scale_size_continuous(name = 'Richter scale value') +
  scale_color_continuous(name = '# of Deaths')
To make the charts look slightly more attractive, use the included ggplot2 theme: theme_eq.
quakes %>%
  dplyr::filter(COUNTRY %in% c('NEW ZEALAND', 'SOUTH AFRICA')) %>%
  dplyr::filter(DATE > '2000-01-01', DATE < '2015-01-01') %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  geom_timeline_label(aes(x = DATE, y = COUNTRY, magnitude = EQ_PRIMARY, label = LOCATION_NAME, n_max = 5)) +
  scale_size_continuous(name = 'Richter scale value') +
  scale_color_continuous(name = '# of Deaths') +
  theme_eq()
All the above can be an awful lot of typing to get these charts. So, this package includes a handy eq_timeline wrapper function to make a nice labeled (or not!) chart with the default aes values and theme selection. This is much easier than typing in all of the above.
Here is an example with one country and no labels on the earthquakes.
quakes %>%
  eq_timeline(countries = 'NEW ZEALAND',
              date_min = as.Date('1995-01-01'),
              date_max = as.POSIXct('2015-01-01'),
              label_n = 0)
Here is an example of multiple countries and up to 5 labels per country.
quakes %>%
  eq_timeline(countries = c('NEW ZEALAND', 'HAITI'),
              date_min = '2000-01-01',
              date_max = '2015-01-01',
              label_n = 5)
Finally, this package includes functions for creating an interactive map of the earthquakes using the leaflet package.
Call the eq_map function with cleaned and filtered data, and specify an annotation column with the argument annot_col:
quakes %>%
  dplyr::filter(COUNTRY == 'JAPAN') %>%
  dplyr::filter(lubridate::year(DATE) >= 2000) %>%
  eq_map(annot_col = 'DATE')
If you'd like more useful annotations, first call quakes <- quakes %>% dplyr::mutate(popup_text = eq_create_label(.)) to create a data column with formatted HTML for more useful quakes information, and then call the eq_map function:
quakes %>%
  dplyr::filter(COUNTRY == 'MEXICO') %>%
  dplyr::filter(lubridate::year(DATE) >= 2000) %>%
  dplyr::mutate(popup_text = eq_create_label(.)) %>%
  eq_map(annot_col = 'popup_text')
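For readers curious about what "formatted HTML" for a popup might look like, here is a minimal sketch of a label builder written in base R. The function make_popup, its arguments, and its output format are hypothetical and are not the package's eq_create_label; the actual fields and markup used by the package may differ.

```r
# Hypothetical stand-in for a popup label builder (NOT eq_create_label)
make_popup <- function(location, magnitude, deaths) {
  paste0("<b>Location:</b> ", location, "<br/>",
         "<b>Magnitude:</b> ", magnitude, "<br/>",
         "<b>Total deaths:</b> ", deaths)
}

# Made-up values; paste0 is vectorized, so whole columns can be passed at once
popup <- make_popup("Example City", 6.5, 10)
popup
```

Because the result is a plain character vector of HTML, a column built this way could be stored with dplyr::mutate and passed by name, in the same way popup_text is passed to annot_col above.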