knitr::opts_chunk$set(fig.width = 6, fig.height = 5, fig.align = 'center')
The earthquake package is a small R package for cleaning, plotting timelines of, and mapping NOAA Significant Earthquake data. It was built to satisfy the requirements of the capstone project for Coursera's five-course Mastering Software Development in R specialization.
To install the earthquake package, first install the devtools package. Then, to download and install the earthquake package, use the following commands:
devtools::install_github('ZYDI/Earthquake')
library(earthquake)
To work through these examples, the following packages need to be installed and loaded:
library(earthquake)
library(dplyr)
library(ggplot2)
library(readr)
library(lubridate)
The NOAA Significant Earthquake dataset, as it existed in April 2019, is provided with this package under the data set name quakes. If you wish to use data updated after this date, see the dataset documentation (?earthquake::quakes) for the source.
To load the quakes data, simply use the earthquake::quakes command (the earthquake:: prefix is necessary because R comes loaded with another data set called quakes).
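The name clash mentioned above can be seen without installing anything, because the other quakes data set ships with base R in the datasets package. The short check below only illustrates how the :: prefix selects one specific package's object; it does not touch the earthquake package.

```r
# The built-in data set that shares the name `quakes` lives in the
# `datasets` package, which is part of every standard R installation.
builtin_quakes <- datasets::quakes

# It describes seismic events near Fiji and is unrelated to the NOAA data.
dim(builtin_quakes)     # 1000 rows, 5 columns
names(builtin_quakes)   # "lat" "long" "depth" "mag" "stations"
```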
Then call the eq_clean_data and eq_location_clean functions to clean some of the variables in the data set for use with the visualization tools in this package. The usage of these functions, plus the tail of the data set, is shown below.
quakes <- earthquake::quakes  # load the quakes data set provided with the package
quakes <- quakes %>%
  eq_clean_data() %>%
  eq_location_clean()
tail(quakes)
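The source does not spell out what the two cleaning functions do internally. As a rough, hypothetical sketch of the kind of work such a step typically involves (building a DATE column from separate date parts, converting coordinates to numeric, and tidying LOCATION_NAME), the base R code below uses invented rows and a made-up helper named sketch_clean. The raw column names YEAR, MONTH, DAY, LATITUDE, and LONGITUDE are assumptions, and this is not the package's implementation.

```r
# Invented rows; the column names YEAR, MONTH, DAY, LATITUDE and LONGITUDE
# are assumptions about the raw file layout.
raw <- data.frame(
  YEAR = c(2001, 2010),
  MONTH = c(3, 11),
  DAY = c(9, 30),
  LATITUDE = c("10.5", "-20.25"),
  LONGITUDE = c("30.0", "45.75"),
  LOCATION_NAME = c("COUNTRYA: SOME TOWN", "COUNTRYB: ANOTHER PLACE"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not a function from the earthquake package.
sketch_clean <- function(df) {
  # Combine the date parts into a single Date column.
  df$DATE <- as.Date(sprintf("%04d-%02d-%02d",
                             as.integer(df$YEAR),
                             as.integer(df$MONTH),
                             as.integer(df$DAY)))
  # Coordinates stored as text are converted to numeric.
  df$LATITUDE <- as.numeric(df$LATITUDE)
  df$LONGITUDE <- as.numeric(df$LONGITUDE)
  # Drop everything up to the first colon, then convert to title case.
  loc <- sub("^[^:]*:[[:space:]]*", "", df$LOCATION_NAME)
  df$LOCATION_NAME <- gsub("\\b([a-z])", "\\U\\1", tolower(loc), perl = TRUE)
  df
}

cleaned <- sketch_clean(raw)
cleaned[, c("DATE", "LATITUDE", "LONGITUDE", "LOCATION_NAME")]
```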
To save a couple of steps, the eq_load_clean_data function loads the quakes data and performs the cleaning steps in a single call:
quakes <- eq_load_clean_data()
tail(quakes)
If you wish to use NOAA Significant Earthquake data updated after April 2019, visit the NOAA Significant Earthquakes site at https://www.ngdc.noaa.gov/nndc/struts/form?t=101650&s=1&d=1 and download the data to your working directory. You can then load and clean the data for analysis using the following sequence of commands (replace filename with the location of your file):
filename <- system.file('extdata', 'earthquakes.txt', package = 'earthquake')
quakes_from_raw <- readr::read_delim(filename, delim = '\t')
quakes_from_raw_clean <- quakes_from_raw %>%
  eq_clean_data() %>%
  eq_location_clean()
tail(quakes_from_raw_clean)
This package includes a timeline capability for visualizing countries' significant earthquakes. The timeline is a ggplot2 geom called geom_timeline. Used with the aesthetics described below, it shows a country's significant earthquakes over time, with points colored by number of deaths and sized by Richter scale magnitude. Dates are plotted on the x-axis, and any number of countries can be stacked on the y-axis.
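To make this visual encoding concrete, the following sketch approximates it with ggplot2's stock geom_point on invented data. It is only an analogy: geom_timeline itself is provided by the earthquake package and may draw things differently.

```r
library(ggplot2)

# Invented rows using the column names that appear in this vignette.
toy <- data.frame(
  DATE = as.Date(c("2001-03-09", "2005-07-15", "2010-11-30", "2003-02-10")),
  COUNTRY = c("COUNTRYA", "COUNTRYA", "COUNTRYA", "COUNTRYB"),
  EQ_PRIMARY = c(5.5, 6.8, 7.2, 6.1),
  TOTAL_DEATHS = c(0, 12, 150, 3)
)

# One horizontal row of points per country: date on x, country on y,
# color mapped to deaths and point size mapped to magnitude.
p <- ggplot(toy, aes(x = DATE, y = COUNTRY,
                     color = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  geom_point(alpha = 0.5) +
  scale_size_continuous(name = "Richter scale value") +
  scale_color_continuous(name = "# of Deaths")

p
```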
There is a second ggplot2 geom in this package, called geom_timeline_label, that labels the strongest earthquakes on the timeline for each country.
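Labeling only the strongest earthquakes per country amounts to a top-n selection within each group. A minimal base R sketch of that selection follows, using invented data and a hypothetical helper named strongest_per_country; how geom_timeline_label performs this step internally is not shown in the source.

```r
# Invented rows using column names that appear in this vignette.
toy <- data.frame(
  COUNTRY = c("COUNTRYA", "COUNTRYA", "COUNTRYA", "COUNTRYB"),
  LOCATION_NAME = c("Town One", "Town Two", "Town Three", "Town Four"),
  EQ_PRIMARY = c(5.5, 6.8, 7.2, 6.1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep at most n_max rows per country,
# ordered from strongest to weakest magnitude.
strongest_per_country <- function(df, n_max) {
  groups <- split(df, df$COUNTRY)
  top <- lapply(groups, function(g) {
    head(g[order(g$EQ_PRIMARY, decreasing = TRUE), , drop = FALSE], n_max)
  })
  out <- do.call(rbind, top)
  rownames(out) <- NULL
  out
}

to_label <- strongest_per_country(toy, n_max = 2)
to_label$LOCATION_NAME
```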
Additionally, a ggplot2 theme, theme_eq, is provided with this package, which will make your charts much more attractive (in our humble opinion!).
Load clean data to be used for all charts below:
quakes <- eq_load_clean_data()
The following example shows how to make a simple timeline for a single country. Notice that the quakes data is first filtered by the COUNTRY and DATE variables. For best results, use the following aes = variable combinations:
x = DATE
y = COUNTRY
color = TOTAL_DEATHS
size = EQ_PRIMARY
quakes %>%
  dplyr::filter(COUNTRY == 'USA') %>%
  dplyr::filter(DATE > '2000-01-01') %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  scale_size_continuous(name = 'Richter scale value') +
  scale_color_continuous(name = '# of Deaths')
To create a basic timeline with two countries, simply change how you filter the COUNTRY value:
quakes %>%
  dplyr::filter(COUNTRY %in% c('USA', 'UK')) %>%
  dplyr::filter(DATE > '2000-01-01') %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  scale_size_continuous(name = 'Richter scale value') +
  scale_color_continuous(name = '# of Deaths')
To add labels to the timelines, use the geom_timeline_label geom with the following aes = variable combinations:
x = DATE
y = COUNTRY
magnitude = EQ_PRIMARY
label = LOCATION_NAME
n_max = <integer, suggest 5>
quakes %>%
  dplyr::filter(COUNTRY %in% c('NEW ZEALAND', 'SOUTH AFRICA')) %>%
  dplyr::filter(DATE > '2000-01-01', DATE < '2015-01-01') %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  geom_timeline_label(aes(x = DATE, y = COUNTRY, magnitude = EQ_PRIMARY,
                          label = LOCATION_NAME, n_max = 5)) +
  scale_size_continuous(name = 'Richter scale value') +
  scale_color_continuous(name = '# of Deaths')
To make the charts look slightly more attractive, use the included ggplot2 theme, theme_eq:
quakes %>%
  dplyr::filter(COUNTRY %in% c('NEW ZEALAND', 'SOUTH AFRICA')) %>%
  dplyr::filter(DATE > '2000-01-01', DATE < '2015-01-01') %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  geom_timeline_label(aes(x = DATE, y = COUNTRY, magnitude = EQ_PRIMARY,
                          label = LOCATION_NAME, n_max = 5)) +
  scale_size_continuous(name = 'Richter scale value') +
  scale_color_continuous(name = '# of Deaths') +
  theme_eq()
All of the above is a lot of typing for a single chart, so this package includes a handy eq_timeline wrapper function that makes a nice labeled (or not!) chart with the default aes values and theme selection.
Here is an example with one country and no labels on the earthquakes.
quakes %>%
  eq_timeline(countries = 'NEW ZEALAND',
              date_min = as.Date('1995-01-01'),
              date_max = as.POSIXct('2015-01-01'),
              label_n = 0)
Here is an example of multiple countries and up to 5 labels per country.
quakes %>%
  eq_timeline(countries = c('NEW ZEALAND', 'HAITI'),
              date_min = '2000-01-01',
              date_max = '2015-01-01',
              label_n = 5)
Finally, this package includes functions for creating an interactive map of the earthquakes using the leaflet package.
Call the eq_map function with cleaned and filtered data, and specify an annotation column with the argument annot_col:
quakes %>%
  dplyr::filter(COUNTRY == 'JAPAN') %>%
  dplyr::filter(lubridate::year(DATE) >= 2000) %>%
  eq_map(annot_col = 'DATE')
If you'd like more useful annotations, first add a column of formatted HTML describing each earthquake with quakes <- quakes %>% dplyr::mutate(popup_text = eq_create_label(.)), and then pass that column to the eq_map function:
quakes %>%
  dplyr::filter(COUNTRY == 'MEXICO') %>%
  dplyr::filter(lubridate::year(DATE) >= 2000) %>%
  dplyr::mutate(popup_text = eq_create_label(.)) %>%
  eq_map(annot_col = 'popup_text')
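The popup column is simply a character vector of HTML, one string per earthquake. As a hypothetical sketch (the exact fields and markup produced by eq_create_label are not given in the source), a made-up helper named sketch_label could assemble such strings from columns used elsewhere in this vignette, skipping fields whose value is missing:

```r
# Invented rows using column names that appear in this vignette.
toy <- data.frame(
  LOCATION_NAME = c("Town One", "Town Two"),
  EQ_PRIMARY = c(6.8, 7.2),
  TOTAL_DEATHS = c(12, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not a function from the earthquake package.
sketch_label <- function(df) {
  # Build one HTML fragment per field; a missing value yields an empty
  # string so that the field is left out of the popup.
  field <- function(title, value) {
    ifelse(is.na(value), "", paste0("<b>", title, ":</b> ", value, "<br/>"))
  }
  paste0(
    field("Location", df$LOCATION_NAME),
    field("Magnitude", df$EQ_PRIMARY),
    field("Total deaths", df$TOTAL_DEATHS)
  )
}

toy$popup_text <- sketch_label(toy)
toy$popup_text
```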