knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7
)
library(earthquakedata)
library(dplyr)
library(ggplot2)
library(leaflet)
The earthquakedata package allows the user to clean and visualise the earthquake data set provided by the U.S. National Oceanic and Atmospheric Administration (NOAA). This data set contains information about 5,933 earthquakes over an approximately 4,000-year time span.
Download here
The earthquakedata package exports six functions for users and has one internal function that is not visible:
eq_clean_data()
geom_timeline()
geom_timeline_label()
theme_timeline()
eq_map()
eq_create_label()
To be able to visualise the data set properly, it is necessary to clean the data first.
This is achieved by:
Creating a DATE column by uniting YEAR, MONTH and DAY and converting it to the Date class
Converting the LATITUDE and LONGITUDE columns to numeric class
Cleaning the LOCATION_NAME column and converting it to title case

clean_data <- readr::read_delim("../earthquakes.tsv.gz", delim = "\t") %>%
  eq_clean_data()
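The cleaning steps listed above can be sketched in base R on a toy data frame. This is a minimal illustration, not the package's implementation: the function name clean_sketch() is hypothetical, the sample rows are invented, and two behaviours are assumptions (a missing month or day falls back to 1, and the country prefix before the colon is stripped from LOCATION_NAME). The sketch only handles years from 1 to 9999 AD.

```r
# Toy rows shaped like the NOAA file; the values are invented for illustration.
raw <- data.frame(
  YEAR = c(2003, 2010),
  MONTH = c(1, NA),
  DAY = c(22, NA),
  LATITUDE = c("18.77", "32.128"),
  LONGITUDE = c("-104.104", "-115.303"),
  LOCATION_NAME = c("MEXICO:  COLIMA", "MEXICO:  BAJA CALIFORNIA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not the package's eq_clean_data().
clean_sketch <- function(df) {
  # Assumption: a missing month or day falls back to 1.
  month <- ifelse(is.na(df$MONTH), 1, df$MONTH)
  day <- ifelse(is.na(df$DAY), 1, df$DAY)

  # Unite YEAR, MONTH and DAY into an ISO string and convert to Date.
  iso <- sprintf("%04d-%02d-%02d",
                 as.integer(df$YEAR), as.integer(month), as.integer(day))
  df$DATE <- as.Date(iso)

  # Coordinates arrive as text and are converted to numeric.
  df$LATITUDE <- as.numeric(df$LATITUDE)
  df$LONGITUDE <- as.numeric(df$LONGITUDE)

  # Assumption: drop everything up to the first colon, then title-case.
  name <- sub("^[^:]*:[[:space:]]*", "", df$LOCATION_NAME)
  df$LOCATION_NAME <- gsub("\\b([a-z])", "\\U\\1", tolower(name), perl = TRUE)

  df
}

cleaned <- clean_sketch(raw)
```

Dates before year 1 cannot be built this way, so a real cleaner for the full 4,000-year data set would need separate handling for those rows.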
The easiest way to display the earthquake data for multiple countries over a large time period is by using a timeline.
To assist the user with this, the geom_timeline() function can be used in conjunction with the ggplot2 package. The geom_timeline() function uses the following aesthetics, shown here with the columns mapped to them in the example; the user can map other columns if required:
x = DATE
y = COUNTRY
size = EQ_PRIMARY
colour = DEATHS
readr::read_delim("../earthquakes.tsv.gz", delim = "\t") %>%
  eq_clean_data() %>%
  dplyr::filter(COUNTRY == "USA" & lubridate::year(DATE) >= 2000) %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, size = EQ_PRIMARY, colour = DEATHS))
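For readers without the package installed, the role of the four aesthetics can be approximated with plain ggplot2. The sketch below substitutes geom_point() for geom_timeline(), so it is only a rough stand-in and not the package's geom; the sample rows are invented for illustration.

```r
library(ggplot2)

# Invented rows for illustration; column names follow the cleaned data set.
quakes <- data.frame(
  DATE = as.Date(c("2003-01-22", "2010-04-04", "2014-08-24")),
  COUNTRY = c("MEXICO", "MEXICO", "USA"),
  EQ_PRIMARY = c(7.5, 7.2, 6.0),
  DEATHS = c(29, 2, 1),
  stringsAsFactors = FALSE
)

# geom_point() stands in for geom_timeline(): one row of points per country,
# sized by magnitude and coloured by deaths. alpha keeps overlaps visible.
p <- ggplot(quakes, aes(x = DATE, y = COUNTRY, size = EQ_PRIMARY, colour = DEATHS)) +
  geom_point(alpha = 0.5)
```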
The above sample code will display a very basic graph using the default ggplot2 theme. To allow the user to see some more information and to make the graph a bit more pleasing to the eye, the geom_timeline_label() and theme_timeline() functions were added to the package.
The geom_timeline_label() function adds a vertical line with a text label to identify specific earthquakes easily. The n_max parameter manages the number of labels that are plotted: only the n_max earthquake events with the highest magnitude are labelled.
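The n_max selection described above amounts to ranking events by magnitude and keeping the first few. A minimal base R sketch follows; the function name top_n_by_magnitude() is hypothetical, the sample rows are invented, and dropping rows with a missing magnitude is an assumption rather than documented package behaviour.

```r
# Invented rows for illustration.
quakes <- data.frame(
  LOCATION_NAME = c("A", "B", "C", "D"),
  EQ_PRIMARY = c(5.1, 7.6, NA, 6.3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the n_max rows with the highest magnitude.
top_n_by_magnitude <- function(df, n_max) {
  # na.last = NA drops rows whose magnitude is missing (an assumption).
  idx <- order(df$EQ_PRIMARY, decreasing = TRUE, na.last = NA)
  head(df[idx, , drop = FALSE], n_max)
}

labelled <- top_n_by_magnitude(quakes, n_max = 2)
```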
The theme_timeline() function is a modified version of theme_classic() and does the following:
Below is an example that reads the data set, cleans it, filters it to specific countries and dates, plots it, labels the top 5 events and uses the correct theme. Please note that the labels for the legend are added as well along with a title for the plot:
readr::read_delim("../earthquakes.tsv.gz", delim = "\t") %>%
  eq_clean_data() %>%
  # %in% matches either country; == would recycle the vector and drop rows
  dplyr::filter(COUNTRY %in% c("USA", "MEXICO") & lubridate::year(DATE) >= 2000) %>%
  ggplot() +
  geom_timeline(aes(x = DATE, y = COUNTRY, size = EQ_PRIMARY, colour = DEATHS)) +
  geom_timeline_label(aes(x = DATE, y = COUNTRY, label = LOCATION_NAME, size = EQ_PRIMARY), n_max = 5) +
  ggtitle("Earthquake Timeline") +
  theme_timeline() +
  labs(size = "Richter Scale value:", colour = "# of Deaths:")
Using the leaflet package, it is possible to plot the earthquake information on an interactive map with the eq_map() function. Each earthquake is indicated by a blue circle whose size is relative to the magnitude of the earthquake it represents. The annot_col parameter of eq_map() names the column in the data set that is displayed when the user clicks on a specific earthquake on the map. The default value is DATE.
readr::read_delim("earthquakes.tsv.gz", delim = "\t") %>%
  eq_clean_data() %>%
  dplyr::filter(COUNTRY == "MEXICO" & lubridate::year(DATE) >= 2000) %>%
  eq_map(annot_col = "DATE")
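A map of this kind can be sketched directly with leaflet. The function below, map_sketch(), is a hypothetical stand-in and not the package's eq_map(): it assumes circle markers with the radius taken from EQ_PRIMARY and the pop-up text taken from the column named by annot_col. The sample rows are invented for illustration.

```r
library(leaflet)

# Invented rows for illustration; column names follow the cleaned data set.
quakes <- data.frame(
  LONGITUDE = c(-104.104, -115.303),
  LATITUDE = c(18.77, 32.128),
  EQ_PRIMARY = c(7.5, 7.2),
  DATE = as.Date(c("2003-01-22", "2010-04-04"))
)

# Hypothetical stand-in for eq_map().
map_sketch <- function(df, annot_col = "DATE") {
  leaflet(df) %>%
    addTiles() %>%
    addCircleMarkers(
      lng = ~LONGITUDE,
      lat = ~LATITUDE,
      radius = ~EQ_PRIMARY,                      # circle size follows magnitude
      weight = 1,
      popup = as.character(df[[annot_col]])      # text shown on click
    )
}

m <- map_sketch(quakes)
```

Printing m in an interactive session opens the map; the default marker colour in leaflet is blue, which matches the description above.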
The eq_create_label() function is used to provide even more information. It takes the data set as an argument and creates an HTML label that can be displayed on the map as a pop-up. The label consists of the Location, Magnitude and Total deaths. If one of these fields is missing a value, that field is ignored when the label is built.
readr::read_delim("earthquakes.tsv.gz", delim = "\t") %>%
  eq_clean_data() %>%
  dplyr::filter(COUNTRY == "MEXICO" & lubridate::year(DATE) >= 2000) %>%
  dplyr::mutate(popup_text = eq_create_label(.)) %>%
  eq_map(annot_col = "popup_text")
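The label-building rule (skip any field whose value is missing) could look roughly like the base R sketch below. The function name label_sketch(), the exact HTML markup and the TOTAL_DEATHS column name are assumptions made for illustration; the package's eq_create_label() may differ. The sample rows are invented.

```r
# Invented rows for illustration; TOTAL_DEATHS is an assumed column name.
quakes <- data.frame(
  LOCATION_NAME = c("Colima", "Baja California"),
  EQ_PRIMARY = c(7.5, 7.2),
  TOTAL_DEATHS = c(29, NA),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for eq_create_label().
label_sketch <- function(df) {
  # Build one "<b>Title:</b> value<br>" fragment per row, or "" when missing.
  field <- function(title, value) {
    ifelse(is.na(value), "", paste0("<b>", title, ":</b> ", value, "<br>"))
  }
  paste0(
    field("Location", df$LOCATION_NAME),
    field("Magnitude", df$EQ_PRIMARY),
    field("Total deaths", df$TOTAL_DEATHS)
  )
}

labels <- label_sketch(quakes)
```

Because every step is vectorised, the function returns one label per row and can be used inside dplyr::mutate() in the same way as the example above.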