knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(earthquake)
library(dplyr)
library(ggplot2)
This package provides tools for processing and visualizing the U.S. National Oceanic and Atmospheric Administration (NOAA) dataset of significant earthquakes around the world. The dataset contains information about 5,933 earthquakes over an approximately 4,000-year time span. Much of that information is not readily accessible without detailed knowledge of how the dataset is structured. Our goal is to make it easier for others to put that information to use.
The following functions enable you to read the source file and clean it in preparation for visualization.
data("earthquakes")
Use eq_clean_data to clean the data frame:
earthquakes <- eq_clean_data(earthquakes)
Use the eq_location_clean function to edit the location-related variables in the data frame:
earthquakes <- eq_location_clean(earthquakes)
Use eq_select_data to keep only a subset of the variables in the data frame:
earthquakes <- eq_select_data(earthquakes)
Use the eq_count_events function to identify the countries with events in a specified date range. eq_count_events returns a data frame listing the country and count of events in descending order of count.
events <- eq_count_events(earthquakes,
                          minimum_date = "2000-01-01",
                          maximum_date = "2018-12-31")
knitr::kable(head(events, 10),
             caption = "Top 10 countries by number of events in descending order.",
             col.names = c("Country", "Number of Events"),
             align = "lr")
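To make the shape of that result concrete, here is a minimal base R sketch of counting events per country within a date range. It is not the package's implementation: the toy data frame, its made-up dates, and the column name N are assumptions for illustration only.

```r
# Toy stand-in for the cleaned data frame (values are made up for illustration).
toy <- data.frame(
  COUNTRY = c("Japan", "Japan", "Indonesia", "Russia", "Japan", "Indonesia"),
  DATE = as.Date(c("2001-02-03", "2011-05-06", "2004-07-08",
                   "1995-09-10", "2016-11-12", "2018-01-02")),
  stringsAsFactors = FALSE
)

# Keep events whose dates fall inside the range, boundaries included.
in_range <- toy$DATE >= as.Date("2000-01-01") &
  toy$DATE <= as.Date("2018-12-31")

# Count events per country; N is a hypothetical name for the count column.
counts <- as.data.frame(table(COUNTRY = toy$COUNTRY[in_range]),
                        responseName = "N", stringsAsFactors = FALSE)

# Sort so the country with the most events comes first.
counts <- counts[order(-counts$N), ]
print(counts)
```

The Russia row drops out here because its only toy event falls before the minimum date.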
Use eq_filter_data to subset your earthquakes data frame to the events to be visualized. eq_filter_data accepts four arguments: a data frame containing the source data, a character vector of country names, a minimum date, and a maximum date. The function returns all events dated between the minimum and maximum dates, inclusive.
quakes1 <- eq_filter_data(earthquakes,
                          countries = c("Indonesia", "Japan", "Russia"),
                          minimum_date = "2000-01-01",
                          maximum_date = "2018-12-31")
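The inclusive date boundaries can be illustrated with a short base R sketch. This only approximates the behavior described above and is not the package's code; the toy rows are invented so that some dates sit exactly on, and some just outside, the boundaries.

```r
# Toy events (made up): two sit exactly on the range boundaries.
toy <- data.frame(
  COUNTRY = c("Japan", "Mexico", "Russia", "Japan", "Indonesia"),
  DATE = as.Date(c("2000-01-01", "2010-04-04", "2018-12-31",
                   "1999-12-31", "2019-01-01")),
  stringsAsFactors = FALSE
)

countries <- c("Indonesia", "Japan", "Russia")
min_date <- as.Date("2000-01-01")
max_date <- as.Date("2018-12-31")

# >= and <= keep events dated exactly on the minimum or maximum date.
keep <- toy$COUNTRY %in% countries &
  toy$DATE >= min_date &
  toy$DATE <= max_date
filtered <- toy[keep, ]
print(filtered)
```

Only the two boundary-dated rows survive: the Mexico row fails the country test, and the remaining rows fall one day outside the range.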
To plot events for all countries on a single timeline, use the geom_timeline function in conjunction with ggplot. The x aesthetic holds the DATE variable, and xmin and xmax set the minimum and maximum dates for the timeline. Map the color and size aesthetics to the TOTAL_DEATHS and EQ_PRIMARY variables, respectively, to color each point by number of deaths and size it by the magnitude of the event.
ggplot(data = quakes1,
       aes(x = DATE, color = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  geom_timeline(xmin = "2000-01-01", xmax = "2018-12-31") +
  labs(title = "NOAA Significant Earthquakes",
       subtitle = "Plot of events for Indonesia, Japan and Russia combined.",
       x = "Timeline", y = "",
       color = "Total Deaths", size = "Magnitude")
By passing the COUNTRY variable to the y aesthetic, you can plot separate timelines for each country within the plot.
ggplot(data = quakes1,
       aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  geom_timeline(xmin = "2000-01-01", xmax = "2018-12-31") +
  labs(title = "NOAA Significant Earthquakes",
       subtitle = "Plot of events for Indonesia, Japan and Russia on individual timelines.",
       x = "Timeline", y = "",
       color = "Total Deaths", size = "Magnitude")
Use geom_timeline_label to add labels for the top-ranked events on a timeline. Set n_max to the number of ranked events to label, then use the label aesthetic to supply the label text and the magnitude aesthetic to select the variable used for ranking. ggplot will add a vertical line and a label for each selected event.
ggplot(data = quakes1,
       aes(x = DATE, color = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  geom_timeline(xmin = "2000-01-01", xmax = "2018-12-31") +
  geom_timeline_label(n_max = 5,
                      aes(label = LOCATION_NAME, magnitude = EQ_PRIMARY)) +
  labs(title = "NOAA Significant Earthquakes",
       subtitle = "Plot of events for Indonesia, Japan and Russia combined.",
       x = "Timeline", y = "",
       color = "Total Deaths", size = "Magnitude")
Labeling works for separate timelines as well. Note that, in this case, it identifies the top n_max events across all events in the data frame (as opposed to finding the top n_max events for each country) and labels them.
ggplot(data = quakes1,
       aes(x = DATE, y = COUNTRY, color = TOTAL_DEATHS, size = EQ_PRIMARY)) +
  geom_timeline(xmin = "2000-01-01", xmax = "2018-12-31") +
  geom_timeline_label(n_max = 5,
                      aes(label = LOCATION_NAME, magnitude = EQ_PRIMARY)) +
  labs(title = "NOAA Significant Earthquakes",
       subtitle = "Plot of events for Indonesia, Japan and Russia on individual timelines.",
       x = "Timeline", y = "",
       color = "Total Deaths", size = "Magnitude")
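The difference between ranking across all events and ranking within each country can be seen in a small base R sketch. The toy magnitudes are invented and the code is not taken from the package. It only contrasts the two selection rules, of which the first corresponds to the behavior described above.

```r
# Toy events (made-up magnitudes) for three countries.
toy <- data.frame(
  COUNTRY = c("Japan", "Japan", "Japan", "Indonesia", "Indonesia", "Russia"),
  EQ_PRIMARY = c(9.1, 7.3, 6.6, 9.0, 7.5, 8.3),
  stringsAsFactors = FALSE
)
n_max <- 2

# Ranking across the whole data frame: the n_max largest events overall.
top_overall <- head(toy[order(-toy$EQ_PRIMARY), ], n_max)

# Ranking within each country (shown for contrast only):
# up to n_max events per country.
top_per_country <- do.call(rbind, lapply(
  split(toy, toy$COUNTRY),
  function(d) head(d[order(-d$EQ_PRIMARY), ], n_max)
))

print(top_overall)
print(top_per_country)
```

With overall ranking, a country whose strongest event is outside the top n_max (Russia in this toy data) gets no label on its timeline.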
Use eq_filter_data in conjunction with the eq_create_label and eq_map functions to generate an interactive map of historical earthquakes for a given geographical location. eq_map plots events as circles on a leaflet map with radii of the circles proportional to the magnitude of the earthquakes. eq_create_label generates optional labels for each plotted event so that, when you click the event on a map, a popup label displays information about the individual event.
eq_filter_data(earthquakes,
               countries = c("Mexico"),
               minimum_date = "1980-01-01",
               maximum_date = "2018-12-31") %>%
  mutate(POPUP_TEXT = eq_create_label(.)) %>%
  eq_map(annot_col = "POPUP_TEXT")
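For readers who want to build their own popup text instead of using eq_create_label, the following base R sketch shows one possible approach. The helper make_popup, the HTML layout, and the choice to skip missing values are all assumptions for illustration and do not describe what eq_create_label actually produces. The input values are made up.

```r
# Hypothetical helper: builds one HTML popup string per event and
# leaves out any field whose value is missing.
make_popup <- function(location, magnitude, deaths) {
  part <- function(title, value) {
    ifelse(is.na(value), "",
           paste0("<b>", title, ":</b> ", value, "<br/>"))
  }
  paste0(part("Location", location),
         part("Magnitude", magnitude),
         part("Total deaths", deaths))
}

# Made-up example values; the first event has no death count recorded.
labels <- make_popup(location = c("Place A", "Place B"),
                     magnitude = c(7.4, 8.2),
                     deaths = c(NA, 98))
print(labels)
```

A character vector built this way could be stored in a column and passed to eq_map through annot_col, in the same way POPUP_TEXT is used above.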