knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  echo = TRUE,
  results = "markup",
  fig.align = "center",
  fig.height = 3,
  fig.width = 6
)

Package overview

This is a Vignette describing the functions in the NOAAsignif package. This also serves as an capstone project for Coursera's "Mastering Software Development in R" specialization. This package includes functions that download and clean up the significant earthquake dataset from US National Centers for Environmental Information website. The Significant Earthquake Database contains information on destructive earthquakes from 2150 B.C. to the present that meet at least one of the following criteria: Moderate damage (approximately $1 million or more), 10 or more deaths, Magnitude 7.5 or greater, Modified Mercalli Intensity X or greater, or the earthquake generated a tsunami. This package provides functions to visualize the significant earthquakes in a timeline, and label them on an interactive map.

library(NOAAsignif)
library(dplyr)
library(ggplot2)
library(leaflet)
library(grid)

Download and tidy the dataset

The funcion eq_download_data is used to download the dataset from NOAA website. The function eq_clean_data() cleans up the data to create a DATE column by uniting the year, month, day and converting it to the Date class, a LATITUDE and a LONGITUDE columns converted to nmeric class. The funcion eq_convert_date is an assistant function to help conver the date. The function eq_location_clean cleans the LOCATION_NAME column by stripping out the country name (including the colon) and converts names to title case (as opposed to all caps). This will be needed later for annotating visualizations.

tidy_data <- eq_download_data() %>%
    eq_clean_data() %>%
    eq_location_clean()
dim(tidy_data)

Visulize the timeline of earthquakes

GeomTimeline is the Geom created for function geom_timeline(), which takes the tidy data filtered for specific date ranges, and plot them on a timeline as multiple circles. The size of the circle corresponds to earthquakes' magnitude, and the color of the circle corresponds to number of deaths in the earthquakes. GeomTimelineLabel is the Geom created for function geom_timeline_label(), which in addition to geom_timeline(), labels the location of nmax number of biggest earthquakes. nmax can be specified by the user.

# Plot significant earthquakes occurred in the year 1999
xmin <- as.Date("1999-01-01")
xmax <- as.Date("2000-01-01")
sample_data <- filter(tidy_data, DATE >= xmin & DATE <= xmax)
ggplot(sample_data,
       aes(x = DATE, size = EQ_PRIMARY, fill = DEATHS, color = DEATHS)) +
    geom_timeline() +
    theme_minimal()

# Plot significant earthquakes occurred in the 3rd quarter of year 2000 and label the 4 biggest ones
xmin <- as.Date("2000-07-01")
xmax <- as.Date("2000-10-01")
sample_data <- filter(tidy_data, DATE >= xmin & DATE <= xmax)
ggplot(sample_data,
       aes(x = DATE, label = LOCATION, size = EQ_PRIMARY, fill = DEATHS,
           color = DEATHS, n_max = 4)) +
    geom_timeline_label() +
    theme_minimal()

Mapping earthquakes to an interactive map with labels

The function eq_map() takes an argument data containing the filtered data frame with earthquakes to visualize. The function maps the epicenters (LATITUDE/LONGITUDE) and annotates each point with in pop up window containing annotation data stored in a column of the data frame. The user is able to choose which column is used for the annotation in the pop-up with a function argument named annot_col. The function eq_create_label() takes the dataset as an argument and creates an HTML label that can be used as the annotation text in the leaflet map. This function puts together a character string for each earthquake that will show the cleaned location, the magnitude (EQ_PRIMARY), and the total number of deaths (TOTAL_DEATHS), with boldface labels for each ("Location", "Total deaths", and "Magnitude"). If an earthquake is missing values for any of these, both the label and the value will be skipped for that element of the tag.

tidy_data %>%
    filter(COUNTRY == "Mexico:" & lubridate::year(DATE) >= 2000) %>%
    eq_create_label() %>%
    eq_map(annot_col = "popup_text")


perfectslumber/NOAAsignif documentation built on May 23, 2019, 5:04 a.m.