library(dplyr)
library(readr)
library(rEarthQuakes)

Mastering Software Development in R Capstone

The following Vignette is intended to explain the functions found in this package.

This package is centered around a dataset obtained from the U.S. National Oceanographic and Atmospheric Administration (NOAA) on significant earthquakes around the world. The NOAA dataset contains information about 5,933 earthquakes over an approximately 4,000 year time span.

The software package that can be used to work with the NOAA Significant Earthquakes dataset. This dataset has a substantial amount of information embedded in it that may not be immediately accessible to people without knowledge of the intimate details of the dataset or of R. It provides tools for processing and visualizing the data so that others may extract some use out of the information embedded within.

Dataset

The dataset for which this package is designed can be obtained from the NOAA Significant Earthquake Database. Alternatively, the following code snippet will read the data from the site directly.

data <- readr::read_delim("https://www.ngdc.noaa.gov/nndc/struts/results?type_0=Exact&query_0=$ID&t=101650&s=13&d=189&dfn=signif.txt", delim = "\t")

Examples

eq_clean_data()

This function cleans the NOAA dataset to allow correct processing. The raw NOAA dataset is in tab-delimited format and can be read in using the read_delim() function in the readr package.

The function eq_clean_data() takes a raw NOAA data frame and returns a clean data frame. The clean data frame has the following:

  1. A date column created by uniting the year, month, day and converting it to the Date class
  2. LATITUDE and LONGITUDE columns converted to numeric class
  3. Location names cleaned with the (internal) eq_location_clean() function
data %>%
        eq_clean_data() %>%
        dplyr::filter(COUNTRY == "MEXICO" & YEAR >= 2000) %>% 
        dplyr::mutate(popup_text = eq_create_label(.)) %>%
        eq_map(annotation = "popup_text")

eq_map()

The function eq_map() takes a cleaned NOAA data frame with earthquakes to visualize. It generates a leaflet map of the epicenters (LATITUDE/LONGITUDE) and annotates each earthquake with a popup.

data %>%
        eq_clean_data() %>%
        dplyr::filter(COUNTRY == "BELGIUM") %>%
        eq_map(annotation = "DATE")

eq_create_label()

The function eq_create_label() creates an annotation for each earthquake showing the location, magnitude (EQ_PRIMARY), and total number of deaths (TOTAL_DEATHS). Missing values will be ignored during label generation.

data %>%
        eq_clean_data() %>% 
        dplyr::filter(COUNTRY %in% c("USA", "CANADA") & YEAR >= 2000) %>%
        dplyr::mutate(popup_text = eq_create_label(.)) %>%
        eq_map(annotation = "popup_text")

geom_timeline()

The function geom_timeline() plots earthquakes from a cleaned NOAA data frame on a timeline and optionally grouped by country, colored by number of casualties, and sized by scale.

data %>%
        eq_clean_data() %>%
        filter(COUNTRY %in% c("USA", "MEXICO"), YEAR > 2000) %>%
        ggplot(aes(x = DATE, y = COUNTRY,
                   color = as.numeric(TOTAL_DEATHS), size = as.numeric(EQ_PRIMARY) )) +
        geom_timeline() +
        theme_timeline() +
        labs(size = "Richter scale value", color = "# deaths")

geom_timeline_label()

This function plots timeline labels of earthquakes based on cleaned NOAA data. It should be used in combination with \code{geom_timeline}. The required aesthetics for this geom is a \code{label} that should describe each data point.

data %>%
        eq_clean_data() %>%
        filter(COUNTRY %in% c("USA", "MEXICO"), YEAR > 2000) %>%
        ggplot(aes(x = DATE, y = COUNTRY,
                   color = as.numeric(TOTAL_DEATHS), size = as.numeric(EQ_PRIMARY) )) +
        geom_timeline() +
        geom_timeline_label(aes(label = LOCATION_NAME), n_max = 5) +
        labs(size = "Richter scale value", color = "# deaths")

theme_timeline()

This function can be used as a theme to style \code{\link{geom_timeline}} plots.

data %>%
        eq_clean_data() %>%
        filter(COUNTRY %in% c("USA", "MEXICO"), YEAR > 2000) %>%
        ggplot(aes(x = DATE, y = COUNTRY,
                   color = as.numeric(TOTAL_DEATHS), size = as.numeric(EQ_PRIMARY) )) +
        geom_timeline() +
        theme_timeline() +
        geom_timeline_label(aes(label = LOCATION_NAME), n_max = 5) +
        labs(size = "Richter scale value", color = "# deaths")


blauwers/rEarthQuakes documentation built on May 7, 2019, 2:54 a.m.