This package can be used to work with the NOAA (National Oceanic and Atmospheric Administration) Significant Earthquakes dataset. This dataset contains information about 5,956 earthquakes over an approximately 4,000-year time span.

More specifically, this package implements:

Reading the Data

After downloading the data from the NOAA website, we recommend using the readr package to load the dataset into R:

eq_data <- readr::read_delim('signif.txt', delim = '\t')
dim(eq_data)

Cleaning the Data

library(magrittr)
library(EarthquakesNOAA)

The function eq_clean_data takes the raw NOAA data frame and returns a clean data frame:

clean_eq_data <- eq_data %>% 
    eq_clean_data()

The function eq_location_clean cleans the LOCATION_NAME column by stripping out the country name (including the colon) and converting names to title case (as opposed to all caps). This will be needed later for annotating visualizations. The function is applied to the raw data and produces a cleaned-up version of the LOCATION_NAME column.

clean_loc_data <- eq_data %>% 
    eq_location_clean()

This table shows examples of LOCATION_NAME before and after applying the eq_location_clean function:

data.frame(LOC_BEFORE = head(eq_data$LOCATION_NAME),
           LOC_AFTER = head(clean_loc_data$LOCATION_NAME))
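
The cleaning rule described above for LOCATION_NAME can be sketched in base R. This is only an illustration of the rule, not the package's own code: the helper name clean_location_sketch is hypothetical, the input strings are illustrative values in the all-caps "COUNTRY: PLACE" format, and the sketch assumes that only the text up to the first colon is the country name.

```r
# Hypothetical helper, not the package's eq_location_clean().
clean_location_sketch <- function(x) {
  # Drop everything up to and including the first colon, plus any spaces after it.
  # Strings without a colon are left unchanged by this step.
  no_country <- sub("^[^:]*:[[:space:]]*", "", x)
  # Lower-case everything, then upper-case the first letter of each word.
  gsub("\\b([a-z])", "\\U\\1", tolower(no_country), perl = TRUE)
}

# Illustrative inputs only.
cleaned <- clean_location_sketch(c("CHILE: SANTIAGO", "USA: SAN FRANCISCO"))
print(cleaned)
```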

Geom TimeLine

This geom plots a time line of earthquakes ranging from xmin to xmax dates, with a point for each earthquake. Optional aesthetics include color, size, and alpha (for transparency). The x aesthetic is a date. The optional y aesthetic is a factor indicating some stratification, in which case a separate time line is plotted for each level of the factor (e.g. country).

eq_data %>%
    eq_clean_data() %>%
    dplyr::filter(lubridate::year(DATE) > 2010 & COUNTRY %in% c('CHILE', 'USA')) %>%
    ggplot2::ggplot(ggplot2::aes(x = DATE, y = COUNTRY,
                                 colour = DEATHS, size = EQ_PRIMARY)) +
    geom_timeline(alpha = 0.5) +
    theme_timeline
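
For comparison, a similar picture can be roughly approximated with stock ggplot2 layers. This sketch is not how geom_timeline is implemented; it only shows how the aesthetics described above (a date on x, a factor on y, colour, size, and alpha) map onto points. The toy data frame and all of its values are invented for illustration and are not taken from the NOAA dataset.

```r
library(ggplot2)

# Made-up values for illustration only.
toy <- data.frame(
  DATE = as.Date(c("2011-03-01", "2012-07-15", "2013-11-30", "2014-05-20")),
  COUNTRY = factor(c("A", "A", "B", "B")),
  EQ_PRIMARY = c(5.5, 6.8, 7.1, 6.0),
  DEATHS = c(0, 12, 3, 0)
)

# One row of points per level of COUNTRY, with colour and size mapped
# the same way as in the geom_timeline example.
p <- ggplot(toy, aes(x = DATE, y = COUNTRY,
                     colour = DEATHS, size = EQ_PRIMARY)) +
  geom_point(alpha = 0.5)

print(p)
```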

Geom TimeLine with Labels

This geom is used in combination with geom_timeline in order to add annotations to the earthquake data. It adds a vertical line to each data point, with a text annotation (e.g. the location of the earthquake) attached to each line. There is an option to subset the data to n_max earthquakes, in which case the n_max largest earthquakes (by magnitude) are kept. The aesthetics are x, which is the date of the earthquake, and label, which takes the name of the column from which the annotations are obtained.

eq_data %>%
    eq_clean_data() %>%
    dplyr::filter(lubridate::year(DATE) > 2010 & COUNTRY %in% c('CHILE', 'USA')) %>%
    eq_location_clean() %>%
    ggplot2::ggplot(ggplot2::aes(x = DATE, y = COUNTRY,
                                 colour = DEATHS, size = EQ_PRIMARY)) +
    geom_timeline(alpha = 0.5) +
    geom_timelinelabel(ggplot2::aes(label = LOCATION_NAME, n_max = 3)) +
    theme_timeline
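
The n_max selection described above can be sketched in base R. This is an assumption about the behaviour rather than the package's code: the helper name top_n_by_magnitude is hypothetical, the sketch ranks all rows together (the geom may instead rank within each level of y), rows with a missing magnitude are dropped, and the toy magnitudes are invented.

```r
# Hypothetical helper: keep the n_max rows with the largest EQ_PRIMARY.
top_n_by_magnitude <- function(data, n_max = 3) {
  # Rows without a magnitude cannot be ranked, so drop them first.
  known <- data[!is.na(data$EQ_PRIMARY), , drop = FALSE]
  # Sort by magnitude, largest first, and keep the first n_max rows.
  ranked <- known[order(known$EQ_PRIMARY, decreasing = TRUE), , drop = FALSE]
  head(ranked, n_max)
}

# Made-up values for illustration only.
toy <- data.frame(
  LOCATION_NAME = c("Place A", "Place B", "Place C", "Place D", "Place E"),
  EQ_PRIMARY = c(5.1, 7.2, NA, 6.3, 6.8),
  stringsAsFactors = FALSE
)

largest <- top_n_by_magnitude(toy, n_max = 3)
print(largest)
```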

Leaflet Map

The function eq_map takes an argument data containing the filtered data frame of earthquakes to visualize. It maps the epicenters (LATITUDE/LONGITUDE) and annotates each point with a pop-up window containing annotation data stored in a column of the data frame. The user chooses which column is used for the pop-up annotation with the function argument annot_col. Each earthquake is shown as a circle whose radius is proportional to the earthquake's magnitude (EQ_PRIMARY).

eq_data %>%
    eq_clean_data() %>%
    dplyr::filter(COUNTRY == 'MEXICO' & lubridate::year(DATE) >= 2000) %>%
    eq_map(annot_col = 'DATE')
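
A map of the kind described above can be sketched directly with the leaflet package. The function eq_map_sketch is a hypothetical stand-in, and the package's own eq_map may differ in its styling and options. The sketch assumes numeric LONGITUDE, LATITUDE, and EQ_PRIMARY columns; the toy coordinates and magnitudes are invented for illustration.

```r
library(leaflet)

# Hypothetical stand-in for eq_map(); not the package's implementation.
eq_map_sketch <- function(data, annot_col) {
  m <- leaflet(data)
  m <- addTiles(m)  # default base map tiles
  addCircleMarkers(
    m,
    lng = data$LONGITUDE,
    lat = data$LATITUDE,
    radius = data$EQ_PRIMARY,                # radius scales with magnitude
    popup = as.character(data[[annot_col]])  # column chosen by the caller
  )
}

# Made-up values for illustration only.
toy <- data.frame(
  LONGITUDE = c(-99.1, -96.7),
  LATITUDE = c(19.4, 17.1),
  EQ_PRIMARY = c(6.1, 7.4),
  DATE = as.Date(c("2001-01-01", "2002-02-02"))
)

m <- eq_map_sketch(toy, annot_col = "DATE")
```

Printing m in an interactive session renders the map in the viewer or browser.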

Leaflet Map with a More Useful Pop-up Label

The function eq_create_label takes the dataset as an argument and creates an HTML label that can be used as the annotation text in the leaflet map. This function puts together a character string for each earthquake that will show the cleaned location (as cleaned by the eq_location_clean() function), the magnitude (EQ_PRIMARY), and the total number of deaths (TOTAL_DEATHS). If an earthquake is missing values for any of these, both the label and the value are skipped for that element of the tag.

eq_data %>%
    eq_clean_data() %>%
    dplyr::filter(COUNTRY == 'MEXICO' & lubridate::year(DATE) >= 2000) %>%
    dplyr::mutate(popup_text = eq_create_label(.)) %>%
    eq_map(annot_col = 'popup_text')
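
The label-building rule described above can be sketched in base R. This is an illustration under stated assumptions, not the package's eq_create_label: the helper names label_part and create_label_sketch, the label wording, and the exact HTML tags are all hypothetical, and the toy values are invented.

```r
# Hypothetical helper: one "<b>Label:</b> value<br/>" fragment per row,
# or an empty string where the value is missing.
label_part <- function(label, value) {
  ifelse(is.na(value), "", paste0("<b>", label, ":</b> ", value, "<br/>"))
}

# Hypothetical stand-in for eq_create_label(): one HTML string per earthquake.
create_label_sketch <- function(data) {
  paste0(
    label_part("Location", data$LOCATION_NAME),
    label_part("Magnitude", data$EQ_PRIMARY),
    label_part("Total deaths", data$TOTAL_DEATHS)
  )
}

# Made-up values for illustration only.
toy <- data.frame(
  LOCATION_NAME = c("Place A", "Place B"),
  EQ_PRIMARY = c(7.4, NA),
  TOTAL_DEATHS = c(NA, 2),
  stringsAsFactors = FALSE
)

labels <- create_label_sketch(toy)
print(labels)
```

In this sketch a missing value removes both the label and the value for that element, so the second toy row has no magnitude entry at all.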


blnash508/EarthquakesNOAA documentation built on May 14, 2019, 5:25 p.m.