knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Intro

The package quakeR is designed for working with the NOAA Significant Earthquakes dataset. The dataset has a substantial amount of information that is not immediately accessible to people without knowledge of the intimate details of the dataset or of R. This package provides the tools for processing and visualizing the data so that others may extract some use out of the information embedded within.

Installaton

library(quakeR)

The Data

This project is centered around a dataset obtained from the U.S. National Oceanographic and Atmospheric Administration (NOAA) on significant earthquakes around the world. This dataset contains information about 5,933 earthquakes over an approximately 4,000 year time span.

The data set can be downloaded from this link. Alternatively, a copy of the dataset is included with this package, and can be loaded in R with the following code:

filename <- system.file("extdata/earthquakes.tsv.gz", package = "quakeR")
raw_data <- readr::read_delim(filename, delim = "\t")

Functions

Clean Data

Takes raw NOAA data frame and returns a clean data frame. The clean data frame has the following: A date column created by uniting the year, month, day and converting it to the Date class LATITUDE and LONGITUDE columns converted to numeric class.

library(dplyr, warn.conflicts = FALSE)

clean_data <- eq_clean_data(raw_data)
clean_data %>% 
  arrange(desc(DATE)) %>% 
  select(DATE, COUNTRY, LOCATION_NAME, LATITUDE, LONGITUDE, DEATHS) %>% 
  head()

Plot Earthquake Timelines

A geom for ggplot2 called geom_timeline() is used for plotting a time line of earthquakes ranging from xmin to xmax dates with a point for each earthquake. Optional aesthetics include color, size, and alpha (for transparency). The x aesthetic is a date and an optional y aesthetic is a factor indicating some stratification in which case multiple time lines will be plotted for each level of the factor (e.g. country).

library(ggplot2)
library(stringr)

time_data <- 
  clean_data %>% 
  mutate_at(vars(DEATHS, EQ_PRIMARY), as.numeric) %>% 
  filter(str_detect(str_to_lower(COUNTRY), "china|usa$|pakistan")) %>%
  filter(DATE >= "2000-01-01")

time_data %>% 
  ggplot(aes(x = DATE,
             y = COUNTRY,
             size = EQ_PRIMARY,
             fill = DEATHS))+
  geom_timeline()+
  theme_timeline()

Add labels to largest earthquakes.

time_data %>% 
    ggplot(aes(x = DATE, y = COUNTRY))+
    geom_timeline(aes(size = DEATHS, fill = EQ_PRIMARY))+
    geom_timeline_label(aes(label = LOCATION_NAME, 
                        n_max_var = DEATHS), 
                        n_max = 5)+
  theme_timeline()

Earthquake Maps

The function called eq_map() takes an argument data containing the filtered data frame with earthquakes to visualize. The function maps the epicenters (LATITUDE/LONGITUDE) and annotates each point with in pop up window containing annotation data stored in a column of the data frame. The user can choose which column is used for the annotation in the pop-up with a function argument named annot_col. Each earthquake is shown with a circle, and the radius of the circle is proportional to the earthquake’s magnitude (EQ_PRIMARY).

map_data <-
  raw_data %>%
  eq_clean_data() %>%
  dplyr::filter(COUNTRY == "MEXICO",
                lubridate::year(DATE) >= 2000)

eq_map(map_data, annot_col = "LOCATION_NAME")

The function called eq_create_label() takes the dataset as an argument and creates an HTML label that can be used as the annotation text in the leaflet map. This function puts together a character string for each earthquake that will show the cleaned location (as cleaned by the eq_location_clean() function, the magnitude (EQ_PRIMARY), and the total number of deaths (TOTAL_DEATHS), with boldface labels for each (“Location”, “Total deaths”, and “Magnitude”). If an earthquake is missing values for any of these, both the label and the value are skipped for that element of the tag. Your code can be used in the following way:

map_data <-
  raw_data %>%
  eq_clean_data() %>%
  dplyr::filter(COUNTRY == "MEXICO",
                lubridate::year(DATE) >= 2000) %>% 
  eq_map_add_popup()

eq_map(map_data, annot_col = "popup_text")


vadimus202/quakeR documentation built on May 19, 2019, 1:47 a.m.