The 'NOAAearthquake' package

Overview

The 'NOAAearthquake' package cleans, visualises and maps NOAA earthquake data. The 'eq_clean_data' and 'eq_location_clean' functions clean the data, 'geom_timeline' and 'geom_timeline_label' visualise it, and 'eq_map' and 'eq_create_label' map it. A short description and an example of each function follow. The package uses readr, dplyr, magrittr, tidyr, stringi, scales, ggplot2, grid and leaflet.

library(readr)
library(dplyr)
library(magrittr)
library(tidyr)
library(stringi)
library(scales)
library(ggplot2)
library(grid)
library(leaflet)
library(NOAAearthquake)

Cleaning data

The 'eq_clean_data' function
- creates a 'DATE' variable from the 'YEAR', 'MONTH' and 'DAY' columns,
- converts the latitude and longitude variables to numeric,
- creates a 'locations' variable from the 'LOCATION_NAME' column.

This function works together with the 'eq_location_clean' function, which turns the 'COUNTRY' column into title case.

# Read the raw NOAA file (tab-delimited)
fileName <- "earthquakes.tsv.gz"
file <- readr::read_delim(fileName, delim = "\t")

# Title-case the COUNTRY column, identified by its column index
countryCol <- which(names(file) %in% "COUNTRY")
locations_cleaned <- eq_location_clean(file, countryCol)
locations_cleaned

# Run the full cleaning step on the same raw data
file_cleaned <- eq_clean_data(file)
file_cleaned
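
The cleaning steps described above can be approximated in base R. This is only a sketch under stated assumptions, not the package's actual implementation: the function name 'clean_sketch', the toy rows, and the choice to drop the "COUNTRY:" prefix when building 'locations' are all hypothetical.

```r
# Toy rows with made-up values that mimic a few NOAA columns (not real records)
raw <- data.frame(
  YEAR = c(2001, 2010),
  MONTH = c(3, 11),
  DAY = c(5, 20),
  LATITUDE = c("23.4", "18.5"),
  LONGITUDE = c("70.3", "-72.5"),
  COUNTRY = c("INDIA", "USA"),
  LOCATION_NAME = c("INDIA: SOME TOWN", "USA: ANOTHER PLACE"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: capitalise the first letter of every word
to_title <- function(x) gsub("\\b([a-z])", "\\U\\1", tolower(x), perl = TRUE)

# Hypothetical stand-in for the cleaning steps listed in the text
clean_sketch <- function(df) {
  # Build a Date from the three date columns (only valid for positive years)
  df$DATE <- as.Date(sprintf("%04d-%02d-%02d",
                             as.integer(df$YEAR),
                             as.integer(df$MONTH),
                             as.integer(df$DAY)))
  # Coordinates arrive as text here; convert them to numeric
  df$LATITUDE <- as.numeric(df$LATITUDE)
  df$LONGITUDE <- as.numeric(df$LONGITUDE)
  # Assumption: 'locations' is LOCATION_NAME without the "COUNTRY:" prefix
  df$locations <- to_title(trimws(sub("^[^:]*:", "", df$LOCATION_NAME)))
  # Title-case the country, as eq_location_clean is described to do
  df$COUNTRY <- to_title(df$COUNTRY)
  df
}

cleaned <- clean_sketch(raw)
print(cleaned[, c("DATE", "COUNTRY", "locations", "LATITUDE", "LONGITUDE")])
```

Title-casing matters for the later examples, which filter on values such as "Usa" rather than "USA".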

Visualising data

The 'geom_timeline' function creates a timeline with the following elements:
- each timeline represents a country,
- the earthquake events are displayed as circles,
- the size of each circle indicates the magnitude of the earthquake,
- the color of each circle represents the number of deaths caused by the earthquake.

The location label of each earthquake is added by the 'geom_timeline_label' function.

library(magrittr)
fileName <- "earthquakes.tsv.gz"
file <- readr::read_delim(fileName, delim = "\t")
data <- eq_clean_data(file) %>%
  dplyr::filter(COUNTRY %in% c("China", "Usa", "Japan"),
         DATE >= "1999-01-01",
         DATE <= "2012-12-31")
ggplot() +
  geom_timeline(data, aes(x = DATE, y = COUNTRY, size = richterScaleValue, fill = DEATHS)) +
  theme(legend.position = "bottom") +
  geom_timeline_label(aes(x=DATE, label = locations, n_maxVar = EQ_PRIMARY, n_max=10)) +
  labs(x = "Date", y = "Country", fill = "# deaths", size = "Richter scale value") 
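
When a timeline covers several countries, the way the rows are selected matters. Comparing a column to a vector with `==` recycles the vector element by element, whereas `%in%` tests membership. A small base-R illustration, using a made-up 'country' vector:

```r
# Made-up column values, in an order chosen to expose the recycling problem
country <- c("Japan", "China", "Usa", "Japan", "China", "Usa")
wanted  <- c("China", "Usa", "Japan")

# `==` compares position by position against the recycled 'wanted' vector,
# so a row is kept only if it happens to line up with the same country
by_equals <- country == wanted

# `%in%` asks, for each row, whether its country is one of the wanted ones
by_in <- country %in% wanted

print(sum(by_equals))
print(sum(by_in))
```

With this ordering `==` keeps no rows at all, while `%in%` keeps all six.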

Mapping

The 'eq_map' function plots the NOAA earthquake data on a geographical map using the 'leaflet' package. Each earthquake is drawn as a circle whose size reflects its magnitude. When hovering over an earthquake on the map, a pop-up window displays the location, the magnitude and the number of deaths of that earthquake. The pop-up text is built by the 'eq_create_label' function.

file <- "earthquakes.tsv.gz"
map <- readr::read_delim(file, delim = "\t") %>% 
  eq_clean_data() %>% 
  dplyr::filter(COUNTRY == "Mexico",
                DATE >= "2000-01-01") %>% 
  dplyr::mutate(popup_text = eq_create_label(.)) %>% 
  eq_map(annot_col = "popup_text")
map 
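
The pop-up text described above is, in essence, one HTML string per earthquake. The following is a minimal sketch of how such labels could be assembled in base R. The function name 'label_sketch', the toy rows, the exact wording of the labels and the column names 'locations', 'EQ_PRIMARY' and 'DEATHS' (borrowed from the timeline example) are assumptions and may differ from what 'eq_create_label' actually does.

```r
# Toy rows with made-up values (not real records)
quakes <- data.frame(
  locations = c("Some Town", "Another Place"),
  EQ_PRIMARY = c(7.1, 5.4),
  DEATHS = c(12, 3),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for a label builder: one HTML snippet per row
label_sketch <- function(df) {
  paste0("<b>Location:</b> ", df$locations, "<br/>",
         "<b>Magnitude:</b> ", df$EQ_PRIMARY, "<br/>",
         "<b>Total deaths:</b> ", df$DEATHS)
}

popup_text <- label_sketch(quakes)
print(popup_text)
```

A vector built this way has one entry per row, so it can be stored in a column such as 'popup_text' and passed by name, as the 'annot_col' argument does in the example above.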


jennychungpy/earthquakes documentation built on May 12, 2019, 2:01 p.m.