library(noaa) knitr::opts_chunk$set(fig.dpi = 96)
This document serves as the main overview of the noaa package, and will try to explain the hows and whys of the package content along with clear visual examples.
The U.S. National Oceanographic and Atmospheric Administration (NOAA) publish a database on significant earthquakes around the world. The dataset contains information about 5,933 earthquakes over an approximately 4,000 year time span.
The datasset is avialable at NOAA Link . An actual state of the dataset can be obtained at the "Download" link.
In order to support easy access to the dataset the package contains a snapshot of the avialable data. To use the data it should be loaded:
data(earthquakes)
It is useful to use the latest downloaded data from the NOAA site. eq_read_data()
is the function to use:
eq <- eq_read_data(filename="../data-raw/signif.txt.bz2" )
The detailed description of the dataset is supported by NOAA at the "Event Variable Definitions" link.
The documentation also included in the package: man/earthquakes.Rd
, use ?earthquakes
to see it.
The raw NOAA data is not ideal for R processing. There are two helper functions to clean the data.
NOAA dataset contains separate fields for the date - time: 'YEAR','MONTH','DAY','HOUR','MINUTE','SECOND'.
eq_clean_data()
create a new R datetime field date
and deletes the separate fields.
To clean the built in earthquakes data you should run :
data(earthquakes) eq <- eq_clean_data(earthquakes) head(eq$date)
NOAA dataset has a "LOCATION" filed containing all caps text wth country information added.
To display this field a simple transformation is done by eq_location_clean
. The function strips out
the "COUNTRY" part and formats the string as "Title Case".
data(earthquakes) # NOAA version head(earthquakes$LOCATION_NAME) eq <- eq_location_clean(earthquakes) # Cleaned version head(eq$LOCATION_NAME)
The package contains a ggplot2 geom to visualize the times at which earthquakes occur within certain countries.
The geom_timeline()
main use to display a timeline of events:
The function geom_timeline_label()
adds text directly to the timeline plot.
The param n_max
defines the number of the max size events to display.
library(magrittr) library(ggplot2) library(lubridate) library(dplyr) data(earthquakes) eqdta <- eq_location_clean(eq_clean_data(earthquakes)) %>% filter(date > lubridate::ymd("20000101") & (COUNTRY == "USA" | COUNTRY == 'CHINA')) %>% group_by(COUNTRY) # Draw the geom ggplot (data = eqdta, aes( x = date, y = COUNTRY, size = FOCAL_DEPTH, label = LOCATION_NAME, colour = EQ_PRIMARY )) + geom_timeline() + geom_timeline_label(n_max = 6)
The eq_map()
function takes an argument data containing the filtered data frame
with earthquakes to visualize. The function maps the epicenters
(LATITUDE/LONGITUDE) and annotates each point with in pop up window containing
annotation data stored in a column of the data frame.
The eq_create_label()
adds an informative pop up text to a dataframe.
The usage of both function is:
library(magrittr) data(earthquakes) earthquakes %>% eq_clean_data() %>% dplyr::filter(COUNTRY == "MEXICO" & lubridate::year(date) >= 2000) %>% dplyr::mutate(popup_text = eq_create_label(.)) %>% eq_map(annot_col = "popup_text")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.