knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE,
  message = FALSE
)
library(NOAAsignifEarthQuakes)
# additionally for piping
library(magrittr)
This package is the result of a project centered around a dataset obtained from the U.S. National Oceanic and Atmospheric Administration (NOAA). The dataset focuses on significant earthquakes around the world and contains information about 5,933 earthquakes over an approximately 4,000-year time span.
The objective of this package is to process this dataset and to visualize it as timelines and maps.
National Geophysical Data Center / World Data Service (NGDC/WDS): Significant Earthquake Database. National Geophysical Data Center, NOAA. (doi:10.7289/V5TD9V7K)
The data is incorporated into this package in the file `r system.file("extdata", "signif.txt", package = "NOAAsignifEarthQuakes")`.
# list the package functions and flag the task each one belongs to
# (dplyr::as_tibble replaces the deprecated dplyr::tbl_df)
desc_func <- dplyr::as_tibble(
  data.frame(
    Function = ls(getNamespace("NOAAsignifEarthQuakes")),
    stringsAsFactors = FALSE
  )
) %>%
  dplyr::mutate(
    Processing = Function %in% c("load_NOAA_db", "eq_build_date", "eq_build_location", "eq_clean_data"),
    Timeline = Function %in% c("eq_legend_timeline", "geom_timeline", "geom_timeline_label", "timeline_data"),
    Map = Function %in% c("eq_create_label", "eq_map")
  ) %>%
  # dplyr::desc is namespaced because dplyr itself is not attached
  dplyr::arrange(dplyr::desc(Processing), dplyr::desc(Timeline), dplyr::desc(Map)) %>%
  dplyr::mutate(
    Processing = ifelse(Processing, "✔", ""),
    Timeline = ifelse(Timeline, "✔", ""),
    Map = ifelse(Map, "✔", "")
  )
knitr::kable(
  desc_func,
  align = c("l", "c", "c", "c"),
  col.names = c("Function", "Data Processing", "Timeline visualization", "Map visualization")
)
We use two main functions to read and clean the data:
1. `load_NOAA_db`: read the raw NOAA file
2. `eq_clean_data`: process the resulting data frame using two support functions:
    * `eq_build_date`: create the date feature from the input year, month and day
    * `eq_build_location`: process the location feature into a more human-readable form
load_NOAA_db
file_noaa <- system.file("extdata", "signif.txt", package = "NOAAsignifEarthQuakes", mustWork = TRUE)
noaa_raw <- load_NOAA_db(file_noaa)
noaa_name <- colnames(noaa_raw)
We obtain a table with `r ncol(noaa_raw)` features and `r nrow(noaa_raw)` observations. We describe the schema of the raw data in detail in the appendix.
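As a rough illustration of what reading such a file involves, the sketch below parses a tiny in-memory table with base R's `read.delim`. It assumes the raw file is tab-delimited with a header row and that column types are fixed at read time; the two rows are made up for the example and `load_NOAA_db` itself may work differently.

```r
# Two made-up rows in a tab-delimited layout (an assumption about the raw format)
raw_text <- paste(
  "YEAR\tMONTH\tDAY\tCOUNTRY\tLOCATION_NAME",
  "2001\t3\t15\tEXAMPLELAND\tEXAMPLELAND:  SAMPLE CITY",
  "-350\t\t\tEXAMPLELAND\tEXAMPLELAND:  OLD TOWN (RUINS)",
  sep = "\n"
)

# Fix the column types while reading; empty integer fields become NA
demo_raw <- read.delim(
  text = raw_text,
  colClasses = c(
    YEAR = "integer", MONTH = "integer", DAY = "integer",
    COUNTRY = "character", LOCATION_NAME = "character"
  )
)

str(demo_raw)
```

Setting `colClasses` explicitly keeps the year as an integer even when it is negative and leaves missing months and days as `NA` rather than empty strings.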
noaa_clean <- eq_clean_data(noaa_raw)
clean_cols <- names(noaa_clean)
eq_build_date
We simply take the `YEAR`, `MONTH` and `DAY` features in order to create the date feature.
We demonstrate this on the first 5 and last 5 rows of the original data, building a valid date feature named `clean_date`:
knitr::kable(
  head(noaa_raw, 5) %>%
    dplyr::select("DAY", "MONTH", "YEAR") %>%
    dplyr::mutate(clean_date = eq_build_date(.))
)
knitr::kable(
  tail(noaa_raw, 5) %>%
    dplyr::select("DAY", "MONTH", "YEAR") %>%
    dplyr::mutate(clean_date = eq_build_date(.))
)
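The idea of assembling a date from its three components can be sketched in base R as follows. The helper `build_date_sketch` is hypothetical and is not the package's `eq_build_date`; in particular, replacing a missing month or day by 1 is an assumption, and this simple version returns `NA` for negative (BCE) years, which the real data contains.

```r
# Hypothetical helper: combine year, month and day into a Date vector
build_date_sketch <- function(year, month, day) {
  # Assumption: a missing month or day falls back to the first of the period
  month <- ifelse(is.na(month), 1L, month)
  day <- ifelse(is.na(day), 1L, day)
  # An explicit format makes unparsable inputs (e.g. negative years) NA
  as.Date(sprintf("%04d-%02d-%02d", year, month, day), format = "%Y-%m-%d")
}

build_date_sketch(
  year = c(2010L, 1999L),
  month = c(1L, NA),
  day = c(12L, NA)
)
```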
eq_build_location
This function consists of:
* removing the country
* removing extra location information within parentheses
* correcting white space
* switching to title case
knitr::kable(
  head(noaa_raw, 10) %>%
    dplyr::select("LOCATION_NAME") %>%
    dplyr::mutate(clean_location = eq_build_location(.))
)
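The four cleaning steps can be approximated with base R regular expressions. The function `clean_location_sketch` below is a hypothetical stand-in, not the package's `eq_build_location`, and it assumes the country appears as a prefix terminated by a colon; the input strings are made up.

```r
# Hypothetical helper mirroring the four cleaning steps on a character vector
clean_location_sketch <- function(x) {
  x <- sub("^[^:]*:\\s*", "", x, perl = TRUE)       # drop the "COUNTRY:" prefix (assumed layout)
  x <- gsub("\\s*\\([^)]*\\)", "", x, perl = TRUE)  # drop text within parentheses
  x <- gsub("\\s+", " ", trimws(x), perl = TRUE)    # trim and collapse white space
  x <- tolower(x)
  gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)      # capitalise the first letter of each word
}

clean_location_sketch(c(
  "EXAMPLELAND:  SAMPLE CITY",
  "EXAMPLELAND:  OLD   TOWN (RUINS)"
))
```

Only the first colon-terminated prefix is removed, so a location that itself contains a colon would keep its remaining parts.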
eq_clean_data
This function mainly processes the existing features with the helper functions above and selects a subset of features, reducing the raw data to `r ncol(noaa_clean)` features while keeping all `r nrow(noaa_clean)` observations:
* `r clean_cols[1]`: date of the earthquake event
* `r clean_cols[2]`: country where the earthquake occurred
* `r clean_cols[3]`: location of the earthquake
* `r clean_cols[4]`: longitude coordinate of the earthquake epicenter
* `r clean_cols[5]`: latitude coordinate of the earthquake epicenter
* `r clean_cols[6]`: total number of fatalities caused by the earthquake
* `r clean_cols[7]`: equivalent Richter scale magnitude

The last 10 rows of the resulting table are:
knitr::kable(tail(noaa_clean,10))
timeline_data
This function prepares the data prior to building the timeline. It makes it possible to:
* handle the coordinate features (`LATITUDE` and `LONGITUDE`)
* filter on a date range with `dmin` and `dmax`
* filter on a set of countries with `countries`
* add the feature `MAG_RANK`, giving the rank of each earthquake within its country in decreasing order of magnitude

filt_noaa <- noaa_clean %>%
  timeline_data(dmin = '2010-01-01', dmax = '2011-01-01', countries = c("USA", "China"))
knitr::kable(filt_noaa)
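The date and country filtering and the per-country magnitude rank can be sketched in base R as below. The function `timeline_filter_sketch` is hypothetical and only approximates `timeline_data`: the case-insensitive country match and the tie-breaking rule are assumptions, and the demo rows are made up.

```r
# Hypothetical helper: keep rows in a date range and country set, then rank by magnitude
timeline_filter_sketch <- function(df, dmin, dmax, countries) {
  keep <- df$DATE >= as.Date(dmin) &
    df$DATE <= as.Date(dmax) &
    toupper(df$COUNTRY) %in% toupper(countries)  # assumption: match ignores case
  out <- df[keep, , drop = FALSE]
  # Rank 1 is the largest magnitude within each country; ties keep row order
  out$MAG_RANK <- ave(
    -out$MAG, out$COUNTRY,
    FUN = function(m) rank(m, ties.method = "first")
  )
  out
}

# Made-up rows for illustration
demo <- data.frame(
  DATE = as.Date(c("2010-01-12", "2010-04-04", "2010-04-13", "2012-01-01")),
  COUNTRY = c("USA", "USA", "CHINA", "USA"),
  MAG = c(6.5, 7.2, 6.9, 5.0),
  stringsAsFactors = FALSE
)

timeline_filter_sketch(demo, "2010-01-01", "2011-01-01", c("USA", "China"))
```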
geom_timeline
g_usa <- geom_timeline(noaa_clean, countries='USA', xmin='2000-01-01', xmax='2017-01-01')
This geom takes the following optional aesthetics:
* `y`: a factor variable such as `COUNTRY` to display multiple timelines
* `size`: a continuous variable scaling the size of the point symbol (usually set to `MAG`)
* `fill`: a continuous variable scaling the color of the point symbol (usually set to `DEATHS`)

g_usachina <- geom_timeline(
  noaa_clean,
  mapping = ggplot2::aes(y = COUNTRY, fill = DEATHS, size = MAG),
  countries = c('USA', "China"),
  xmin = '2000-01-01',
  xmax = '2017-01-01'
)
g_usachina
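For readers without the package at hand, the role of the three aesthetics can be approximated with plain ggplot2. This is only a sketch using `geom_point` on made-up rows; it does not reproduce the internals or the exact appearance of `geom_timeline`.

```r
library(ggplot2)

# Made-up earthquakes for illustration
demo_quakes <- data.frame(
  DATE = as.Date(c("2001-03-01", "2008-05-12", "2010-04-04", "2014-08-24")),
  COUNTRY = factor(c("USA", "CHINA", "USA", "USA")),
  MAG = c(6.8, 7.9, 7.2, 6.0),
  DEATHS = c(1, 5000, 2, 1)
)

# y splits the events into one timeline per country,
# size scales the point with magnitude, fill colors it by deaths
p <- ggplot(demo_quakes, aes(x = DATE, y = COUNTRY, size = MAG, fill = DEATHS)) +
  geom_point(shape = 21, colour = "grey30", alpha = 0.6) +
  scale_x_date(limits = as.Date(c("2000-01-01", "2017-01-01"))) +
  labs(x = "Date", y = NULL, size = "Magnitude", fill = "Deaths") +
  theme_minimal()

p
```

Point shape 21 is used because it is a filled circle, so the `fill` aesthetic controls its interior color.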
eq_legend_timeline
In order to present proper legend labels we built the function `eq_legend_timeline`.
# define fake aesthetics
feat <- ggplot2::aes(aes_1 = DATE, aes_2 = MAG, aes_3 = DEATHS, aes_4 = COUNTRY, aes_5 = LOCATION_NAME)
# test eq_legend_timeline in a table
# (dplyr::as_tibble replaces the deprecated dplyr::tbl_df; names are kept as strings)
legend_aes <- dplyr::as_tibble(data.frame(
  aesthetic = c('aes_1', 'aes_2', 'aes_3', 'aes_4', 'aes_5'),
  stringsAsFactors = FALSE
)) %>%
  dplyr::rowwise() %>%
  dplyr::mutate(
    feature = rlang::quo_name(feat[[aesthetic]]),
    label = eq_legend_timeline(feat[[aesthetic]])
  )
knitr::kable(legend_aes)
geom_timeline_label
The function takes `n_max` as a keyword argument to set the maximum number of locations to display. By default this value is set to 5.
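The selection behind such a limit can be sketched in base R. The helper `top_n_by_mag` is hypothetical; ranking by `MAG` over the whole data set is an assumption about how "most significant" is determined, and the rows are made up.

```r
# Hypothetical helper: keep the n_max rows with the largest magnitude
top_n_by_mag <- function(df, n_max = 5) {
  # na.last = NA drops rows whose magnitude is missing
  ord <- order(df$MAG, decreasing = TRUE, na.last = NA)
  df[head(ord, n_max), , drop = FALSE]
}

# Made-up rows for illustration
demo <- data.frame(
  LOCATION_NAME = c("A", "B", "C", "D"),
  MAG = c(5.1, 7.3, NA, 6.4),
  stringsAsFactors = FALSE
)

top_n_by_mag(demo, n_max = 2)
```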
We go back to the first timeline and display the location of the 10 most significant earthquakes
g_usa + geom_timeline_label(n_max=10)
We go back to the second timeline and display the location of the 5 most significant earthquakes
g_usachina + geom_timeline_label()
To create a map we have to filter the processed data for a country and specific date range so as to avoid overflowing the visualisation.
filt_noaa <- noaa_clean %>%
  dplyr::filter(COUNTRY == "MEXICO" & lubridate::year(DATE) >= 2000)
eq_map
Each point corresponds to an earthquake, with the size of the circle representing the magnitude.
eq_map(filt_noaa,annot_col = "DATE")
`eq_map` with `eq_create_label`
Annotation is performed with `eq_create_label`.
It simply combines the `LOCATION_NAME`, `DEATHS` and `MAG` features into a single HTML-encoded character string, as demonstrated on the first 10 rows of our test case.
annot_noaa <- filt_noaa %>%
  head(10) %>%
  dplyr::mutate(annot_text = eq_create_label(.)) %>%
  dplyr::select("DATE", "LOCATION_NAME", "DEATHS", "MAG", "annot_text")
knitr::kable(annot_noaa)
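Combining several features into one HTML string can be sketched in base R as below. The function `create_label_sketch` is hypothetical: the field titles, the field order and the choice to skip missing values are assumptions, not the documented output of `eq_create_label`.

```r
# Hypothetical helper: build one HTML snippet per row from three features
create_label_sketch <- function(df) {
  # Emit "<b>Title:</b> value<br/>" or nothing when the value is missing
  part <- function(title, value) {
    ifelse(is.na(value), "", paste0("<b>", title, ":</b> ", value, "<br/>"))
  }
  paste0(
    part("Location", df$LOCATION_NAME),
    part("Magnitude", df$MAG),
    part("Total deaths", df$DEATHS)
  )
}

# Made-up row with a missing death count
create_label_sketch(data.frame(
  LOCATION_NAME = "Sample City",
  MAG = 7.5,
  DEATHS = NA,
  stringsAsFactors = FALSE
))
```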
`eq_map` calls `eq_create_label` when the option `annot_col` is set to `"popup_text"`, giving the following result.
eq_map(filt_noaa,annot_col = "popup_text")
The `NOAAsignifEarthQuakes` package does three things in a fairly straightforward way: data processing, timeline visualization and map visualization.
\appendix
We give a short description of the features present in the raw data. We indicate the column type as we set it while reading the data.
* `r noaa_name[1]`: [Character] unique id for the earthquake
* `r noaa_name[2]`: [Character] categorical, set to Tsu if the earthquake generated a tsunami (set to NA otherwise)
* `r noaa_name[3]`: [Integer] 4 digit year corresponding to the event, range -2150 to 2018
* `r noaa_name[4]`: [Integer] 2 digit month corresponding to the event, could be NA
* `r noaa_name[5]`: [Integer] 2 digit day of the month corresponding to the event
* `r noaa_name[6]`: [Integer] 2 digit hour corresponding to the event
* `r noaa_name[7]`: [Integer] 2 digit minute corresponding to the event
* `r noaa_name[8]`: [Numeric] 2 digit seconds corresponding to the event
* `r noaa_name[9]`: [Integer] depth of the earthquake [km] (valid values: 0 to 700 km)
* `r noaa_name[10]`: [Numeric] equivalent magnitude, takes its value from one of the columns below (supposedly at most one of them contains a non-missing value)
* `r noaa_name[11]`: [Numeric] magnitude based on the moment magnitude scale (valid values: 0.0 to 9.9)
* `r noaa_name[12]`: [Numeric] surface-wave magnitude (valid values: 0.0 to 9.9)
* `r noaa_name[13]`: [Numeric] compressional body wave (P-wave) magnitude (valid values: 0.0 to 9.9)
* `r noaa_name[14]`: [Numeric] standard Richter magnitude (valid values: 0.0 to 9.9)
* `r noaa_name[15]`: [Numeric] magnitude estimated from the felt area, before seismic instruments were in use (valid values: 0.0 to 9.9)
* `r noaa_name[16]`: [Numeric] magnitude estimated using an unidentified method (valid values: 0.0 to 9.9)
* `r noaa_name[17]`: [Character] Modified Mercalli Intensity (valid values: 1 to 12)
* `r noaa_name[18]`: [Character] country name where the event occurred
* `r noaa_name[19]`: [Character] state where the event occurred
* `r noaa_name[20]`: [Character] location name, e.g. city or geographical landmark
* `r noaa_name[21]`: [Double] latitude coordinates of the event
* `r noaa_name[22]`: [Double] longitude coordinates of the event
* `r noaa_name[23]`: [Character] event location, contains country name and locality
* `r noaa_name[24]`: [Integer] number of deaths
* `r noaa_name[25]`: [Character] factorial code 1 to 4
* `r noaa_name[26]`: [Integer] number of missing persons
* `r noaa_name[27]`: [Character] factorial code 1 to 4
* `r noaa_name[28]`: [Integer] number of injured persons
* `r noaa_name[29]`: [Character] factorial code 1 to 4
* `r noaa_name[30]`: [Numeric] damage amount in M$
* `r noaa_name[31]`: [Character] factorial code 1 to 4
* `r noaa_name[32]`: [Integer] number of houses destroyed
* `r noaa_name[33]`: [Character] factorial code 1 to 4
* `r noaa_name[34]`: [Integer] number of houses damaged
* `r noaa_name[35]`: [Character] factorial code 1 to 4
* `r noaa_name[36]`: [Integer] number of deaths
* `r noaa_name[37]`: [Character] factorial code 1 to 4
* `r noaa_name[38]`: [Integer] number of missing persons
* `r noaa_name[39]`: [Character] factorial code 1 to 4
* `r noaa_name[40]`: [Integer] number of injured
* `r noaa_name[41]`: [Character] factorial code 1 to 4
* `r noaa_name[42]`: [Numeric] damage amount in M$
* `r noaa_name[43]`: [Character] factorial code 1 to 4
* `r noaa_name[44]`: [Integer] number of houses destroyed
* `r noaa_name[45]`: [Character] factorial code 1 to 4
* `r noaa_name[46]`: [Integer] number of houses damaged
* `r noaa_name[47]`: [Character] factorial code 1 to 4