---
title: "The Capstone Project"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{vignettes}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
This vignette explains in detail how to use the 'TheCapstoneProject' package, with examples featuring each function in the package.
```r
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```
```r
library(dplyr)
library(ggplot2)
library(lubridate)
library(magrittr)
library(readr)
library(TheCapstoneProject)
```
The eq_clean_data() function takes a raw data frame downloaded from the NOAA website (<https://www.ngdc.noaa.gov/nndc/struts/form?t=101650&s=1&d=1>) and returns a clean data frame. The clean data frame has the following:
```r
read_delim(system.file("extdata", "signif.txt", package = "TheCapstoneProject"), delim = '\t') %>%
  eq_clean_data()
```
## eq_location_clean

The eq_location_clean() function creates an additional column called LOCATION. This is done by stripping out the country name (including the colon) from the LOCATION_NAME column and converting names to title case (as opposed to all caps). The new LOCATION column is needed for annotating visualisations by the geom_timeline_label() and eq_create_label() functions. This eq_location_clean() function can be used in conjunction with the eq_clean_data() function.

```r
read_delim(system.file("extdata", "signif.txt", package = "TheCapstoneProject"), delim = '\t') %>%
  eq_location_clean()
```
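The stripping and re-casing step described above can be illustrated in base R. This is only a sketch of the idea, not the package's implementation: the helper name `strip_country_sketch()` is hypothetical, and it assumes the country is everything up to the first colon.

```r
# Hypothetical helper for illustration; not part of TheCapstoneProject.
strip_country_sketch <- function(location_name) {
  # Drop everything up to and including the first colon, plus any spaces after it.
  no_country <- sub("^[^:]*:[[:space:]]*", "", location_name)
  # Upper-case the first letter of each word and lower-case the rest.
  gsub("(\\w)(\\w*)", "\\U\\1\\L\\2", no_country, perl = TRUE)
}

strip_country_sketch(c("GREECE: THERA ISLAND (SANTORINI)", "MEXICO: OAXACA"))
```

Names that contain more than one colon would keep everything after the first one under this assumption; the package may treat such cases differently.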
The geom_timeline() function plots a timeline of earthquakes between the dates xmin and xmax, with one point for each earthquake. Optional aesthetics include color, size, and alpha (for transparency). The x aesthetic is a date. The optional y aesthetic is a factor indicating some stratification (e.g. country), in which case a separate timeline is plotted for each level of the factor. Here, earthquake magnitude in Richter scale values (EQ_PRIMARY) is indicated by the size of the points and the number of deaths (TOTAL_DEATHS) is indicated by their color.
```r
XMin <- lubridate::ymd('2000-01-01')
XMax <- lubridate::ymd('2020-01-01')
Countries <- c('CHINA', 'GREECE')

readr::read_delim(system.file("extdata", "signif.txt", package = "TheCapstoneProject"), delim = '\t') %>%
  eq_clean_data() %>%
  dplyr::filter(COUNTRY %in% Countries) %>%
  ggplot2::ggplot() +
  geom_timeline(aes(x = DATE, xmin = XMin, xmax = XMax, size = EQ_PRIMARY, color = TOTAL_DEATHS)) +
  labs(x = 'DATE', color = '# deaths', size = 'Richter scale value') +
  lims(x = c(XMin, XMax))
```
Now we set the optional y aesthetic to COUNTRY, so that a separate timeline is drawn for each country.
```r
readr::read_delim(system.file("extdata", "signif.txt", package = "TheCapstoneProject"), delim = '\t') %>%
  eq_clean_data() %>%
  dplyr::filter(COUNTRY %in% Countries) %>%
  ggplot2::ggplot() +
  geom_timeline(aes(x = DATE, xmin = XMin, xmax = XMax, y = COUNTRY, size = EQ_PRIMARY, color = TOTAL_DEATHS)) +
  labs(x = 'DATE', y = 'COUNTRY', color = '# deaths', size = 'Richter scale value') +
  lims(x = c(XMin, XMax))
```
The geom_timeline_label() function adds annotations on top of the timeline drawn by geom_timeline(). This geom adds a vertical line at each data point, with a text annotation (e.g. the location of the earthquake) attached to each line. There is an option to subset to n_max earthquakes, in which case the n_max largest earthquakes (by magnitude) are annotated. The aesthetics are x, which is the date of the earthquake, and label, which takes the name of the column from which the annotations are obtained.
```r
Countries <- c('USA', 'CHINA')

readr::read_delim(system.file("extdata", "signif.txt", package = "TheCapstoneProject"), delim = '\t') %>%
  eq_clean_data() %>%
  eq_location_clean() %>%
  dplyr::filter(COUNTRY %in% Countries) %>%
  ggplot2::ggplot() +
  geom_timeline_label(aes(x = DATE, xmin = XMin, xmax = XMax, n_max = 5, y = COUNTRY, label = LOCATION,
                          size = EQ_PRIMARY, mag = EQ_PRIMARY, colour = TOTAL_DEATHS)) +
  ggplot2::labs(x = 'Date', y = 'Country', color = '# deaths', size = 'Richter scale value') +
  ggplot2::lims(x = c(XMin, XMax))
```
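To make the n_max subsetting concrete, the following base R sketch keeps the n_max rows with the largest magnitude. It is an illustration under stated assumptions rather than the geom's internal code: the helper name `largest_n_sketch()` and the toy data are hypothetical, and the sketch ignores any grouping by the y aesthetic.

```r
# Hypothetical helper for illustration; not part of TheCapstoneProject.
largest_n_sketch <- function(df, n_max, mag_col = "EQ_PRIMARY") {
  # Row indices sorted by decreasing magnitude; na.last = NA drops missing magnitudes.
  ord <- order(df[[mag_col]], decreasing = TRUE, na.last = NA)
  # Keep at most n_max rows, largest first.
  df[head(ord, n_max), , drop = FALSE]
}

# Made-up toy rows, not taken from the NOAA data set.
quakes <- data.frame(
  LOCATION   = c("A", "B", "C", "D"),
  EQ_PRIMARY = c(5.1, 7.9, NA, 6.3),
  stringsAsFactors = FALSE
)

largest_n_sketch(quakes, n_max = 2)
```

Rows with a missing magnitude are dropped before the cut, so they never displace an earthquake with a known magnitude.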
The eq_map() function takes an argument data containing the cleaned data frame of earthquakes and visualizes them in an interactive map. It maps the epicenters (LATITUDE/LONGITUDE) and annotates each point with a pop-up window containing annotation data stored in a column of the data frame. The user chooses which column is used for the pop-up annotation with the function argument Text4Popup. Each earthquake is shown as a circle whose radius is proportional to the earthquake's magnitude (EQ_PRIMARY).
```r
Countries <- 'MEXICO'

readr::read_delim(system.file("extdata", "signif.txt", package = "TheCapstoneProject"), delim = '\t') %>%
  eq_clean_data() %>%
  dplyr::filter(COUNTRY %in% Countries) %>%
  dplyr::filter(DATE >= XMin) %>%
  eq_map(Text4Popup = "DATE")
```
The eq_create_label() function takes the dataset as an argument and creates an HTML label that can be used as the annotation text in the leaflet map (created by the eq_map() function). This function puts together a character string for each earthquake that shows the cleaned location (as cleaned by the eq_location_clean() function), the magnitude (EQ_PRIMARY), and the total number of deaths (TOTAL_DEATHS), with boldface labels for each ("Location", "Magnitude" and "Total deaths"). If an earthquake is missing a value for any of these, both the label and the value are skipped for that element of the tag.
```r
readr::read_delim(system.file("extdata", "signif.txt", package = "TheCapstoneProject"), delim = '\t') %>%
  eq_clean_data() %>%
  eq_location_clean() %>%
  dplyr::filter(COUNTRY %in% Countries) %>%
  dplyr::filter(DATE >= XMin) %>%
  dplyr::mutate(popup_text = eq_create_label(.)) %>%
  eq_map(Text4Popup = "popup_text")
```
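The rule that a missing value drops both its label and its value can be sketched in base R as follows. This is a hedged illustration, not the source of eq_create_label(): the helper name `label_sketch()` is hypothetical, and the exact HTML layout (a `<br>` after each field) is an assumption.

```r
# Hypothetical helper for illustration; not part of TheCapstoneProject.
label_sketch <- function(location, magnitude, deaths) {
  # Build "<b>Name:</b> value<br>" for one field, or "" where the value is missing.
  field <- function(name, value) {
    ifelse(is.na(value), "", paste0("<b>", name, ":</b> ", value, "<br>"))
  }
  # Concatenate the three fields element-wise, one string per earthquake.
  paste0(
    field("Location", location),
    field("Magnitude", magnitude),
    field("Total deaths", deaths)
  )
}

# The missing death count is omitted together with its "Total deaths" label.
label_sketch(location = "Oaxaca", magnitude = 7.4, deaths = NA)
```

Because `ifelse()` and `paste0()` are vectorized, the same function works on whole columns and returns one label per row.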