WEED (Wrangler for Emergency Events Database)

The goal of weed is to make it easier to analyze EM-DAT and related datasets by abstracting most of the pre-processing into functions in this package.

Prerequisites

Install the following packages: readxl, dplyr, magrittr, tidytext, stringr, tibble, geonames, countrycode, purrr, tidyr, forcats, ggplot2, sf, rgeos.
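As a minimal sketch, the snippet below checks which of the packages listed above are not yet installed. The variable names are illustrative, and the install call is left commented out because it only works for packages available in your configured repositories.

```r
# Packages named in the prerequisites list of this README.
required_pkgs <- c(
  "readxl", "dplyr", "magrittr", "tidytext", "stringr", "tibble",
  "geonames", "countrycode", "purrr", "tidyr", "forcats", "ggplot2",
  "sf", "rgeos"
)

# Compare against the packages already present in the local library.
missing_pkgs <- setdiff(required_pkgs, rownames(installed.packages()))
print(missing_pkgs)

# Uncomment to install whatever is missing.
# install.packages(missing_pkgs)
```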

You also need a GeoNames user account if you intend to use the geocoding functionality of this package. Information on how to get one for free is available here.
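One way to avoid hard-coding the username in scripts is to read it from an environment variable. This is only a sketch: the variable name GEONAMES_USERNAME is an assumption made for illustration and is not something weed or geonames is known to look up on its own.

```r
# Read the username from a hypothetical environment variable.
# `unset = NA` makes a missing variable easy to detect.
geonames_user <- Sys.getenv("GEONAMES_USERNAME", unset = NA)

if (is.na(geonames_user)) {
  message("GEONAMES_USERNAME is not set; geocoding calls will need a username.")
}
```

The resulting value could then be passed as the geonames_username argument used in the geocoding step of this README.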

Installation

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("rammkripa/weed")

Example

This is a basic example which shows a common weed workflow:

library(weed)
library(tidyverse)

Loading the Data

em <- read_emdat("/Users/ramkripa/Desktop/Tk2.xlsx", file_data = TRUE)

Locationizing the Data

locationized_data <- em$disaster_data %>%
  tail() %>%
  split_locations(column_name = "Location") %>%
  head()
locationized_data %>%
  select(`Dis No`, Location, location_word, Latitude, Longitude, uncertain_location_specificity)

There are two problems with the dataset as it exists here.

  1. Half of our observations, even in this toy dataset, don't have Lat/Long data.

  2. The Lat/Long data that is present is blatantly wrong.

Lat > 90? Long > 360? How is this possible?

So, we must recode this Lat/Long data.
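To make the second problem concrete, here is a small base R sketch that flags impossible coordinates. The data frame and its values are made up for illustration and are not taken from EM-DAT; the check relies only on the fact that latitudes lie in [-90, 90] and longitudes in [-180, 180].

```r
# Toy data with hypothetical values, including one missing latitude.
coords <- data.frame(
  place = c("a", "b", "c", "d"),
  Latitude = c(-1.29, 123.4, NA, 0.5),
  Longitude = c(36.82, 36.8, 40.1, 400)
)

# A row is valid only if both coordinates are present and within range.
coords$valid <- !is.na(coords$Latitude) & !is.na(coords$Longitude) &
  abs(coords$Latitude) <= 90 & abs(coords$Longitude) <= 180

print(coords)
```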

Solving Problem 1: Our locations have very little Lat/Long data

locationized_data %>%
  percent_located_locations(lat_column = "Latitude",
                            lng_column = "Longitude")
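For intuition, the share of rows with coordinates can be computed by hand. The helper below is a sketch and not the package's implementation: percent_with_coords is a hypothetical name, and weed's own function may define or report the percentage differently.

```r
# Hypothetical helper: percentage of rows where both coordinates are present.
percent_with_coords <- function(df, lat_column = "Latitude", lng_column = "Longitude") {
  has_both <- !is.na(df[[lat_column]]) & !is.na(df[[lng_column]])
  100 * mean(has_both)
}

# Toy data: only the first row has both a latitude and a longitude.
toy <- data.frame(
  Latitude = c(-1.3, NA, 0.2, NA),
  Longitude = c(36.8, NA, NA, 39.7)
)

percent_with_coords(toy)
```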

Solving Problem 2: Geocoding the Locationized Data

A reminder that you need a geonames username to access this feature of the weed package.

More info available here.

dummy_name <- "rammkripa"
geocoded_data <- locationized_data %>%
  geocode(geonames_username = dummy_name)
geocoded_data %>%
  select(`Dis No`, Location, location_word, lat, lng)

Side note: These Lat/Long data look much better than before, given that Kenya is close to the equator!

How effective was our geocoding?

geocoded_data %>%
  percent_located_locations()
geocoded_data %>%
  percent_located_disasters()

Check whether the locations are in a Lat/Long box

geocoded_data <- geocoded_data %>%
  located_in_box(top_left_lat = 0, 
                 top_left_lng = 35, 
                 bottom_right_lat = -6, 
                 bottom_right_lng = 40) 
geocoded_data %>%
  select(`Dis No`, Location, location_word, lat, lng, in_box)
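The box test itself is a pair of range comparisons. The following sketch shows the idea with a hypothetical helper, in_lat_lng_box; it is not weed's implementation, it assumes the box does not cross the antimeridian, and the city coordinates are approximate values used only for illustration.

```r
# Hypothetical helper: TRUE when a point lies inside the box, FALSE otherwise
# (including when either coordinate is missing).
in_lat_lng_box <- function(lat, lng,
                           top_left_lat, top_left_lng,
                           bottom_right_lat, bottom_right_lng) {
  !is.na(lat) & !is.na(lng) &
    lat <= top_left_lat & lat >= bottom_right_lat &
    lng >= top_left_lng & lng <= bottom_right_lng
}

# Approximate coordinates for three Kenyan cities, plus one missing latitude.
lat <- c(-1.29, -4.04, -0.09, NA)
lng <- c(36.82, 39.67, 34.77, 37.00)

in_box <- in_lat_lng_box(lat, lng,
                         top_left_lat = 0, top_left_lng = 35,
                         bottom_right_lat = -6, bottom_right_lng = 40)
print(in_box)
```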

Check whether the locations are in a shapefile

s_file_name <- "~/Desktop/Projects/emdat_proj/shape_data/SH_mask.shp"
geocoded_data %>%
  located_in_shapefile(shapefile_name = s_file_name) %>%
  select(`Dis No`, Location, location_word, lat, lng, in_box, in_shape)
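A point-in-polygon test of this kind can be sketched directly with sf. This is an illustration under stated assumptions, not a description of how located_in_shapefile works internally: the polygon is built in memory instead of being read from a shapefile, and the place names and coordinates are approximate toy values.

```r
library(sf)

# A rectangular polygon in lon/lat (WGS 84) standing in for a real shapefile.
# With an actual file, sf::st_read() would load the polygons instead.
box_polygon <- st_sfc(
  st_polygon(list(rbind(
    c(35, -6), c(40, -6), c(40, 0), c(35, 0), c(35, -6)
  ))),
  crs = 4326
)

# Toy points with approximate coordinates.
places <- data.frame(
  location_word = c("nairobi", "mombasa", "kisumu"),
  lat = c(-1.29, -4.04, -0.09),
  lng = c(36.82, 39.67, 34.77)
)

# Convert to point geometries; coords are given in x (lng), y (lat) order.
points <- st_as_sf(places, coords = c("lng", "lat"), crs = 4326)

# One logical per point: does it intersect the polygon?
places$in_shape <- as.vector(st_intersects(points, box_polygon, sparse = FALSE))
print(places)
```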

Want to re-nest the location data?

geocoded_data %>%
  nest_locations() %>%
  select(`Dis No`, location_data)
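Re-nesting amounts to collapsing the one-row-per-location table back to one row per disaster, with the location columns stored in a list-column. The sketch below does this with tidyr::nest on made-up data; the disaster identifiers are hypothetical, and nest_locations may choose a different set of columns to nest.

```r
library(tibble)
library(tidyr)

# Toy long-form data: two locations for the first disaster, one for the second.
long_form <- tibble(
  `Dis No` = c("A-001", "A-001", "B-002"),
  location_word = c("nairobi", "mombasa", "kisumu"),
  lat = c(-1.29, -4.04, -0.09),
  lng = c(36.82, 39.67, 34.77)
)

# Pack the location columns into a list-column named location_data.
nested <- nest(long_form, location_data = c(location_word, lat, lng))
print(nested)
```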


rammkripa/weed documentation built on Oct. 23, 2023, 6:39 a.m.