Appendix: Importing the original datasets {#data}

This section can be a bit obscure. It is only included to make the datasets importing steps transparent. It is important to mention that we converted the datasets from DTA (Stata, closed source format) to Arrow Parquet (cross-language, open-source).

The decision to use Arrow instead of CSV/TSV is that Arrow files are always read with the correct column specification (i.e., a column with values such as "00123" is always read as a string and is never confused as a numeric).

Downloading the original datasets

# updated link 2022-04-20
appfiles_url <- "https://www.wto.org/english/res_e/reser_e/AdvancedGuideFiles.zip"
appfiles_zip <- "00-application-files.zip"
appfiles_dir <- "00-application-files"

if (!file.exists(appfiles_zip)) {
  download.file(appfiles_url, appfiles_zip)
}

if (!dir.exists(appfiles_dir)) {
  unzip(appfiles_zip)
  file.rename("Advanced Guide to TPA", appfiles_dir)
}

Converting the original datasets

# these packages are only used to import the data
library(haven)
library(usethis)

# we only need the dataset for the 3rd application for all the exercises in the book
agtpa_applications <- read_dta("00-application-files/Chapter1/Datasets/Chapter1Application3.dta")
use_data(agtpa_applications, overwrite = T)


pachamaltese/tradepolicy documentation built on March 25, 2024, 1:57 p.m.