knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "README-", fig.retina = 2, message = FALSE, width = 120, warning = FALSE )
This provides the datasets from the ICIJ Offshore Leaks Database from the Panama Papers. This dataset was constructed by the International Consortium of Investigative Journalists and available here under the Creative Commons Attribution-ShareAlike license. Besides that data, this package is available under GPL-3.
Future versions of this package may soon include other data from the Panama Papers besides the offshore leaks dataset.
Install using devtools:
devtools::install_github("dgrtwo/rpanama")
There are five main datasets that this package provides. First is information on the Entities, Intermediaries, Officers, and Addresses from the offshore leaks dataset (see here for the definitions of each):
library(rpanama) library(dplyr) Entities Intermediaries Officers Addresses
These each make up nodes in a network. The all_edges
dataset contains connections between these, including which entities have which offshore accounts.
all_edges all_edges %>% count(rel_type, sort = TRUE)
Each of the five above datasets comes directly from the CSVs released by the ICIJ here.
We could find which countries have the most offshore entities (note that there are multiple countries split by ;
):
library(tidyr) library(stringr) most_common_countries <- Entities %>% select(node_id, countries) %>% unnest(country = str_split(countries, ";")) %>% count(country, sort = TRUE) most_common_countries
We may also find inter-country intermediary relationships were most common. We could do this by joining the Intermediaries
data with the all_edges
data and then the Entities
data:
entity_countries <- Entities %>% unnest(entity_country = str_split(countries, ";"), entity_code = str_split(country_codes, ";")) %>% select(node_id, entity_country, entity_code) connections <- Intermediaries %>% unnest(intermediary_country = str_split(countries, ";"), intermediary_code = str_split(country_codes, ";")) %>% inner_join(all_edges, by = c(node_id = "node_1")) %>% inner_join(entity_countries, by = c(node_2 = "node_id")) %>% filter(rel_type == "intermediary of", intermediary_country != entity_country) %>% count(intermediary_country, entity_country, intermediary_code, entity_code) %>% ungroup() %>% arrange(desc(n)) connections
We could even put this on a map:
library(rworldmap) library(geosphere) country_centers <- getMap()@data %>% tbl_df() %>% select(code = ISO_A3, longitude = LON, latitude = LAT) great_circles <- connections %>% filter(n > 100) %>% mutate(connection = row_number()) %>% select(intermediary_code, entity_code, n, connection) %>% gather(type, code, contains("code")) %>% inner_join(country_centers, by = "code") %>% group_by(connection, n) %>% arrange(desc(type)) %>% filter(n() == 2) %>% do(as.data.frame(gcIntermediate(c(.$longitude[1], .$latitude[1]), c(.$longitude[2], .$latitude[2])))) library(ggplot2) library(ggthemes) library(ggalt) world_map <- filter(map_data("world"), region != "Antarctica") ggplot() + geom_map(data = world_map, map = world_map, aes(x = long, y = lat, map_id = region), color = "#b2b2b2", size = 0.15, fill = NA) + geom_path(data=great_circles, color = "#a50026", aes(lon, lat, group = connection, alpha = n), arrow = arrow(length = unit(0.1, "inches"))) + scale_color_gradient2(low = "white", high = "#a50026", trans = "log") + coord_proj() + labs(title = "Most common offshore entity relationships") + theme_map(base_family="Arial") + #scale_size_continuous(trans = "log", range = c(.1, 1)) + theme(legend.position = "none", plot.title=element_text(hjust = 0.5, size = 14))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.