knitr::opts_chunk$set(echo = TRUE,message = FALSE, warning = FALSE, cache = FALSE, out.width = "100%")
compiled at
r Sys.time()
Download data from the Quality of Government Institute data
Quotation from Quality of Governance institute website
The QoG Institute was founded in 2004 by Professor Bo Rothstein and Professor Sören Holmberg. It is an independent research institute within the Department of Political Science at the University of Gothenburg. We conduct and promote research on the causes, consequences and nature of Good Governance and the Quality of Government (QoG) - that is, trustworthy, reliable, impartial, uncorrupted and competent government institutions.
The main objective of our research is to address the theoretical and empirical problem of how political institutions of high quality can be created and maintained. A second objective is to study the effects of Quality of Government on a number of policy areas, such as health, the environment, social policy, and poverty. We approach these problems from a variety of different theoretical and methodological angles.
Quality of Government institute provides data in five different data sets, both in cross-sectional and longitudinal versions:
rqog-package provides access to Basic, Standard and OECD datasets through function read_qog()
. Standard data has all the same indicators as in Basic data (367 variables) and an additional ~1600 indicators. Both basic and standard datasets have 194 countries. OECD dataset has 1020 indicators from 35 countries. rqog uses longitudinal datasets by default that have time-series of varying duration from majority of the indicators and countries.
Quality of Government Institute provides codebooks for all datasets:
You consult the codebooks for description of the data and indicators.
library(devtools) install_github("ropengov/rqog") library(rqog)
Basic Data
Basic data has a selection of most common indicators, 344 indicators from 211 countries. Below is an example on how to extract data on population and Democracy (Freedom House/Polity) index from BRIC-countries from 1990 to 2010 and to plot it.
library(rqog) library(dplyr) library(ggplot2) library(tidyr) # Download a local coppy of the file basic <- read_qog(which_data="basic", data_type = "time-series") # Subset the data dat.l <- basic %>% # filter years and countries filter(year %in% 1990:2010, cname %in% c("Russia","China","India","Brazil")) %>% # select variables select(cname,year,p_polity2,wdi_pop1564) %>% # gather to long format gather(., var, value, 3:4) %>% # remove NA values filter(!is.na(value)) # Plot ggplot(dat.l, aes(x=year,y=value,color=cname)) + geom_point() + geom_line() + geom_text(data = dat.l %>% group_by(cname) %>% filter(year == max(year)), aes(x=year,y=value,label=cname), hjust=1,vjust=-1,size=3,alpha=.8) + facet_wrap(~var, scales="free") + theme_minimal() + theme(legend.position = "none") + labs(title = "Plotting QoG basic data", caption = "Data: QoG Basic data")
Standard data
Standard data includes 2190 indicators from 211 countries. Below is an example on how to extract data on Economic Performance and GINI index (World Bank estimate) from BRIC-countries and plot it.
library(rqog) # Download a local coppy of the file standard <- read_qog("standard", "time-series") # Subset the data dat.l <- standard %>% # filter years and countries filter(year %in% 1990:2020, cname %in% c("Russia","China","India","Brazil")) %>% # select variables select(cname,year,bti_ep,wdi_gini) %>% # gather to long format gather(., var, value, 3:4) %>% # remove NA values filter(!is.na(value)) # Plot the data # Plot ggplot(dat.l, aes(x=year,y=value,color=cname)) + geom_point() + geom_line() + geom_text(data = dat.l %>% group_by(cname) %>% filter(year == max(year)), aes(x=year,y=value,label=cname), hjust=1,vjust=-1,size=3,alpha=.8) + facet_wrap(~var, scales="free") + theme_minimal() + theme(legend.position = "none") + labs(title = "Plotting QoG Standard data", caption = "Data: QoG Standard data")
OECD data
OECD data includes 1006 variables, but from a smaller number of wealthier countries of 36. In the example below four indicators:
oecd_pphlthxp_t1c
wdi_gini
oecd_natinccap_t1
oecd_govdebt_t1
We will include all the countries and all the years included in the data.
library(rqog) # Download a local coppy of the file oecd <- read_qog("oecd", "time-series") # Subset the data dat.l <- oecd %>% # select variables select(cname,year,oecd_pphlthxp_t1c,wdi_gini,oecd_natinccap_t1,oecd_govdebt_t1) %>% # gather to long format gather(., var, value, 3:6) %>% # remove NA values filter(!is.na(value)) # Plot the data # Plot ggplot(dat.l, aes(x=year,y=value,color=cname)) + geom_point() + geom_line() + geom_text(data = dat.l %>% group_by(var,cname) %>% filter(year == max(year)), aes(x=year,y=value,label=cname), hjust=1,vjust=-1,size=3,alpha=.8) + facet_wrap(~var, scales="free") + theme_minimal() + theme(legend.position = "none") + labs(title = "Plotting QoG OECD data", caption = "Data: QoG OECD data")
Packages is shipped with seven metadatas for each year (2016-2022) meta_basic_cs_2022
, meta_basic_ts_2022
, meta_std_cs_2022
, meta_std_ts_2022
, meta_oecd_cs_2022
and meta_oecd_ts_2022
. Data frames are generated from original spss
versions of data using tidymetadata::create_metadata()
-function.
Browsing metadata
You can browse the content by applying grepl
to name
column. Let's find indicators containing term Corruption
either in lower or uppercase.
library(rqog) meta_basic_ts_2022[grepl("Corruption", meta_basic_ts_2022$name, ignore.case = TRUE),]
Assigning labels to values with metadata
The data rqoq
imports to R is in .csv
-format without the labels and names shipped together with spss
or Stata
formats. As such it is the desired format to work with in R, especially with numeric indicators. However, many of the indicators in QoG are factors meaning that they have discrete values with a corresponding label. You can use the metadatas to assign labels for values of such indicators. Lets take the ccp_cc
as an example below and first print the value
and label
colums of the data.
meta_basic_ts_2022 %>% filter(code == "ccp_cc") %>% select(value,label)
Currently we have basic data in R in an object called basic
. Lets see the frequencies of each value
basic %>% count(ccp_cc)
Now, using the metadata with assign values with corresponding labels
basic %>% count(ccp_cc) %>% mutate(ccp_cc_lab = meta_basic_ts_2022[meta_basic_ts_2022$code == "ccp_cc",]$label[match(ccp_cc,meta_basic_ts_2022[meta_basic_ts_2022$code == "ccp_cc",]$value)])
So, lets find two factor variables with few more values from the cross-sectional data
meta_basic_cs_2022 %>% filter(class =="factor") %>% group_by(code) %>% summarise(n_of_values = n()) %>% arrange(desc(n_of_values))
Lets take these two factors and summarise the regime types per regions
meta_basic_cs_2022 %>% filter(code %in% c("ht_region","ht_colonial")) %>% distinct(code, .keep_all = TRUE)
# lets download the cross-sectional data first basic_cs <- read_qog(which_data = "basic", data_type = "cross-sectional") plot_d <- basic_cs %>% # group by region group_by(ht_region) %>% # count per group frequencies of each regime type count(ht_colonial) %>% ungroup() %>% # label mutate(ht_region_lab = meta_basic_ts_2022[meta_basic_ts_2022$code == "ht_region",]$label[match(ht_region,meta_basic_ts_2022[meta_basic_ts_2022$code == "ht_region",]$value)], ht_colonial_lab = meta_basic_ts_2022[meta_basic_ts_2022$code == "ht_colonial",]$label[match(ht_colonial,meta_basic_ts_2022[meta_basic_ts_2022$code == "ht_colonial",]$value)]) %>% na.omit() head(plot_d) # lets abbreviate option '0. Never colonized by a Western overseas colonial power' to '0. Never' plot_d$ht_colonial_lab[plot_d$ht_colonial_lab == "0. Never colonized by a Western overseas colonial power"] <- '0. Never'
Then we can create a simple bar plot
# indicators names from metadata ind_name <- unique(meta_basic_cs_2022[meta_basic_cs_2022$code == "ht_colonial",]$name) group_name <- unique(meta_basic_cs_2022[meta_basic_cs_2022$code == "ht_region",]$name) ggplot(plot_d, aes(x=ht_colonial_lab,y=n)) + geom_col() + facet_wrap(~ht_region_lab, scales = "free") + theme_minimal() + theme(axis.text.x = element_text(angle = 90, size = 7)) + labs(title = paste0(ind_name," by ",group_name ), caption = "Data: Quality of Government institute", x = NULL, y = "number of countries") + coord_flip()
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.