library(knitr)
library(QuickUDS)
library(dplyr)
opts_chunk$set(echo=TRUE)
options(knitr.table.format = 'markdown')

This package incorporates the replication dataset for Pemstein, Meserve, and Melton (2010) in its replication data (Pemstein, Meserve, and Melton 2013). This replication data contains the measures of democracy used in the construction of the official Unified Democracy Scores. These measures of democracy are labeled _pmm in the democracy dataset in this package. However, some of these measures (Freedom House, Mainwaring et al, DD/PACL, Polity IV, Polyarchy, PRC, Vanhanen) differ from the original sources (the official data from Freedom House, etc.) in a small number of cases, for a number of reasons having to do with data revisions after 2008 and transcription errors. This vignette documents these differences.

Freedom House

library(QuickUDS)
library(dplyr)
comparison <- democracy %>% 
  prepare_data() %>% 
  filter(year <= 2008, year >=1946)

There are in total r nrow(comparison %>% filter((!is.na(pmm_fh) & is.na(fh_total_reversed)) | (is.na(pmm_fh) & !is.na(fh_total_reversed)) | (pmm_fh != fh_total_reversed))) differences between PMM's replication data and the current Freedom House data (Freedom House 2017). These differences seem to be due to data revisions since 2010 in the Freedom House dataset, plus the inclusion of non-state territories in this dataset, and the treatment of 1981, which is conventionally taken to be "missing data" in the Freedom House series but which appears to be interpolated in PMM's replication dataset.

kable(comparison %>% 
        filter((!is.na(pmm_fh) & is.na(fh_total_reversed)) | 
                 (is.na(pmm_fh) & !is.na(fh_total_reversed)) | 
                 (pmm_fh != fh_total_reversed)) %>%
        group_by(extended_country_name, pmm_fh, fh_total_reversed) %>%
        summarise(years = paste(year,collapse=", "), n = n()) %>%
        arrange(years),
      caption = "Differences between PMM's FH values and current FH data",
      col.names = c("Country name","Freedom House value in PMM",
                    "Latest Freedom House data","years affected","n"))

The Mainwaring et al. dataset

PMM's replication data is missing r nrow(comparison %>% filter((!is.na(pmm_mainwaring) & is.na(mainwaring)) | (is.na(pmm_mainwaring) & !is.na(mainwaring)) | (pmm_mainwaring != mainwaring))) the data points in the original data by Mainwaring et al (Mainwaring, Brinks, and Perez Linan 2008), but the original data is not missing any of their data points, and there are no differences between the data points wherever both the original and the replication data have values.

kable(comparison %>% 
        filter((!is.na(pmm_mainwaring) & is.na(mainwaring)) | 
                 (is.na(pmm_mainwaring) & !is.na(mainwaring)) | 
                 (pmm_mainwaring != mainwaring)) %>%
        group_by(extended_country_name, pmm_mainwaring, mainwaring) %>%
        summarise(min = min(year), max= max(year), n = n()),
      caption = "Differences between PMM's Mainwaring et al values and original Mainwaring et al data",
      col.names = c("Country name","Mainwaring et al value in PMM",
                    "Mainwaring et al value in original dataset","min year", "max year","n"))

The DD/PACL/ACLP dataset

r nrow(comparison %>% filter((!is.na(pmm_pacl) & is.na(pacl)) | (is.na(pmm_pacl) & !is.na(pacl)) | (pmm_pacl != pacl))) country-years in the original PACL/DD dataset (Cheibub, Gandhi, and Vreeland 2010) are missing from PMM's replication dataset.

kable(comparison %>% 
        filter((!is.na(pmm_pacl) & is.na(pacl)) | 
                 (is.na(pmm_pacl) & !is.na(pacl)) | 
                 (pmm_pacl != pacl)) %>%
        group_by(extended_country_name, pmm_pacl, pacl) %>%
        summarise(min = min(year), max= max(year), n = n()),
      caption = "Differences between PMM's PACL values and original PACL data",
      col.names = c("Country name","PACL value in PMM",
                    "Original PACL data","min year", "max year","n"))

The Polity IV dataset

There are r nrow(comparison %>% filter((!is.na(pmm_polity) & is.na(polity2)) | (is.na(pmm_polity) & !is.na(polity2)) | (pmm_polity != polity2))) country-years where PMM's replication data differ from the latest version of Polity IV, either because they have no data, or because they have a different value. These differences seem to be due to minor data revisions since 2010 in the Polity IV dataset.

kable(comparison %>% 
        filter((!is.na(pmm_polity) & is.na(polity2)) | 
                 (is.na(pmm_polity) & !is.na(polity2)) | 
                 (pmm_polity != polity2)) %>%
        mutate_at(vars(matches("polity")), funs(. - 11)) %>%
        group_by(extended_country_name, pmm_polity, polity2) %>%
        summarise(min = min(year), max= max(year), n = n()),
      caption = "Differences between PMM's Polity2 values and latest Polity IV data",
      col.names = c("Country name","Polity2 value in PMM",
                    "Latest Polity IV data","min year", "max year","n"))

The Polyarchy dataset

r nrow(comparison %>% filter((!is.na(pmm_polyarchy) & is.na(polyarchy_original_polyarchy)) | (is.na(pmm_polyarchy) & !is.na(polyarchy_original_polyarchy)) | (pmm_polyarchy != polyarchy_original_polyarchy))) country-years differ between PMM's replication data and the original Polyarchy dataset (Coppedge and Reinicke 1991). These seem to be due to simple transcription errors.

kable(comparison %>% 
        filter((!is.na(pmm_polyarchy) & is.na(polyarchy_original_polyarchy)) | 
                 (is.na(pmm_polyarchy) & !is.na(polyarchy_original_polyarchy)) | 
                 (pmm_polyarchy != polyarchy_original_polyarchy)) %>%
        group_by(extended_country_name, 
                 pmm_polyarchy,
                 polyarchy_original_polyarchy) %>%
        summarise(min = min(year), max= max(year), n = n()),
      caption = "Differences between PMM's Polyarchy values and original Polyarchy data",
      col.names = c("Country name","Polyarchy value in PMM",
                    "Original Polyarchy data","min year", "max year","n"))

The Political Regime Change (PRC) dataset

The PRC dataset (Gasiorowski 1996, revised in Reich 2002) has more than one value per country-year for some country-years due to transitions between regimes; and these transitions are not consistently treated in PMM's replication dataset (sometimes the value for the beginning of the year is used, sometimes the value for the end of the year, and sometimes the value for the middle of the year). This results in r nrow(comparison %>% mutate(prc = ifelse(prc == 1, prc, prc + 1)) %>% filter((!is.na(pmm_prc) & is.na(prc)) | (is.na(pmm_prc) & !is.na(prc)) | (pmm_prc != prc))) differences between the datasets. In this package, I use the last value in a given year for the regime in the PRC dataset, and code "transitions" (prc = 2) as missing values (NA), which results in the following changes:

kable(comparison %>% 
        mutate(prc = ifelse(prc == 1, prc, prc + 1)) %>%
        filter((!is.na(pmm_prc) & is.na(prc)) | 
                 (is.na(pmm_prc) & !is.na(prc)) | 
                 (pmm_prc != prc)) %>%
        group_by(extended_country_name, pmm_prc, prc) %>%
        summarise(years = paste(year,collapse=", "), n = n()),
      caption = "Differences between PMM's PRC values and original PRC data",
      col.names = c("Country name","PRC value in PMM",
                    "Original PRC data","years affected","n"))

Vanhanen dataset

There are r nrow(comparison %>% filter((!is.na(pmm_vanhanen) & is.na(vanhanen_democratization)) | (is.na(pmm_vanhanen) & !is.na(vanhanen_democratization)) | (pmm_vanhanen != vanhanen_democratization))) missing values in PMM's data compared to the original Vanhanen dataset (Vanhanen 2012):

comparison2 <- democracy %>% 
  select(extended_country_name, GWn, year, vanhanen_democratization, pmm_vanhanen) %>% 
  filter(year <= 2008, year >=1946)

kable(comparison2 %>% 
        filter((!is.na(pmm_vanhanen) & is.na(vanhanen_democratization)) | 
                 (is.na(pmm_vanhanen) & !is.na(vanhanen_democratization)) | 
                 (pmm_vanhanen != vanhanen_democratization)) %>%
        group_by(extended_country_name,pmm_vanhanen,vanhanen_democratization) %>%
        summarise(years = paste(year,collapse=", "), n = n()),
      caption = "Differences between PMM's Vanhanen values and original Vanhanen data",
      col.names = c("Country name","Vanhanen value in PMM",
                    "Original Vanhanen data","years affected","n"))

References

Cheibub, Jose Antonio, Jennifer Gandhi, and James Raymond Vreeland. 2010. "Democracy and Dictatorship Revisited." Public Choice. 143(1):67-101. Original data available at https://sites.google.com/site/joseantoniocheibub/datasets/democracy-and-dictatorship-revisited.

Coppedge, Michael and Wolfgang H. Reinicke. 1991. Measuring Polyarchy. In On Measuring Democracy: Its Consequences and Concomitants, ed. Alex Inkeles. New Brunswuck, NJ: Transaction pp. 47-68.

Freedom House. 2016. "Freedom in the World." Original data available at http://www.freedomhouse.org.

Gasiorowski, Mark J. 1996. "An Overview of the Political Regime Change Dataset." Comparative Political Studies 29(4):469-483.

Mainwaring, Scott, Daniel Brinks, and Anibal Perez Linan. 2008. "Political Regimes in Latin America, 1900-2007." Original data available from http://kellogg.nd.edu/scottmainwaring/Political_Regimes.pdf.

Marshall, Monty G., Ted Robert Gurr, and Keith Jaggers. 2012. "Polity IV: Political Regime Characteristics and Transitions, 1800-2012." Updated to 2015. Original data available from http://www.systemicpeace.org/polity/polity4.htm.

Pemstein, Daniel, Stephen Meserve, and James Melton. 2010. Democratic Compromise: A Latent Variable Analysis of Ten Measures of Regime Type. Political Analysis 18 (4): 426-449.

Pemstein, Daniel, Stephen A. Meserve, and James Melton. 2013. "Replication data for: Democratic Compromise: A Latent Variable Analysis of Ten Measures of Regime Type." In: Harvard Dataverse. http://hdl.handle.net/1902.1/PMM

Reich, G. 2002. Categorizing Political Regimes: New Data for Old Problems. Democratization 9 (4): 1-24. http://www.tandfonline.com/doi/pdf/10.1080/714000289.

Vanhanen, Tatu. 2012. "FSD1289 Measures of Democracy 1810-2012." Original data available from http://www.fsd.uta.fi/english/data/catalogue/FSD1289/meF1289e.html



xmarquez/QuickUDS documentation built on May 4, 2019, 1:24 p.m.