This package contains data and convenience functions that can be used to replicate and extend the Unified Democracy Scores of Pemstein, Meserve, and Melton (2010). It depends on the package mirt (Chalmers 2012), which can quickly compute full-information factor analyses.

The basic procedure for replicating or extending the UD scores is as follows. First, load the library and prepare the democracy data using the convenience functions prepare_data or prepare_democracy; fit a simple mirt model; and finally, extract the scores. These procedures can be easily condensed into a single call to democracy_scores, but this vignette shows how to use a longer procedure.

library(knitr)
opts_chunk$set(echo=TRUE)

Preparing your democracy indexes

Though you can use your own custom democracy indexes to generate latent (unified) democracy scores, the package provides a dataset, democracy, which contains a large number of democracy measures (from more than 30 different datasets, of varying reliability and coverage) that you can use to generate custom unified democracy scores. (Use ?democracy to view which measures are included in the dataset as well as links to their sources; they are also listed in the references below). For example, suppose we wanted to replicate exactly the 2011 release of the UD scores. PMM's replication data for that release is included in the democracy dataset: we just need to select the variable names ending in _pmm and use the function prepare_democracy

library(QuickUDS)
library(mirt)
library(dplyr)
library(tidyr)

names(democracy)
indexes <- names(democracy)[grep("pmm",names(democracy))]
indexes # The measures of democracy used to generate the 2011 release of the UDS
data <- prepare_democracy(indexes)

You can use a dplyr::select helper to simplify this procedure:

data <- prepare_democracy(matches("pmm"))

The function prepare_democracy selects the appropriate colums from the democracy dataset, gets rid of "empty rows" (country-years that have no measurements of democracy for the chosen indexes; such patterns will make mirt fail), and runs prepare_data on the remaining indexes.

prepare_data transforms selected democracy indexes into ordinal variables suitable for use with mirt, mostly following the advice in Pemstein, Meserve, and Melton's original article (2010). In particular, prepare_data will try to do the following on any dataset (whether the included democracy dataset or your own dataset of democracy scores):

prepare_data will also work on column names that contain the following strings:

In each of these cases the function prepare_data transforms the values of the scores by running as.numeric(unclass(factor(x))), which transforms each index into ordinal variables from 1 to the number of categories.

If you are using democracy indexes not included in the democracy dataset, or want to use your own custom measures of democracy, or transform them in a very particular way, you simply need to ensure that there are no "blank" country-years in your data (i.e., country-years without any democracy measurements) and that the indexes you are using are ordinal measures from 1 to N with every category present in the data. mirt is pretty flexible and forgiving: it will accept ordinal variables in any range and will attempt to transform your indexes so that every category is within a distance of 1 of its neighboring categories. But it's useful to have a good sense of what the algorithm is doing to your data before you begin!

Fitting a democracy model

You then fit the model as follows:

replication_2011_model <- mirt(data[ ,  indexes], model = 1, 
                               itemtype = "graded", SE = TRUE, 
                               verbose = FALSE)

This just tells mirt to fit a one-factor, full information graded response model like that in Pemstein, Meserve, and Melton 2010, and to calculate the standard errors for the coefficients. (See ?mirt for details of the many options you can use to tweak your model).

Fitting this model is reasonably fast:

replication_2011_model@time

There is a convenient wrapper for the entire procedure:

replication_2011_model <- democracy_model(matches("pmm"), verbose = FALSE)

We can easily check that this model converges and that accounts for most of the variance in the democracy indexes:

replication_2011_model
summary(replication_2011_model)

And we can then extract the latent democracy scores, either via the mirt package's fscore function, or via this package's convenient wrapper democracy_scores (which returns a tidy dataset with the latent scores, binds them to the appropriate country-years and automatically calculates 95\% confidence intervals):

replication_2011_scores <-  fscores(replication_2011_model, 
                                    full.scores = TRUE, 
                                    full.scores.SE = TRUE)
# Not a data frame, no country-years:
str(replication_2011_scores)

replication_2011_scores <- democracy_scores(model = replication_2011_model)

# A data frame with confidence intervals:
replication_2011_scores

The entire procedure can be performed in one call:

replication_2011_scores <- democracy_scores(matches("pmm"), verbose = FALSE)

We can also check that these scores are, in fact, perfectly correlated with Pemstein, Meserve, and Melton's 2011 UDS release:

cor(replication_2011_scores %>% select(matches("uds"), z1), use = "pairwise")

(For more details on the relationship between the original UD scores and the replicated scores produced using this method, see my working paper Marquez 2016).

Extending the model

Now suppose you want to create a new Unified Democracy score derived from Pemstein, Meserve, and Melton's original source measures but that also maximizes the democracy information available in other datasets, including:

This package makes the process extremely simple:

indexes <- c("pmm_arat","blm","bmr_democracy_femalesuffrage",
             "pmm_bollen","doorenspleet","wgi_democracy","fh_total_reversed",
             "gwf_democracy","pmm_hadenius","kailitz_tri","lexical_index",
             "mainwaring","pmm_munck","pacl",
             "polity2","polyarchy_original_contestation","prc", 
             "svolik_democracy","ulfelder_democracy",
             "v2x_polyarchy", "vanhanen_democratization","wth_democ1") 

extended_model <- democracy_model(indexes, verbose = FALSE)

summary(extended_model)

extended_scores <- democracy_scores(model = extended_model)

mirt will stop by default after 500 EM cycles, but some models will take longer to converge. If your model has not converged after 500 iterations of the algorithm, you can try increasing the number of cycles with the technical option.

Use ?mirt for more details.

One important point to note about latent variable democracy scores is that they are normalized with mean zero and standard deviation one, so a score of 1 just means that the country-year is 1 standard deviation more democratic than the average country-year in the sample. But this means that adding extra country-years to our model will typically result in scores that have a higher mean (though usually smaller standard errors) than the original UD model, given that the world has become substantially more democratic over the last two centuries:

countries <- c("United States of America", 
               "United Kingdom","Argentina",
               "Chile","Venezuela","Spain")

data <- extended_scores %>% 
  filter(extended_country_name %in% countries) %>%
  tidyr::gather(measure, zscore, uds_2010_mean:uds_2014_mean, z1) %>%
  filter(!is.na(zscore), year >=1946, year < 2008) %>%
  mutate(measure = plyr::mapvalues(measure, 
                                   c("uds_2010_mean",
                                     "uds_2011_mean",
                                     "uds_2014_mean",
                                     "z1_matched"),
                                   c("2010 release of UDS",
                                     "2011 release of UDS",
                                     "2014 release of UDS",
                                     "Replicated score")))

library(ggplot2)

ggplot(data = data, 
       aes(x = year, y = zscore, color = measure)) + 
  geom_path() + 
  theme_bw() + 
  labs(x = "Year", y = "Latent unified democracy scores,\nper year")  + 
  theme(legend.position="bottom") + 
  guides(color = guide_legend(ncol = 1),fill = guide_legend(nrow = 1)) + 
  facet_wrap(~extended_country_name, ncol = 2)  

If necessary, one can therefore "match" the extended scores to the official UD release by substracting the mean of the extended scores for the period of the UD release one wants to match from the extended scores (that is, making the mean of the extended scores equal to zero for the period of adjustment):

data <- extended_scores %>% 
  filter(!is.na(uds_2014_mean)) %>%
  mutate(z1_matched = z1 - mean(z1, na.rm = TRUE), 
         z1_pct975_matched = z1_pct975 - mean(z1, na.rm = TRUE), 
         z1_pct025_matched = z1_pct025 - mean(z1, na.rm = TRUE))

data <- data %>% 
  filter(extended_country_name %in% countries) %>%
  tidyr::gather(measure, zscore, uds_2010_mean:uds_2014_mean, z1_matched) %>%
  filter(!is.na(zscore), year >=1946, year < 2008) %>%
  mutate(measure = plyr::mapvalues(measure, 
                                   c("uds_2010_mean",
                                     "uds_2011_mean",
                                     "uds_2014_mean",
                                     "z1_matched"),
                                   c("2010 release of UDS",
                                     "2011 release of UDS",
                                     "2014 release of UDS",
                                     "Matched replicated score")))

library(ggplot2)

ggplot(data = data, 
       aes(x = year, y = zscore, color = measure)) + 
  geom_path() + 
  theme_bw() + 
  labs(x = "Year", y = "Latent unified democracy scores,\nper year")  + 
  theme(legend.position="bottom") + 
  guides(color = guide_legend(ncol=1),fill = guide_legend(nrow=1)) + 
  facet_wrap(~extended_country_name,ncol=2)  

In the graph above, we can see that the 2014 release of the UDS seems to overestimate the degree of democracy in the USA in the early decades of 1950 relative to the "extended" scores.

These scores have a more natural interpretation when transformed to a 0-1 index using the cumulative distribution function as the "probability that a country-year is democratic" (so the 0 is now a natural minimum, and 1 a natural maximum). These indexes are automatically produced by the function democracy_scores; they are in the column z1_as_prob of the output, and are produced by applying the pnorm function to z1, as follows:

extended_scores <- extended_scores %>% 
  mutate(index = pnorm(z1), 
         index_pct025 = pnorm(z1_pct025), 
         index_pct975 = pnorm(z1_pct975))

# These are equal to z1_as_prob, which is automatically calculated:

extended_scores %>% filter(index != z1_as_prob)

It is also possible to to set the cutpoint for this index at, for example, the average cutpoint in the latent variable of the dichotomous indexes of democracy (so that 0.5 correponds more naturally to the point at which a regime could be either democratic or non-democratic according to the dichotomous measures of democracy included in your model). These scores are also automatically calculated (they are in the column z1_adj) but they can also be manuallyadded as follows:

cutpoints_extended <- cutpoints(extended_model)

cutpoints_extended

dichotomous_cutpoints <- cutpoints_extended %>% 
  group_by(variable) %>%
  filter(n() == 1) 

dichotomous_cutpoints

avg_dichotomous <- mean(dichotomous_cutpoints$estimate)

avg_dichotomous

extended_scores <- extended_scores %>% mutate(adj_z1 = z1 - avg_dichotomous, 
                                        adj_pct025 = z1_pct025 - avg_dichotomous, 
                                        adj_pct975 =z1_pct975 - avg_dichotomous,
                                        index = pnorm(adj_z1),
                                        index_pct025 = pnorm(adj_pct025),
                                        index_pct975 = pnorm(adj_pct975))

ggplot(data = extended_scores %>% filter(extended_country_name %in% countries), 
       aes(x= year, y = index, 
           ymin = index_pct025, ymax = index_pct975)) + 
  geom_path() + 
  geom_ribbon(alpha=0.2) + 
  theme_bw() + 
  labs(x = "Year", y = "Latent unified democracy scores,\nper year\nconverted to 0-1 probability scale")  + 
  theme(legend.position="bottom") + 
  guides(color = guide_legend(ncol=1),fill = guide_legend(nrow=1)) + 
  geom_hline(yintercept=0.5,color="red") +
  facet_wrap(~extended_country_name,ncol=2)  

A pre-computed version of the extended UDS scores, with data from all the indexes mentioned above, plus the participation-enhanced Polity Scores of Moon et al (2006), a trichotomous democracy indicator calculated from Magaloni, Min, and Chu's "Autocracies of the World" datset (Magaloni, Min and Chu 2013), a dichotomous democracy indicator calculated from Hsu (2008), the REIGN dataset of Bell (2016), and an indicator of democracy used by the Political Instability Task Force (Goldstein et. al 2010; Taylor and Ulfelder 2015), is included with the package; it can be loaded by simply typing extended_uds. Use ?extended_uds to examine the documentation for all its variables; see my working paper (Marquez 2016) for more detail on the data and its uses.

Extracting useful information from the model

The mirt package offers a great number of powerful tools to examine and diagnose the fitted model, including functions to extract model cutpoints and item information curves. But this package also contains two convenience functions that wrap mirt tools to quickly extract democracy rater discrimination parameters, rater cutoffs, and rater information curves from a model produced by democracy_model in a tidy data frame format suitable for graphing:

replication_2011_cutpoints <- cutpoints(replication_2011_model, type ="score")
replication_2011_cutpoints

# We plot the "normalized" cutpoints ("estimate," in the same scale as the latent scores), 
# not the untransformed ones ("par")

ggplot(data = replication_2011_cutpoints, 
       aes(x = variable, y = estimate, 
           ymin = pct025, ymax = pct975))  + 
  theme_bw() + 
  labs(x="",y="Unified democracy level rater cutoffs") + 
  geom_point() + 
  geom_errorbar() + 
  geom_hline(yintercept =0, color="red") + 
  coord_flip()

# We can also plot discrimination parameters, which are in a different scale:
replication_2011_discrimination <- cutpoints(replication_2011_model, 
                                             type ="discrimination")

replication_2011_discrimination

ggplot(data = replication_2011_discrimination, 
       aes(x=reorder(variable,estimate),
           y = estimate, ymin = pct025, 
           ymax = pct975))  + 
  theme_bw() + 
  labs(x="",y="Discrimination parameter for each rater
       \n(higher value means fewer idiosyncratic\nerrors relative to latent score)") + 
  geom_point() + 
  geom_errorbar() + 
  coord_flip()

# And we can plot item information curves for each rater:
replication_2011_info <- raterinfo(replication_2011_model)
replication_2011_info

ggplot(data = replication_2011_info, aes(x=theta,y=info)) + 
  geom_path() + 
  facet_wrap(~rater) + 
  theme_bw() + 
  labs(x="Latent democracy score",y = "Information") + 
  theme(legend.position="bottom")

Finally, the package offers a simple function to estimate the probability that a given country is more democratic than another in a given year, accounting for the uncertainty in the UD-style measures. For example, suppose we want to know the probability that the USA was more democratic than France in the year 2000 for both the replicated 2011 scores and our extended model:

prob_more(replication_2011_scores, "United States of America","France", 2000)
prob_more(extended_scores, "United States of America","France", 2000)

Or perhaps we wish to know the probability that the United States was more democratic in the year 2000 than in the year 1953:

prob_more(replication_2011_scores, 
          "United States of America",
          "United States of America", 
          c(2000,1953))
prob_more(extended_scores, 
          "United States of America",
          "United States of America", 
          c(2000,1953))

If you find this package useful, cite both it and Pemstein, Meserve, and Melton's original paper (Pemstein Meserve and Melton 2010). Citation information can be automatically generated by using citation:

citation("QuickUDS")

References

Anckar, C. and C. Fredriksson. 2018. "Classifying political regimes 1800-2016: a typology and a new dataset". European Political Science. DOI: 10.1057/s41304-018-0149-8. https://doi.org/10.1057/s41304-018-0149-8

Arat, Zehra F. 1991. Democracy and human rights in developing countries. Boulder: Lynne Rienner Publishers.

Bell, Curtis. 2016. The Rulers, Elections, and Irregular Governance Dataset (REIGN). 2016. http://oefresearch.org/datasets/reign.

Boix, Carles, Michael Miller, and Sebastian Rosato. 2012. A Complete Data Set of Political Regimes, 1800–2007. Comparative Political Studies 46 (12): 1523-1554. DOI:10.1177/0010414012463905. Original data available at https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/FJLMKT

Bollen, Kenneth A. 2001. "Cross-National Indicators of Liberal Democracy, 1950-1990." 2nd ICPSR version. Chapel Hill, NC: University of North Carolina, 1998. Ann Arbor, MI: Inter-university Consortium for Political and Social Research, 2001. Original data available at http://webapp.icpsr.umich.edu/cocoon/ICPSR-STUDY/02532.xml.

Bowman, Kirk, Fabrice Lehoucq, and James Mahoney. 2005. Measuring Political Democracy: Case Expertise, Data Adequacy, and Central America. Comparative Political Studies 38 (8): 939-970. http://cps.sagepub.com/content/38/8/939. Data available at http://www.blmdemocracy.gatech.edu/.

Chalmers, R. Philip. 2012. "Mirt: A Multidimensional Item Response Theory Package for the R Environment." 2012 48(6): 1-29. http://www.jstatsoft.org/v48/i06/

Cheibub, Jose Antonio, Jennifer Gandhi, and James Raymond Vreeland. 2010. "Democracy and Dictatorship Revisited." Public Choice. 143(1):67-101. DOI:10.1007/s11127-009-9491-2 Original data available at https://sites.google.com/site/joseantoniocheibub/datasets/democracy-and-dictatorship-revisited.

Coppedge, Michael, John Gerring, Staffan I. Lindberg, Svend-Erik Skaaning, and Jan Teorell, with David Altman, Michael Bernhard, M. Steven Fish, Adam Glynn, Allen Hicken, Carl Henrik Knutsen, Kelly McMann, Pamela Paxton, Daniel Pemstein, Jeffrey Staton, Brigitte Zimmerman, Frida Andersson, Valeriya Mechkova, Farhad Miri. 2015. V-Dem Codebook v5. Varieties of Democracy (V-Dem) Project. Original data available at https://v-dem.net/en/data/.

Coppedge, Michael, Angel Alvarez and Claudia Maldonado. 2008. “Two Persistent Dimensions of Democracy: Contestation and Inclusiveness”. The journal of politics 70(03) (2008), pp. 632-647. DOI: 10.1017/S0022381608080663.

Coppedge, Michael and Wolfgang H. Reinicke. 1991. Measuring Polyarchy. In On Measuring Democracy: Its Consequences and Concomitants, ed. Alex Inkeles. New Brunswuck, NJ: Transaction pp. 47-68.

Doorenspleet, Renske. 2000. Reassessing the Three Waves of Democratization. World Politics 52 (03): 384-406. DOI:10.1017/S0043887100016580.

Economist Intelligence Unit. 2012. Democracy Index 2012: Democracy at a Standstill.

Freedom House. 2015. "Freedom in the World." Original data available at http://www.freedomhouse.org.

Gasiorowski, Mark J. 1996. "An Overview of the Political Regime Change Dataset." Comparative Political Studies 29(4):469-483. DOI:10.1177/0010414096029004004

Geddes, Barbara, Joseph Wright, and Erica Frantz. 2014. Autocratic Breakdown and Regime Transitions: A New Data Set. Perspectives on Politics 12 (1): 313-331. DOI:10.1017/S1537592714000851. Original data available at http://dictators.la.psu.edu/.

Goldstone, Jack, Robert Bates, David Epstein, Ted Gurr, Michael Lustik, Monty Marshall, Jay Ulfelder, and Mark Woodward. 2010. A Global Model for Forecasting Political Instability. American Journal of Political Science 54 (1): 190-208. DOI:10.1111/j.1540-5907.2009.00426.x

Gründler K. and T. Krieger. 2018. "Machine Learning Indices, Political Institutions, and Economic Development". Report. CESifo Group Munich. https://www.cesifo-group.de/DocDL/cesifo1_wp6930.pdf

Gründler K. and T. Krieger. 2016. "Democracy and growth: Evidence from a machine learning indicator". European Journal of Political Economy 45, pp. 85-107. DOI: https://doi.org/10.1016/j.ejpoleco.2016.05.005.

Hadenius, Axel. 1992. Democracy and Development. Cambridge: Cambridge University Press.

Hadenius, Axel & Jan Teorell. 2007. "Pathways from Authoritarianism", Journal of Democracy 18(1): 143-156.

Hsu, Sara "The Effect of Political Regimes on Inequality, 1963-2002," UTIP Working Paper No. 53 (2008), http://utip.gov.utexas.edu/papers/utip_53.pdf. Data available for download at http://utip.gov.utexas.edu/data/.

Kailitz, Steffen. 2013. Classifying political regimes revisited: legitimation and durability. Democratization 20 (1): 39-60. DOI:10.1080/13510347.2013.738861. Original data available at http://dx.doi.org/10.1080/13510347.2013.738861.

Mainwaring, Scott, Daniel Brinks, and Anibal Perez Linan. 2008. "Political Regimes in Latin America, 1900-2007." Original data available from http://kellogg.nd.edu/scottmainwaring/Political_Regimes.pdf.

Magaloni, Beatriz, Jonathan Chu, and Eric Min. 2013. Autocracies of the World, 1950-2012 (Version 1.0). Dataset, Stanford University. Original data and codebook available at http://cddrl.fsi.stanford.edu/research/autocracies_of_the_world_dataset/.

Marshall, Monty G., Ted Robert Gurr, and Keith Jaggers. 2012. "Polity IV: Political Regime Characteristics and Transitions, 1800-2012." Updated to 2015. Original data available from http://www.systemicpeace.org/polity/polity4.htm.

Marquez, Xavier. 2016. "A Quick Method for Extending the Unified Democracy Scores" (March 23, 2016). Available at SSRN: http://ssrn.com/abstract=2753830

Moon, Bruce E., Jennifer Harvey Birdsall, Sylvia Ceisluk, Lauren M. Garlett, Joshua J. Hermias, Elizabeth Mendenhall, Patrick D. Schmid, and Wai Hong Wong (2006) "Voting Counts: Participation in the Measurement of Democracy" Studies in Comparative International Development 42, 2 (Summer, 2006). DOI:10.1007/BF02686309. The complete dataset is available here: http://www.lehigh.edu/~bm05/democracy/Obtain_data.htm.

Munck, Gerardo L. 2009. Measuring Democracy: A Bridge Between Scholarship and Politics. Baltimore: Johns Hopkins University Press.

Pemstein, Daniel, Stephen Meserve, and James Melton. 2010. Democratic Compromise: A Latent Variable Analysis of Ten Measures of Regime Type. Political Analysis 18 (4): 426-449. DOI:10.1093/pan/mpq020. The UD scores are available here: http://www.unified-democracy-scores.org/

Pemstein, Daniel, Stephen A. Meserve, and James Melton. 2013. "Replication data for: Democratic Compromise: A Latent Variable Analysis of Ten Measures of Regime Type." In: Harvard Dataverse. http://hdl.handle.net/1902.1/PMM

Reich, G. 2002. Categorizing Political Regimes: New Data for Old Problems. Democratization 9 (4): 1-24. DOI:10.1080/714000289. http://www.tandfonline.com/doi/pdf/10.1080/714000289.

Skaaning, Svend-Erik, John Gerring, and Henrikas Bartusevičius. 2015. A Lexical Index of Electoral Democracy. Comparative Political Studies 48 (12): 1491-1525. DOI:10.1177/0010414015581050. Original data available from http://thedata.harvard.edu/dvn/dv/skaaning.

Svolik, Milan. 2012. The Politics of Authoritarian Rule. Cambridge and New York: Cambridge University Press. Original data available from http://campuspress.yale.edu/svolik/the-politics-of-authoritarian-rule/.

Taylor, Sean J. and Ulfelder, Jay, A Measurement Error Model of Dichotomous Democracy Status (May 20, 2015). Available at SSRN: http://ssrn.com/abstract=2726962

The Economist Intelligence Unit. 2018. Democracy Index 2017: Free Speech under Attack.

Ulfelder, Jay. 2012. "Democracy/Autocracy Data Set." In: Harvard Dataverse. http://hdl.handle.net/1902.1/18836.

Vanhanen, Tatu. 2012. "FSD1289 Measures of Democracy 1810-2012." Original data available from http://www.fsd.uta.fi/english/data/catalogue/FSD1289/meF1289e.html

Wahman, Michael, Jan Teorell, and Axel Hadenius. 2013. Authoritarian regime types revisited: updated data in comparative perspective. Contemporary Politics 19 (1): 19-34. DOI:10.1080/13569775.2013.773200



xmarquez/QuickUDS documentation built on May 4, 2019, 1:24 p.m.