extended_uds: Extended UDS

A dataset extending the Unified Democracy Scores (UDS) of Pemstein, Meserve, and Melton (2010) to the 19th century (and sometimes before), updating them with 2013-2016 data, and calculating scores for countries not in the official UDS release. Please cite both Pemstein, Meserve, and Melton (2010) and Marquez (2016).




An object of class tbl_df (inherits from tbl, data.frame) with 27133 rows and 19 columns.


This dataset contains an extension of the Unified Democracy Scores of Pemstein, Meserve, and Melton (2010). PMM 2010 use a latent variable approach to combine diverse measurements of democracy for a broad panel of countries from 1946 to 2012. This dataset extends the scores by using more democracy measures with broader temporal and spatial coverage. It generates democracy scores for 26,258 country-years (248 distinct countries and territories), including 18,345 country-years for independent, sovereign countries from 1816 to 2016.

The measurement of democracy is complicated and controversial. The extended UDS makes few judgments about which measures of democracy should be used to generate latent democracy scores, as long as they have been used in scholarly work. It uses dichotomous, trichotomous, ordinal, and continuous indices; indices that focus primarily on the "competition" dimension of democracy and indices that focus on the "participation" dimension; "thick" indices that attempt to measure a wide variety of characteristics plausibly attributed to democracy; and "minimalist" indices that restrict themselves to the bare minimum of competition. Nevertheless, most of these indices agree that democracy has something to do with competition and participation, even if they weight these dimensions somewhat differently, and even if they include other things, such as civil rights.

The latent variable score is calculated using the following variables in the democracy dataset: "pmm_arat", "blm", "bmr_democracy", "bnr_extended", "pmm_bollen", "doorenspleet", "wgi_democracy", "fh_total_reversed", "fh_electoral", "gwf_democracy_extended", "pmm_hadenius", "kailitz_tri", "lexical_index", "mainwaring", "magaloni_democracy_extended", "pmm_munck", "pacl", "PEPS1v", "pitf", "polity2", "reign_democracy", "polyarchy_original_contestation", "prc", "svolik_democracy", "ulfelder_democracy_extended", "utip_dichotomous_strict", "v2x_polyarchy", "vanhanen_democratization", "wth_democ1". Please refer to the documentation of democracy for more details on these measures of democracy, and to Marquez 2016 for more details on the construction of the extended latent variable model used to generate the scores.

Standard descriptive variables (generated by this package)


The name of the country in the Gleditsch-Ward system of states, or the official name of the entity (for non-sovereign entities and states not in the Gleditsch and Ward system of states) or else a common name for disputed cases that do not have an official name (e.g., Western Sahara, Hyderabad). The Gleditsch and Ward scheme sometimes indicates the common name of the country and (in parentheses) the name of an earlier incarnation of the state: thus, they have Germany (Prussia), Russia (Soviet Union), Madagascar (Malagasy), etc. For details, see Gleditsch, Kristian S. & Michael D. Ward. 1999. "Interstate System Membership: A Revised List of the Independent States since 1816." International Interactions 25: 393-413. The list can be found at http://privatewww.essex.ac.uk/~ksg/statelist.html.


Gleditsch and Ward's numeric country code, from the Gleditsch and Ward list of independent states.


The Correlates of War numeric country code, 2016 version. This differs from Gleditsch and Ward's numeric country code in a few cases. See http://www.correlatesofwar.org/data-sets/state-system-membership for the full list.


The Polity IV country code, 2017 version. This differs from Gleditsch and Ward's numeric country code and COW in a few cases. See http://www.systemicpeace.org/inscrdata.html for the full list.


Whether the state is "in system" (that is, is independent and sovereign), according to Gleditsch and Ward, for this particular date. Matches at the end of the year; so, for example, South Vietnam 1975 is FALSE because, according to Gleditsch and Ward, the country ended in April 1975 (being absorbed by North Vietnam). It is also TRUE for dates beyond 2012 for countries that did not end by then, despite the fact that the Gleditsch and Ward list has not been updated since.


The calendar year. Democracy measurements conventionally reflect the situation in the country as of the last day of the year.
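
The in-system flag and the calendar year together pick out the independent, sovereign country-years. The following is a minimal sketch in R on a toy data frame, not on the dataset itself. The column names `country`, `year`, and `in_system` are placeholders chosen for illustration and may not match the names used in the dataset, and the rows are made-up examples that follow the end-of-year matching rule described above.

```r
# Toy stand-in for the dataset. Column names and rows are illustrative
# assumptions, not values copied from the real data.
toy <- data.frame(
  country   = c("South Vietnam", "South Vietnam", "Germany (Prussia)"),
  year      = c(1974L, 1975L, 2016L),
  in_system = c(TRUE, FALSE, TRUE),  # 1975 is FALSE: the state ended before year end
  stringsAsFactors = FALSE
)

# Keep only country-years for independent, sovereign states in 1816-2016.
sovereign <- subset(toy, in_system & year >= 1816 & year <= 2016)

print(sovereign)
```

Filtering on the flag rather than on a list of country names keeps the end-of-year convention intact: a state that ended partway through a year drops out for that year.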

Latent variable scores


Raw latent variable score for the model, with mean (approximately) 0 and sd (approximately) 1.


Standard errors of raw score.

z1_pct975, z1_pct025

95 percent confidence intervals of the raw score.


Latent variable score for the model, adjusted to match the cutpoint for the dichotomous indexes included in the model.

z1_adj_pct975, z1_adj_pct025

95 percent confidence intervals of the adjusted score.

z1_as_prob, z1_adj_as_prob

Latent variable scores for the model, unadjusted and adjusted, converted to a 0-1 probability index (0 = lowest probability of democracy, 1 = highest probability).

z1_pct975_as_prob, z1_pct025_as_prob, z1_adj_pct975_as_prob, z1_adj_pct025_as_prob

95 percent confidence intervals of the probability scores.
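
To make the relationship between the score, its standard error, the interval bounds, and the probability index concrete, here is a rough sketch in R. It rests on two assumptions that this documentation does not confirm: that the intervals are symmetric normal intervals around the score, and that the probability index is the standard normal CDF of the score. The variable names `z1` and `se_z1` and all numeric values are illustrative. See Marquez 2016 for how the scores were actually constructed.

```r
# Made-up raw scores and standard errors for three country-years.
z1    <- c(-1.2, 0, 0.8)
se_z1 <- c(0.30, 0.25, 0.20)

# Assumption: 95 percent bounds are score +/- qnorm(0.975) * standard error.
crit      <- qnorm(0.975)
z1_pct025 <- z1 - crit * se_z1
z1_pct975 <- z1 + crit * se_z1

# Assumption: the 0-1 probability index is the standard normal CDF of the score.
z1_as_prob        <- pnorm(z1)
z1_pct025_as_prob <- pnorm(z1_pct025)
z1_pct975_as_prob <- pnorm(z1_pct975)

print(data.frame(z1, z1_pct025, z1_pct975,
                 z1_as_prob, z1_pct025_as_prob, z1_pct975_as_prob))
```

Because the normal CDF is monotone, transforming the interval bounds directly preserves their ordering, so the lower bound on the probability scale is always the transformed lower bound on the raw scale.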


Xavier Marquez, Political Science and International Relations Programme, Victoria University of Wellington, [email protected]


Marquez, Xavier. 2016. A Quick Method for Extending the Unified Democracy Scores (March 23, 2016). Available at SSRN: http://ssrn.com/abstract=2753830

Pemstein, Daniel, Stephen Meserve, and James Melton. 2010. Democratic Compromise: A Latent Variable Analysis of Ten Measures of Regime Type. Political Analysis 18 (4): 426-449. DOI:10.1093/pan/mpq020

