download_wid: Download data from WID.world

In WIDworld/wid-r-tool: Download Data from WID.world

View source: R/download-wid.R

download_wid

R Documentation

Download data from WID.world

Description

Downloads data from the World Inequality Database (WID.world, formerly the World Wealth and Income Database; http://WID.world) into a data.frame. Type vignette("wid-demo") for a detailed presentation.

Usage

download_wid(
  indicators = "all",
  areas = "all",
  years = "all",
  perc = "all",
  ages = "all",
  pop = "all",
  metadata = FALSE,
  include_extrapolations = TRUE,
  verbose = FALSE
)

Arguments

`indicators`	List of six-letter strings, or `"all"`: code names of the indicators in the database. Default is `"all"` for all indicators. See 'Details' for more.
`areas`	List of strings, or `"all"`: area code names of the database. `"XX"` for countries/regions, `"XX-YY"` for subregions. Default is `"all"` for all areas. See 'Details' for more.
`years`	Numerical vector, or `"all"`: years to retrieve. Default is `"all"` for all years.
`perc`	List of strings, or `"all"`: percentiles take the form `"pXX"` or `"pXXpYY"`. Default is `"all"` for all percentiles. See 'Details' for more.
`ages`	Numerical vector, or `"all"`: age category codes in the database. 999 for all ages, 992 for adults. Default is `"all"` for all age categories. See 'Details' for more.
`pop`	List of characters, or `"all"`: type of population. `"t"` for tax units, `"i"` for individuals. Default is `"all"` for all population types. See 'Details' for more.
`metadata`	Should the function fetch metadata too (i.e. variable descriptions, sources, methodological notes, etc.)? Default is `FALSE`.
`include_extrapolations`	Should the function return estimates that are the results of extrapolations and interpolations based on limited data? Default is `TRUE`.
`verbose`	Should the function indicate the progress of the request? Default is `FALSE`.

Details

Although all arguments default to "all", you cannot download the entire database by typing download_wid(). The command requires you to specify either some indicators or some areas. To download the entire database, please visit https://wid.world/data/ and choose "download full dataset".

If there is no data matching your selection on WID.world (for example because you specified an indicator or an area that doesn't exist), the command will return NULL with a warning.
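As a minimal sketch of the two points above (a request needs at least some indicators or some areas, and an empty match returns NULL), the snippet below wraps the NULL check in a small helper. The helper name has_data is hypothetical and not part of the package, and the snippet assumes the package is installed under the name wid. The download itself only runs in an interactive session.

```r
# Hypothetical helper (not part of the package): TRUE when a result
# is a non-NULL data.frame with at least one row.
has_data <- function(x) {
  !is.null(x) && is.data.frame(x) && nrow(x) > 0
}

# Guarded so that nothing is downloaded in non-interactive runs.
# Assumes the package is installed under the name "wid".
if (interactive() && requireNamespace("wid", quietly = TRUE)) {
  # Top 1% share of pre-tax national income for France, 2000 to 2010.
  top1 <- wid::download_wid(
    indicators = "sptinc",
    areas      = "FR",
    years      = 2000:2010,
    perc       = "p99p100"
  )
  if (has_data(top1)) {
    print(head(top1))
  } else {
    message("No data matched the request.")
  }
}
```

Checking the result before using it avoids errors further down a script when an indicator or area code is mistyped.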

All monetary amounts for countries and country subregions are in constant local currency of the reference year (i.e. the previous year, the database being updated every year around July). Monetary amounts for world regions are in EUR PPP of the reference year. You can access the price index using the indicator inyixx, the PPP exchange rates using xlcusp (USD), xlceup (EUR), xlcyup (CNY), and the market exchange rates using xlcusx (USD), xlceux (EUR), xlcyux (CNY). To check the current reference year, you can look at when the price index is equal to 1.
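The conversion described above can be sketched as follows. The numbers are toy values chosen for illustration, not actual WID.world figures, and the area codes AA and BB are placeholders. The sketch assumes that the PPP exchange rate (xlcusp) is expressed as local currency units per USD, so that amounts are divided by it; check the codes dictionary before relying on this.

```r
# Toy values for illustration only; these are NOT actual WID.world figures.
income <- data.frame(
  country = c("AA", "BB"),
  year    = c(2020, 2020),
  value   = c(30000, 120000)   # amounts in constant local currency
)
ppp <- data.frame(
  country = c("AA", "BB"),
  year    = c(2020, 2020),
  ppp     = c(0.75, 4)         # stand-in for xlcusp (assumed: local currency per USD)
)

# Match each amount with the rate for the same country and year,
# then divide to express it in PPP USD.
converted <- merge(income, ppp, by = c("country", "year"))
converted$value_usd_ppp <- converted$value / converted$ppp
```

With real data, both data frames would come from download_wid, one call for the income indicator and one for xlcusp.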

Shares and wealth/income ratios are given as a fraction of 1. That is, a top 1% share of 20% is given as 0.2. A wealth/income ratio of 300% is given as 3.

The arguments of the command follow a nomenclature specific to WID.world. We provide more details with a few examples below. For the complete up-to-date documentation of the structure of the database, please visit https://wid.world/codes-dictionary.

Indicators

The argument indicators is a vector of 6-letter codes, each of which corresponds to a given series type for a given income or wealth concept. The first letter corresponds to the type of series. Some of the most common possibilities include:

one-letter code		description
`a`		average
`s`		share
`t`		threshold
`m`		macroeconomic total
`w`		wealth/income ratio

The next five letters correspond to a concept (usually of income or wealth). Some of the most common possibilities include:

five-letter code		description
`ptinc`		pre-tax national income
`pllin`		pre-tax labor income
`pkkin`		pre-tax capital income
`fiinc`		fiscal income
`hweal`		net personal wealth

For example, sfiinc corresponds to the share of fiscal income, and ahweal corresponds to average net personal wealth. If you don't specify any indicator, it defaults to "all" and downloads all available indicators.
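Because an indicator code is just a series-type letter followed by a concept code, a vector of indicators can be assembled programmatically. The following is a small sketch using base R only; the variable names are illustrative.

```r
# Series types and concepts taken from the two tables above.
series_types <- c(average = "a", share = "s", threshold = "t")
concepts     <- c("ptinc", "hweal")

# Every type/concept combination, as six-letter codes.
indicators <- as.vector(outer(series_types, concepts, paste0))
# "aptinc" "sptinc" "tptinc" "ahweal" "shweal" "thweal"

# Each code must be exactly six characters long.
stopifnot(all(nchar(indicators) == 6))
```

The resulting vector can be passed as the indicators argument. Not every combination necessarily exists in the database for every area.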

Area codes

All data in WID.world is associated with a given area, which can be a country, a region within a country, an aggregation of countries (e.g. a continent), or even the whole world. The argument areas is a vector of codes that specify the areas for which to retrieve data. Countries and world regions are coded using 2-letter ISO codes. Country subregions are coded as XX-YY where XX is the country 2-letter code. If you don't specify any area, it defaults to "all" and downloads data for all available areas.

Years

All data in WID.world corresponds to a year. Some series go as far back as the 1800s. The argument years is a vector of integers that specifies the years to retrieve. If you don't specify any year, it defaults to "all" and downloads data for all available years.

Percentiles

The key feature of WID.world is that it provides data on the whole distribution, not just totals and averages. The argument perc is a vector of strings that indicate for which part of the distribution the data should be retrieved. For share and average variables, percentiles correspond to percentile ranges and take the form pXXpYY. For example, the top 1% share corresponds to p99p100. The top 10% share excluding the top 1% is p90p99. Thresholds associated with the percentile group pXXpYY correspond to the minimal income or wealth level that gets you into the group. For example, the threshold of the percentile group p90p100 or p90p91 corresponds to the 90% quantile. Variables with no distributional meaning use the percentile p0p100. If you don't specify any percentile, it defaults to "all" and downloads data for all available parts of the distribution.
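The pXXpYY naming scheme lends itself to a small helper for building codes from numeric bounds. The function perc_code below is hypothetical and not part of the package; it is only a sketch of how such codes could be generated.

```r
# Hypothetical helper: build "pXXpYY" codes from lower and upper percentile bounds.
perc_code <- function(lower, upper) {
  stopifnot(all(lower >= 0), all(upper <= 100), all(lower < upper))
  paste0("p", lower, "p", upper)
}

top1    <- perc_code(99, 100)   # "p99p100": the top 1%
next9   <- perc_code(90, 99)    # "p90p99": the top 10% excluding the top 1%
deciles <- perc_code(seq(0, 90, by = 10), seq(10, 100, by = 10))
```

Here deciles holds ten codes, from p0p10 to p90p100, which can be passed as the perc argument.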

Age groups

Data may only concern the population in a certain age group. The argument ages is a vector of age codes that specify which age categories to retrieve. Ages are coded using 3-digit codes. Some of the most common possibilities include:

three-digit code		description
`999`		all ages
`992`		adults, including elderly (20+)
`996`		adults, excluding elderly (20-65)

If you don't specify any age, it defaults to "all" and downloads data for all available age groups.

Population types

The data in WID.world can refer to different types of population (i.e. different statistical units). The argument pop is a vector of population codes. They are coded using one-letter codes. Some of the most common possibilities include:

one-letter code		description
`i`		individuals
`t`		tax units
`j`		equal-split adults (i.e. income or wealth divided equally among spouses)

If you don't specify any code, it defaults to "all" and downloads data for all types of population.

Extrapolations/interpolations

Some of the data on WID.world is the result of interpolations (when data is only available for a few years) or extrapolations (when data is not available for the most recent years) that are based on much more limited information than other data points. We include these interpolations/extrapolations by default as a convenience, and also because these values are used to perform regional aggregations. Yet we stress that these estimates, especially at the level of individual countries, can be fragile.

For many purposes, it can be preferable to exclude these data points. For that, use the option include_extrapolations = FALSE.

Value

A data.frame with the following columns:

country: The country or area code.
variable: The variable name, which combines the indicator, the age code and the population code.
percentile: The part of the distribution the value relates to.
year: The year the value relates to.
value: The value of the indicator.

If you specify metadata = TRUE, the data.frame also has the following columns:

countryname: The full name of the country/region.
shortname: A short version of the variable's full name in plain English.
shortdes: A description of the type of series.
pop: The population type, in plain English.
age: The age group, in plain English.
source: The source for the data.
method: Methodological notes, if any.
imputation: Type of estimate (when applicable). The imputation field is a short qualitative description of the type of estimate provided, which is strongly related to data quality. For technical details, see the method field and papers cited in source.
quality: Data quality (when applicable). The quality field is a score from 0 to 5 indicating the quality of the data.
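Since the variable column combines the indicator, the age code and the population code, it can be useful to split it back into its parts. The sketch below assumes the layout is a 6-letter indicator, followed by a 3-digit age code, followed by a 1-letter population code (for example sptinc992j). The helper split_variable is hypothetical and not part of the package.

```r
# Hypothetical helper: split variable names such as "sptinc992j" into parts.
# Assumed layout: <6-letter indicator><3-digit age code><1-letter population code>.
split_variable <- function(variable) {
  data.frame(
    indicator = substr(variable, 1, 6),
    age       = as.integer(substr(variable, 7, 9)),
    pop       = substr(variable, 10, 10),
    stringsAsFactors = FALSE
  )
}

parts <- split_variable(c("sptinc992j", "ahweal999i"))
```

The resulting columns can be bound to the downloaded data.frame to filter or group by age group and population type.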

Author(s)

Thomas Blanchet

WIDworld/wid-r-tool documentation built on Aug. 27, 2024, 5:10 p.m.
