wb_data: Download Data from the World Bank API


View source: R/wb_data.R

Description

This function downloads the requested information using the World Bank API.

Usage

wb_data(
  indicator,
  country = "countries_only",
  start_date,
  end_date,
  return_wide = TRUE,
  mrv,
  mrnev,
  cache,
  freq,
  gapfill = FALSE,
  date_as_class_date = FALSE,
  lang
)

Arguments

indicator

Character vector of indicator codes. These codes correspond to the indicator_id column from the indicators tibble of wb_cache(), wb_cachelist, or the result of running wb_indicators() directly.

country

Character vector of country, region, or special value codes for the locations you want to return data for. Permissible values can be found in the countries tibble in wb_cachelist or by running wb_countries() directly. Specifically, these are the values listed in the fields iso3c, iso2c, country, region, admin_region, income_level, and all of the region_*, admin_region_*, and income_level_* columns. The following special values are also accepted:

  • "countries_only" (Default)

  • "regions_only"

  • "admin_regions_only"

  • "income_levels_only"

  • "aggregates_only"

  • "all"

start_date

Numeric or character. If numeric, it must be in %Y form (i.e., a four-digit year). For subannual data the API supports the following formats: "2016M01" for monthly data and "2016Q1" for quarterly data. This also accepts a special value of "YTD", useful for more frequently updated subannual indicators.

end_date

Numeric or character. If numeric, it must be in %Y form (i.e., a four-digit year). For subannual data the API supports the following formats: "2016M01" for monthly data and "2016Q1" for quarterly data.

return_wide

Logical. If TRUE, data is returned in wide format instead of long, with a column named for each indicator_id; if the indicator argument is a named vector, the names() given to the indicator are used as the column names. To accommodate this transformation, the indicator column that provides the human-readable description is dropped, but it is provided as a column label. Default is TRUE.

mrv

Numeric. The number of Most Recent Values to return. A replacement for start_date and end_date, this number represents the number of observations you wish to return, starting from the most recent date of collection. This may include missing values. Useful in conjunction with freq.

mrnev

Numeric. The number of Most Recent Non Empty Values to return. A replacement for start_date and end_date, similar in behavior to mrv but excluding locations with missing values. Useful in conjunction with freq.

cache

List of tibbles returned from wb_cache(). If omitted, wb_cachelist is used.

freq

Character string. For fetching quarterly ("Q"), monthly ("M"), or yearly ("Y") values. Useful for querying high-frequency data.

gapfill

Logical. If TRUE, fills in missing values by carrying forward the last available value until the next available period (the maximum number of periods backtracked is limited by the mrv number). Default is FALSE.

date_as_class_date

Logical. If TRUE, the date field is returned as class Date, useful when working with non-annual data or data at mixed resolutions. Default is FALSE.

lang

Language in which to return the results. If lang is unspecified, English is the default. For supported languages see wb_languages(). Possible values of lang are in the iso2 column. Note that not all data returns support languages other than English. If a specific return does not support your requested language, it will return NA by default.
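The subannual formats described for start_date and end_date ("2016M01", "2016Q1") can be built from ordinary dates. The sketch below uses only base R; to_wb_period() is a hypothetical helper written for illustration and is not part of wbstats.

```r
# Hypothetical helper (not part of wbstats): format a date as a period
# string in the yearly, quarterly, or monthly form described above.
to_wb_period <- function(date, freq = c("Y", "Q", "M")) {
  freq <- match.arg(freq)
  date <- as.Date(date)
  year <- as.integer(format(date, "%Y"))
  month <- as.integer(format(date, "%m"))
  switch(freq,
    Y = sprintf("%d", year),                               # e.g. "2016"
    Q = sprintf("%dQ%d", year, (month - 1L) %/% 3L + 1L),  # e.g. "2016Q1"
    M = sprintf("%dM%02d", year, month)                    # e.g. "2016M01"
  )
}

monthly_start <- to_wb_period("2016-01-15", "M")
quarterly_start <- to_wb_period("2016-02-15", "Q")
```

The resulting strings could then be passed as start_date or end_date, presumably together with a matching freq value, for an indicator that is published at that frequency.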

Details

obs_status column

Indicates the observation status for a given location, indicator, and date combination. For example, "F" in the response indicates that the observation status for that data point is "forecast".
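One possible use of obs_status is to drop forecast rows before analysis. The sketch below runs on a toy data frame standing in for a wb_data() result; the location code and values are placeholders, not real data, and the only column name taken from this page is obs_status.

```r
# Toy stand-in for a wb_data() result. All values are placeholders.
df <- data.frame(
  iso3c = c("AAA", "AAA", "AAA"),
  date = c(2019, 2020, 2021),
  value = c(1, 2, 3),
  obs_status = c(NA, NA, "F"),
  stringsAsFactors = FALSE
)

# A row is a forecast when obs_status is present and equal to "F".
is_forecast <- !is.na(df$obs_status) & df$obs_status == "F"

# Keep only the rows that are not flagged as forecasts.
df_observed <- df[!is_forecast, ]
```

Checking for NA first keeps rows with no status flag from being dropped by the comparison.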

Value

A tibble of all available requested data.

Examples

# gdp for all countries for all available dates
df_gdp <- wb_data("NY.GDP.MKTP.CD")

# Brazilian gdp for all available dates
df_brazil <- wb_data("NY.GDP.MKTP.CD", country = "br")

# Brazilian gdp for 2006

df_brazil_1 <- wb_data("NY.GDP.MKTP.CD", country = "brazil", start_date = 2006)


# Brazilian gdp for 2006-2010

df_brazil_2 <- wb_data("NY.GDP.MKTP.CD", country = "BRA",
                       start_date = 2006, end_date = 2010)


# Population, GDP, Unemployment Rate, Birth Rate (per 1000 people)

my_indicators <- c("SP.POP.TOTL",
                   "NY.GDP.MKTP.CD",
                   "SL.UEM.TOTL.ZS",
                   "SP.DYN.CBRT.IN")


df <- wb_data(my_indicators)

# you can pass multiple country ids of different types
# Albania (iso2c), Georgia (iso3c), and Mongolia

my_countries <- c("AL", "Geo", "mongolia")
df <- wb_data(my_indicators, country = my_countries,
              start_date = 2005, end_date = 2007)


# same data as above, but in long format

df_long <- wb_data(my_indicators, country = my_countries,
                   start_date = 2005, end_date = 2007,
                   return_wide = FALSE)


# regional population totals
# regions correspond to the region column in wb_cachelist$countries

df_region <- wb_data("SP.POP.TOTL", country = "regions_only",
                     start_date = 2010, end_date = 2014)


# a specific region

df_world <- wb_data("SP.POP.TOTL", country = "world",
                    start_date = 2010, end_date = 2014)


# if the indicator is part of a named vector the name will be the column name
my_indicators <- c("pop" = "SP.POP.TOTL",
                   "gdp" = "NY.GDP.MKTP.CD",
                   "unemployment_rate" = "SL.UEM.TOTL.ZS",
                   "birth_rate" = "SP.DYN.CBRT.IN")

df_names <- wb_data(my_indicators, country = "world",
                    start_date = 2010, end_date = 2014)


# custom names are ignored if returning in long format

df_names_long <- wb_data(my_indicators, country = "world",
                         start_date = 2010, end_date = 2014,
                         return_wide = FALSE)


# same as above but in Bulgarian
# note that not all indicators have translations for all languages

df_names_long_bg <- wb_data(my_indicators, country = "world",
                            start_date = 2010, end_date = 2014,
                            return_wide = FALSE, lang = "bg")
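The carry-forward behavior described for gapfill can be pictured offline with a small base R sketch. This is only an illustration of the idea, not the World Bank API's or wbstats' implementation; carry_forward() and its max_back argument are hypothetical names, with max_back loosely playing the role of the limit that mrv imposes.

```r
# Hypothetical helper (not part of wbstats): fill each NA with the most
# recent observed value, looking back at most `max_back` positions.
carry_forward <- function(x, max_back = Inf) {
  last_idx <- NA_integer_  # position of the last originally observed value
  for (i in seq_along(x)) {
    if (!is.na(x[i])) {
      last_idx <- i
    } else if (!is.na(last_idx) && (i - last_idx) <= max_back) {
      x[i] <- x[last_idx]
    }
  }
  x
}

values <- c(10, NA, NA, 13, NA)
filled <- carry_forward(values)                 # every gap is filled
limited <- carry_forward(values, max_back = 1)  # only one step back allowed
```

With the one-step limit, the third element stays NA because the last observed value is two positions back.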

wbstats documentation built on Jan. 13, 2021, 12:19 p.m.