knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
options(width = 110)

library(fetchdhs)
library(tidyverse)


fetchdhs

The Demographic and Health Surveys (DHS) Program has conducted more than 400 surveys in over 90 countries since 1984. It remains a critical resource in global health research and analytics as it provides nationally representative data on fertility, family planning, maternal and child health, gender, HIV/AIDS, malaria, and nutrition. The objective of this package is to enable R users to plug into the DHS API and retrieve tidy survey data.

Installation

# install.packages("devtools")
devtools::install_github("murphy-xq/fetchdhs")

Basic example

Let's say you quickly need DHS immunization data for India and Nigeria covering 2000 to 2017.

First, use fetch_countries() to look up the 2-letter DHS country codes for India and Nigeria, which the API call requires:

fetch_countries() %>% 
  filter(country_name %in% c("India", "Nigeria"))

Next, make use of the DHS API tags that categorize survey indicators by topic. In this example, we use fetch_tags() to search for immunization-related tags and identify tag 32:

fetch_tags() %>% 
  filter(str_detect(tag_name, "[Ii]mmunization"))

Finally, use fetch_data() to call the DHS API with the parameters just identified. It returns a tidy data frame along with the API call that was made:

fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017)
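
To make the returned API call easier to read, here is a minimal sketch of how such a request might be assembled by hand. It is not the package's implementation: the helper name build_dhs_query() is hypothetical, and the endpoint and parameter names (countryIds, tagIds, surveyYearStart, surveyYearEnd, breakdown, f) are assumptions about the public DHS API that should be checked against its documentation. No request is sent.

```r
# Sketch only: build a DHS API query string with base R (no network access).
# build_dhs_query() is a hypothetical helper, not part of fetchdhs.
build_dhs_query <- function(countries, tag, years, breakdown = "national") {
  # Assumed endpoint of the DHS data API
  base_url <- "https://api.dhsprogram.com/rest/dhs/data"

  # Assumed parameter names; the year range is sent as start and end values
  params <- c(
    countryIds      = paste(countries, collapse = ","),
    tagIds          = as.character(tag),
    surveyYearStart = as.character(min(years)),
    surveyYearEnd   = as.character(max(years)),
    breakdown       = breakdown,
    f               = "json"
  )

  # Join name=value pairs with "&" and append them to the endpoint
  query <- paste0(names(params), "=", params, collapse = "&")
  paste0(base_url, "?", query)
}

query_url <- build_dhs_query(c("IA", "NG"), tag = 32, years = 2000:2017)
print(query_url)
```

Comparing a hand-built string like this with the call that fetch_data() reports can help when debugging an unexpected result.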

For specific indicators, we can peek at a data frame of all available indicators to identify which indicator_id codes to pass to fetch_data(). Let's try pulling only DPT3 and Measles indicators:

fetch_indicators() %>% 
  filter(str_detect(definition, "Measles|DPT3"))

Upon investigating the DPT3 and Measles indicators and their associated attributes, we see that we need to use CH_VACC_C_DP3 and CH_VACC_C_MSL:

fetch_data(countries = c("IA","NG"), indicators = c("CH_VACC_C_DP3", "CH_VACC_C_MSL"), years = 2000:2017)
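
Once the right rows are identified, the codes can be pulled out as a character vector instead of being retyped. The sketch below uses a small stand-in data frame so that it runs without the package or a network connection. The column names indicator_id and definition follow the text above, while the definitions and the third row are placeholders, not real DHS metadata.

```r
# Stand-in for a fetch_indicators() result. The two ids come from this README;
# the definitions and the third row are placeholders for illustration only.
indicators <- data.frame(
  indicator_id = c("CH_VACC_C_DP3", "CH_VACC_C_MSL", "XX_PLACEHOLDER"),
  definition   = c("DPT3 vaccination (placeholder text)",
                   "Measles vaccination (placeholder text)",
                   "Unrelated indicator (placeholder text)"),
  stringsAsFactors = FALSE
)

# Keep rows whose definition mentions either vaccine, then extract the ids
matches <- indicators[grepl("Measles|DPT3", indicators$definition), ]
indicator_ids <- matches$indicator_id
print(indicator_ids)
```

With real data the pattern will likely match more rows than needed, so the filtered table still has to be inspected before passing indicator_ids to the indicators argument.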

Additional features

Level of disaggregation

We have been using the default level of disaggregation, which returns national-level data only. To pull subnational, background characteristic, or all available data, specify breakdown_level in fetch_data():

# national (default)
fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017, breakdown_level = "national")

# subnational
fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017, breakdown_level = "subnational")

# background
fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017, breakdown_level = "background")

# all
fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017, breakdown_level = "all")
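
When breakdown_level comes from user input or a script argument, it can be worth checking it before making a request. The following is a minimal sketch using base R's match.arg(). The helper check_breakdown() is hypothetical, and whether fetch_data() performs a similar check internally is not stated here.

```r
# The four levels listed in this README
breakdown_levels <- c("national", "subnational", "background", "all")

# Hypothetical helper: returns the matched level or stops with an error.
# With no argument it falls back to the first level, "national".
check_breakdown <- function(level = breakdown_levels) {
  match.arg(level, breakdown_levels)
}

print(check_breakdown())               # default level
print(check_breakdown("subnational"))  # valid level is returned unchanged

# An unknown level raises an error, captured here so the script continues
result <- tryCatch(check_breakdown("regional"), error = function(e) "invalid level")
print(result)
```

Note that match.arg() also accepts unambiguous partial matches, so "sub" would resolve to "subnational".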

Return fields

Return fields are the dimensions of survey data that the API can return. set_return_fields() lets the user specify which fields make up the data frame returned from the API:

set_return_fields(c("Indicator", "CountryName", "SurveyYear", "SurveyType", "Value"))

fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017)

Add geospatial data

We can also include polygon coordinates in the result by setting add_geometry = TRUE:

fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017, add_geometry = TRUE)

API Key

Authenticated users can query more records per page: a maximum of 5,000 versus 1,000. See the DHS API documentation for authentication details.

Users can supply their API key with set_api_key(); it is then included in any subsequent fetch_data() calls.
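
To get a feel for what the larger page size means in practice, the arithmetic can be sketched as follows. The record count is a made-up figure chosen for illustration, and pages_needed() is a hypothetical helper, not part of fetchdhs.

```r
# Number of pages required to retrieve n_records at a given page size
pages_needed <- function(n_records, per_page) {
  ceiling(n_records / per_page)
}

n_records <- 12000  # hypothetical result size, for illustration only

pages_without_key <- pages_needed(n_records, per_page = 1000)
pages_with_key    <- pages_needed(n_records, per_page = 5000)

print(pages_without_key)  # 12
print(pages_with_key)     # 3
```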

set_api_key("YOURKEY-GOESHERE")
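
To avoid hard-coding a key in scripts that may be shared, one option is to read it from an environment variable. This is a sketch under assumptions: the variable name DHS_API_KEY and the helper get_dhs_key() are hypothetical and not a fetchdhs convention.

```r
# Hypothetical helper: read the API key from an environment variable.
# The variable name DHS_API_KEY is an assumption, not a fetchdhs convention.
get_dhs_key <- function(var = "DHS_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  key
}

# Usage (not run here, requires fetchdhs):
# set_api_key(get_dhs_key())
```

The variable can be defined in a user-level .Renviron file so that it is available in every R session.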


murphy-xq/fetchdhs documentation built on May 14, 2019, 8:02 a.m.