msf_dict: MSF data dictionaries and dummy datasets




Description

These functions produce MSF OCA dictionaries based on DHIS2 (for outbreaks) and Kobo (for surveys) data sets. Each dictionary defines the data element name, code, short name, and type, along with the key/value pairs for translating the codes into a human-readable format.


Usage

msf_dict(
  disease,
  name = "MSF-outbreak-dict.xlsx",
  tibble = TRUE,
  compact = TRUE,
  long = TRUE
)

msf_dict_survey(
  disease,
  name = "MSF-survey-dict.xlsx",
  tibble = TRUE,
  compact = TRUE,
  long = TRUE,
  template = TRUE
)



Arguments

disease: Specify which disease you would like to use.

  • msf_dict() supports "AJS", "Cholera", "Measles", "Meningitis"

  • msf_dict_survey() supports "Mortality", "Nutrition", "Vaccination_long" and "Vaccination_short" (only used in surveys if template = TRUE)


name: The name of the dictionary stored in the package.

  • msf_dict_survey() supports Kobo dictionaries not stored within this package. To use one, specify name as the path to the .xlsx file and set template = FALSE


tibble: Return the data dictionary as a tidyverse tibble (default is TRUE).


compact: If TRUE (default), a nested data frame is returned in which each row represents a single variable, with a nested data frame column called "options" that can be expanded with tidyr::unnest(). This only works if long = TRUE.
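
To illustrate the nested layout described above, here is a minimal sketch that does not call msf_dict() at all. It builds a small toy dictionary by hand and expands it with tidyr::unnest(). The column names are borrowed from the example on this page; the variables and values ("sex", "fever", and their codes) are hypothetical and are not taken from any MSF dictionary.

```r
library(tibble)
library(tidyr)

# Hypothetical two-variable dictionary in a compact (nested) layout:
# one row per variable, with its options stored in a list-column.
compact_dict <- tibble(
  data_element_shortname = c("sex", "fever"),
  options = list(
    tibble(
      option_code = c("M", "F"),
      option_name = c("Male", "Female"),
      option_order_in_set = 1:2
    ),
    tibble(
      option_code = c("0", "1"),
      option_name = c("No", "Yes"),
      option_order_in_set = 1:2
    )
  )
)

# Expanding the list-column yields one row per variable/option pair.
long_dict <- unnest(compact_dict, cols = options)
```

With the package installed, the same unnest() call would presumably be applied to the result of msf_dict() with compact = TRUE, but the exact columns of the real dictionaries may differ from this toy.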


long: If TRUE (default), the returned data dictionary is in long format, with each option getting one row. If FALSE, two data frames are returned, one with the variables and the other with the content options.
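
To picture how a long dictionary relates to a pair of variable and option tables, here is a toy sketch in base R. It does not call msf_dict(), and the actual shape and names of the two data frames returned when long = FALSE may differ. The column data_element_name and all values are hypothetical, chosen only for illustration.

```r
# Hypothetical long-format dictionary: one row per variable/option pair.
long_dict <- data.frame(
  data_element_shortname = c("sex", "sex", "fever", "fever"),
  data_element_name      = c("Sex", "Sex", "Fever", "Fever"),
  option_code            = c("M", "F", "0", "1"),
  option_name            = c("Male", "Female", "No", "Yes"),
  stringsAsFactors       = FALSE
)

# Variables table: one row per data element.
variables <- unique(
  long_dict[, c("data_element_shortname", "data_element_name")]
)

# Options table: one row per code/label pair, keyed by the short name.
options_tbl <- long_dict[, c("data_element_shortname", "option_code", "option_name")]

# Joining the two tables on the shared key recovers the long layout.
rebuilt <- merge(variables, options_tbl, by = "data_element_shortname")
```

The long layout repeats the variable-level columns on every option row, which is convenient for lookups; the two-table layout stores each variable once.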

template: Only used for msf_dict_survey(). If TRUE (default), the returned data dictionary is a generic dictionary based on the MSF OCA ERB pre-approved template. If your dictionary differs substantially from the template, you can read in your own Kobo dictionary instead by setting template = FALSE and defining its path in name.

See Also

matchmaker::match_df(), gen_data(), msf_dict_survey()


Examples

# msf_dict() and gen_data() come from the epidict package
library(epidict)

if (require("dplyr") && require("matchmaker")) {
    # You will often want to use MSF dictionaries to translate codes to human-
    # readable variables. Here, we generate a data set of 20 cases:
    dat <- gen_data(
      dictionary = "Cholera",
      varnames = "data_element_shortname",
      numcases = 20,
      org = "MSF"
    )

    # We want the expanded dictionary, so we will select `compact = FALSE`
    dict <- msf_dict(disease = "Cholera", long = TRUE, compact = FALSE, tibble = TRUE)

    # Now we can use matchmaker to translate the option codes in the data
    # into their human-readable option names:
    dat_clean <- matchmaker::match_df(dat, dict,
      from = "option_code",
      to = "option_name",
      by = "data_element_shortname",
      order = "option_order_in_set"
    )
}

R4EPI/epidict documentation built on Aug. 31, 2022, 5:34 a.m.