knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(ipumsPMA)
library(kableExtra)
inv <- paste0(
  py$project_to_path("pma"),
  "/admin/ODK_files",
  "/ODK_inventory.csv"
) %>%
  read_csv() %>%
  as_tibble()
PMA enumerator documents contain the full text of each questionnaire, plus XML markup tags that are used to identify the text of each question recorded in the DDOC1, DTAG1, JDOC1, and JTAG1 columns of each sample's data dictionary.
We create these documents with the function enum_make, which reads an R object created by another function, odk_get. Using the operator %>% allows us to pass the results of the latter to the former:
odk_get("bf2018a_nh")%>%enum_make()
As a result, a .txt file will be created in the same location as the ODK file referenced by odk_get. Now, all that's left for you to do is:
1) Manually save the file as a .doc file
2) In Word, use the IPUMS macro format from tags to make it pretty (and create numbered XML tags for each question)
3) Change the file name as desired
4) Move the file to the enumerator documents folder
ODK files are Excel files containing the programming logic responsible for rendering the survey on devices used by enumerators in the field. We store them in the PMA admin folder, but this folder has sprouted a number of subfolders as the project has grown. We maintain an inventory of the contents of our ODK subfolders at pma/admin/ODK_files/ODK_inventory.csv. It looks like this:
inv %>%
  kable("html") %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    fixed_thead = TRUE
  ) %>%
  scroll_box(height = "500px")
Notice that there is a default Path for every sample, but some samples have additional paths listed in columns to the right. The function odk_get takes an argument, survey, that can specify one of these other paths.
For example, suppose we're working with the sample bf2019a_hh. By default, odk_get returns the ODK file located in the Path column. In this case, we're working with a person-level sample (as opposed to, say, a service delivery point sample), so the default Path points to the household questionnaire:
odk_get(sample = "bf2019a_hh")
We can get the female questionnaire associated with bf2019a_hh by specifying a different column by name (argument names are shown here for readability, but are not required):
odk_get(sample = "bf2019a_hh", survey = "female")
Some samples contain multiple survey rounds, which are stored in the columns r1, r2, and so on. For example, the Ethiopia MNH sample et2016a_mn:
odk_get(sample = "et2016a_mn", survey = "r1")
odk_get(sample = "et2016a_mn", survey = "r2")
New columns can be added to the ODK inventory spreadsheet at any time, and they will become immediately available to odk_get.
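To make the column lookup concrete, here is a minimal sketch of how a sample name and a column name could be resolved to a file path. It is not the implementation of odk_get: the helper lookup_path, the sample column, and every file path in the toy inventory are hypothetical, chosen only to mirror the layout described above (a default Path column, plus optional columns such as female, r1, and r2).

```r
library(dplyr)
library(tibble)

# Toy inventory: the file paths and the `sample` column name are hypothetical
toy_inv <- tribble(
  ~sample,      ~Path,                   ~female,              ~r1,                  ~r2,
  "bf2019a_hh", "bf2019/household.xlsx", "bf2019/female.xlsx", NA_character_,        NA_character_,
  "et2016a_mn", "et2016/baseline.xlsx",  NA_character_,        "et2016/round1.xlsx", "et2016/round2.xlsx"
)

# Hypothetical helper: find the row for one sample, then read the requested column
lookup_path <- function(inventory, sample_name, survey = "Path") {
  # Any column present in the inventory is a valid `survey` value
  stopifnot(survey %in% names(inventory))
  row <- filter(inventory, sample == sample_name)
  stopifnot(nrow(row) == 1)
  path <- row[[survey]]
  if (is.na(path)) {
    stop("No '", survey, "' file listed for ", sample_name)
  }
  path
}

lookup_path(toy_inv, "bf2019a_hh")                     # default Path column
lookup_path(toy_inv, "bf2019a_hh", survey = "female")  # a named column
lookup_path(toy_inv, "et2016a_mn", survey = "r2")      # a survey round
```

Because the helper checks the survey argument against the inventory's column names, rather than against a fixed list, a column added to the spreadsheet needs no code change before it can be used.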
Sometimes, it may be useful to reference ODK files in R for reasons other than creating enumerator documents. For example, on the sheet called survey in each file, there should be a column called relevant that shows code reflecting the universe logic for each question: this is very handy if you're drafting universe statements.
One way to access this information is to use odk_get to open the file in Excel (saving you the trouble of digging through the ODK file folder):
odk_get("bf2018a_hh", open = TRUE)
But if you're going through several ODK files at the same time, dealing with multiple open Excel files can be tedious. Instead, you can use odk_get to reference the information within R. Access the survey sheet with the $ operator:
odk <- odk_get("bf2018a_hh")
surv <- odk$survey
The result is a tibble, which you can query just like any other dataset. Suppose you want to know the universe for the mnemonic handwashing_place_observations: use the function filter to find the row for this variable, and then select the column relevant to see the universe logic:
surv %>%
  filter(name == "handwashing_place_observations") %>%
  select(relevant)
It looks like a respondent only received this question if the prior question, handwashing_place_rw, was answered with observed_fixed or observed_mobile. If you'd like more explanation of what these values mean, you can usually find it on the choices sheet: it's the second tibble returned by odk_get. You'll find a connection between the contents of the type column on the survey sheet and the list_name column on the choices sheet:
surv %>%
  filter(name == "handwashing_place_rw") %>%
  select(type)
The text "select_one" tells us that one choice could have been selected from a list of choices, and the text "handwash_list" refers to the name of a particular list of choices on the choices sheet.
odk$choices %>%
  filter(list_name == "handwash_list") %>%
  select(name, label..English)
This tells us still more about the universe. Instead of writing that handwashing_place_observations was given to "any household where either a fixed or mobile place for handwashing was observed", we can now see that these two options comprise all of the "observed" options. A better choice would be "any household where the interviewer observed a place for handwashing".
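The two-step lookup above (read the type of a question, then filter the choices sheet by the list it names) can be wrapped in one small function. The sketch below runs on toy tibbles rather than a real ODK file: the helper choices_for, the list observations_list, the relevant expression, and all labels are made up for illustration, and it assumes the type column always follows the pattern "select_one <list>" or "select_multiple <list>".

```r
library(dplyr)
library(tibble)

# Toy stand-ins for odk$survey and odk$choices (all contents are illustrative)
toy_survey <- tribble(
  ~type,                                ~name,                            ~relevant,
  "select_one handwash_list",           "handwashing_place_rw",           "",
  "select_multiple observations_list",  "handwashing_place_observations",
  "${handwashing_place_rw} = 'observed_fixed' or ${handwashing_place_rw} = 'observed_mobile'"
)

toy_choices <- tribble(
  ~list_name,          ~name,             ~label,
  "handwash_list",     "observed_fixed",  "Observed, fixed place",
  "handwash_list",     "observed_mobile", "Observed, mobile place",
  "handwash_list",     "not_observed",    "Not observed",
  "observations_list", "soap",            "Soap is present",
  "observations_list", "water",           "Water is present"
)

# Hypothetical helper: return the choice list attached to one question
choices_for <- function(survey, choices, question) {
  # Strip the "select_one " / "select_multiple " prefix to get the list name
  list_id <- survey %>%
    filter(name == question) %>%
    pull(type) %>%
    sub("^select_(one|multiple)\\s+", "", .)
  filter(choices, list_name == list_id)
}

choices_for(toy_survey, toy_choices, "handwashing_place_rw")
```

Seeing the full choice list next to the relevant expression makes it easier to tell whether the listed values cover a whole category of answers, as they do here for the "observed" options.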
odk_get is most powerful when you'd otherwise find yourself working with multiple open Excel files. If you're planning to look for a mnemonic in lots of different samples at once (perhaps looking for the universe logic for each sample), try using the map function to iterate through each of your samples in a single call:
my_samples <- c("bf2017a_nh", "bf2018a_nh", "ke2017a_nh", "ke2018a_nh")
my_odks <- map(my_samples, odk_get) %>%
  set_names(my_samples)
map(my_odks, ~{
  .x$survey %>%
    filter(name == "handwashing_place") %>%
    select(relevant)
})
Here, my_odks is a list containing the results of odk_get for each of my 4 samples. Any one of the items can be referenced by name (as in my_odks$bf2017a_nh) and also sub-referenced (as in my_odks$bf2017a_nh$survey). Instead of doing that, I use map a second time to apply a lambda function to each of the 4 members of my_odks: the lambda function is written inside ~{}, and each list item is passed to it as .x. Otherwise, the filter and select steps work the same as before.
map returns the result of each lambda function in a handy list. Notice that the mnemonic handwashing_place only appears verbatim in 2 samples, but they both have the same universe logic. (FYI: it is possible that the 2018 samples have the same question, but used a slightly different name.)