config_make: Generate config CSV files for new samples


View source: R/config_make.R

Description

Produces the config files "new_mnemonics.csv" and "new_sample_updates.csv" needed for processing new PMA samples.

Usage

config_make(samples, test_with, tts, dds, write = TRUE, open_on_write = FALSE)

Arguments

samples

A character vector of new sample names whose data dictionaries can be loaded with dds_list (see Examples).

test_with

Optional: the number of mnemonics in each sample to use when testing the function. If not specified, all mnemonics in all samples are used.

tts

Optional: A list of translation tables returned by tts_list. If not provided, all translation tables will be loaded automatically.

dds

Optional: A list of data dictionaries returned by dds_list. If not provided, all data dictionaries will be loaded automatically.

write

Logical: defaults to TRUE. If FALSE, the two config files are returned as a list.

open_on_write

Logical: defaults to FALSE. If TRUE, the two config files are opened in Excel.

Details

This function identifies all unique mnemonics associated with the new samples specified by the user. For each mnemonic, it first checks whether the mnemonic appears in any existing PMA data dictionary. If no match is found, the mnemonic is added to a CSV file called "new_mnemonics.csv". If matches are found in older data dictionaries, the function identifies one "base" sample to be referenced in a CSV file called "new_sample_updates.csv" (see Note).
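The split described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the mnemonic names and both input vectors are made up for illustration, and the real function reads them from the data dictionaries.

```r
# Hypothetical inputs (assumptions, not package data):
# mnemonics found in the new samples, and mnemonics already present
# in older data dictionaries.
new_sample_mnemonics <- c("AGE", "SEX", "NEWVAR1", "AGE", "NEWVAR2")
existing_mnemonics <- c("AGE", "SEX", "WEALTH")

# Work with each mnemonic once, in order of first appearance.
unique_mnemonics <- unique(new_sample_mnemonics)

# TRUE where the mnemonic already exists in an older data dictionary.
is_known <- unique_mnemonics %in% existing_mnemonics

# Unmatched mnemonics would go to "new_mnemonics.csv"; matched ones
# would go on to base-sample selection for "new_sample_updates.csv".
split_mnemonics <- list(
  new_mnemonics = unique_mnemonics[!is_known],
  new_sample_updates = unique_mnemonics[is_known]
)
```

Here `split_mnemonics` is only a stand-in for the two outputs; the column layout of the real CSV files is not shown in this page and is not assumed.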

The two CSV files produced by this function will appear in a unique time-stamped folder at "pkg/ipums/pma/admin/config_files".

Note

So-called "base" samples are chosen for "new_sample_updates.csv" as follows: ideally, a base should be the most recent sample from the same country sharing the same unit of analysis as the "new" sample. If this is not possible, the base sample will be the most recent sample from any country containing a matching mnemonic, provided that it shares the same unit of analysis. If a base sample still cannot be found, the function will then look through samples from the same country that have a different unit of analysis. As a last resort, it will select the most recent sample of any kind that shares a matching mnemonic.
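The fallback order above can be sketched as a small helper. This is an illustration under stated assumptions, not the package's code: `pick_base`, the `candidates` data frame, its column names, and the sample, country, and unit values are all hypothetical. It also assumes "most recent" applies at every step, including the third.

```r
# Hypothetical helper: choose a base sample for one mnemonic.
# candidates: data frame with one row per older sample containing the
# mnemonic, with assumed columns sample, country, unit, and year.
pick_base <- function(candidates, country, unit) {
  if (nrow(candidates) == 0) return(NA_character_)
  same_country <- candidates$country == country
  same_unit <- candidates$unit == unit
  # Tiers are tried in order; the first non-empty tier wins.
  tiers <- list(
    same_country & same_unit,    # 1. same country, same unit of analysis
    same_unit,                   # 2. any country, same unit of analysis
    same_country & !same_unit,   # 3. same country, different unit
    rep(TRUE, nrow(candidates))  # 4. any sample with the mnemonic
  )
  for (tier in tiers) {
    pool <- candidates[tier, , drop = FALSE]
    if (nrow(pool) > 0) {
      # Within a tier, take the most recent sample.
      return(pool$sample[which.max(pool$year)])
    }
  }
  NA_character_
}

# Made-up candidates for illustration only.
candidates <- data.frame(
  sample = c("aa2016_x", "bb2017_y", "bb2015_y"),
  country = c("aa", "bb", "bb"),
  unit = c("x", "y", "y"),
  year = c(2016, 2017, 2015),
  stringsAsFactors = FALSE
)

base_same <- pick_base(candidates, country = "aa", unit = "x")
base_unit <- pick_base(candidates, country = "aa", unit = "y")
base_country <- pick_base(candidates, country = "aa", unit = "z")
base_any <- pick_base(candidates, country = "cc", unit = "z")
```

With these made-up rows, the four calls exercise the four tiers in turn.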

When a base sample is identified, the function finds every translation table in which its mnemonic appears (referenced by SVAR). A separate record appears in "new_sample_updates.csv" for each of these translation tables, because a mnemonic may be used for more than one integrated variable.
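The one-record-per-translation-table behavior might look roughly like the sketch below. The structure of `tts` here is an assumption: a named list of data frames, each with an SVAR column. The table names, mnemonics, and output column names are invented and may not match what tts_list returns or what the CSV contains.

```r
# Hypothetical translation tables keyed by integrated variable name.
tts <- list(
  AGE = data.frame(SVAR = c("AGE", "FQAGE"), stringsAsFactors = FALSE),
  AGEGROUP = data.frame(SVAR = "AGE", stringsAsFactors = FALSE),
  SEX = data.frame(SVAR = "SEX", stringsAsFactors = FALSE)
)

mnemonic <- "AGE"

# Find every translation table whose SVAR column mentions the mnemonic.
has_mnemonic <- vapply(
  tts,
  function(tt) mnemonic %in% tt$SVAR,
  logical(1)
)
hits <- names(tts)[has_mnemonic]

# One record per matching translation table (assumed column names).
updates <- data.frame(
  mnemonic = rep(mnemonic, length(hits)),
  translation_table = hits,
  stringsAsFactors = FALSE
)
```

In this example the mnemonic appears in two tables, so `updates` has two rows.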

Author(s)

Matt Gunther

Examples

## Not run: 
# Normal usage:
config_make(
  samples = c("bf2017a_nh", "bf2018a_nh")
)

# Testing with the first 25 unique mnemonics:
config_make(
  samples = c("bf2017a_nh", "bf2018a_nh"),
  test_with = 25
)

# Save time if using repeatedly:
TT <- tts_list()
DD <- dds_list()
config_make(
  samples = c("bf2017a_nh", "bf2018a_nh"),
  test_with = 25,
  tts = TT,
  dds = DD
)

## End(Not run)

mgunther87/ipumsPMA documentation built on Aug. 1, 2020, 12:22 a.m.