config_make: Generate config CSV files for new samples
In mgunther87/ipumsPMA: Common functions for IPUMS PMA staff


Produces the config files "new_mnemonics.csv" and "new_sample_updates.csv" needed for processing new PMA samples.

config_make(samples, test_with, tts, dds, write = T, open_on_write = F)

`samples`	A character vector of new samples whose data dictionaries can be loaded with dds_list (see example).
`test_with`	Optional: a number of mnemonics in each sample to use in a test of the function. If not specified, all mnemonics in all samples will be used.
`tts`	Optional: A list of translation tables returned by tts_list. If not provided, all translation tables will be loaded automatically.
`dds`	Optional: A list of data dictionaries returned by dds_list. If not provided, all data dictionaries will be loaded automatically.
`write`	Logical: defaults to TRUE, in which case the two config files are written to disk. If FALSE, the two config files will be returned as a list.
`open_on_write`	Logical: defaults FALSE. If TRUE, the two config files will be opened in Excel.

This function identifies all unique mnemonics associated with the new samples specified by the user. For each mnemonic, it first checks whether the mnemonic appears in any existing PMA data dictionary. If no match is found, the mnemonic is added to a CSV file called "new_mnemonics.csv". If matches are found in older data dictionaries, the function identifies one "base" sample to be referenced in a CSV file called "new_sample_updates.csv" (see note).
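The split between the two files can be illustrated with base R set operations. This is only a sketch on toy data, not the package's implementation: the `dd` data frame, its column names, and the mnemonics in it are hypothetical stand-ins for whatever dds_list actually returns.

```r
# Hypothetical toy data: one row per (sample, mnemonic) pair.
dd <- data.frame(
  sample   = c("bf2016a_nh", "bf2016a_nh", "bf2017a_nh", "bf2017a_nh", "bf2018a_nh"),
  mnemonic = c("AGE", "SEX", "AGE", "NEWVAR1", "NEWVAR2"),
  stringsAsFactors = FALSE
)

new_samples <- c("bf2017a_nh", "bf2018a_nh")

# Unique mnemonics used by the new samples.
is_new_sample <- dd$sample %in% new_samples
new_mnems <- unique(dd$mnemonic[is_new_sample])

# Mnemonics already present in older data dictionaries.
old_mnems <- unique(dd$mnemonic[!is_new_sample])

# No match in any older sample: these would go to "new_mnemonics.csv".
unmatched <- setdiff(new_mnems, old_mnems)

# Match found: these need a base sample for "new_sample_updates.csv".
matched <- intersect(new_mnems, old_mnems)
```

Here `unmatched` holds the mnemonics that appear only in the new samples, and `matched` holds those that still need a base sample.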

The two CSV files produced by this function will appear in a unique time-stamped folder at "pkg/ipums/pma/admin/config_files".

So-called "base" samples are chosen for "new_sample_updates.csv" as follows: ideally, a base should be the most recent sample from the same country sharing the same unit of analysis as the "new" sample. If this is not possible, the base sample will be the most recent sample from any country containing a matching mnemonic, provided that it shares the same unit of analysis. If a base sample still cannot be found, the function will then look through samples from the same country that have a different unit of analysis. As a last resort, it will select the most recent sample of any kind that shares a matching mnemonic.
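The four-step fallback described above can be sketched as a ranking over candidate samples. The helper `pick_base`, the `candidates` data frame, and its `country`, `year`, and `unit` columns are all hypothetical and exist only for illustration; the package may store and compare this information differently. For simplicity, "most recent" is approximated here by the largest year.

```r
# Hypothetical candidates: older samples that contain the mnemonic of interest.
candidates <- data.frame(
  sample  = c("bf2014a_hh", "bf2016a_nh", "ke2017a_nh", "ke2018a_hh"),
  country = c("bf", "bf", "ke", "ke"),
  year    = c(2014, 2016, 2017, 2018),
  unit    = c("hh", "nh", "nh", "hh"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: choose a base sample for a new sample with the
# given country and unit of analysis.
pick_base <- function(candidates, country, unit) {
  if (nrow(candidates) == 0) return(NA_character_)
  same_country <- candidates$country == country
  same_unit    <- candidates$unit == unit
  # Tier 1: same country and same unit; tier 2: same unit, any country;
  # tier 3: same country, different unit; tier 4: anything else.
  tier <- ifelse(same_country & same_unit, 1L,
          ifelse(same_unit, 2L,
          ifelse(same_country, 3L, 4L)))
  # Keep only the best available tier, then take the most recent sample.
  best <- candidates[tier == min(tier), , drop = FALSE]
  best$sample[which.max(best$year)]
}

base_same_country <- pick_base(candidates, country = "bf", unit = "nh")
base_other_country <- pick_base(candidates, country = "ug", unit = "nh")
base_last_resort <- pick_base(candidates, country = "ug", unit = "cq")
```

With this toy data, a new "bf" sample finds a base in its own country, while a new "ug" sample falls back to the most recent sample from another country with the same unit, and finally to the most recent sample of any kind.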

When a base sample is identified, the function finds every translation table where its mnemonic appears (referenced by SVAR). A separate record will appear in "new_sample_updates.csv" for each of these translation tables, because a mnemonic may be used for more than one integrated variable.
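The one-record-per-translation-table behavior amounts to a one-to-many join. The sketch below uses base R `merge` on toy data; the `tt_index` and `matches` data frames, their column names, and the variable names in them are hypothetical and do not reflect the real structure returned by tts_list.

```r
# Hypothetical index: one row per (integrated variable, SVAR) pairing
# found in the translation tables.
tt_index <- data.frame(
  integrated_var = c("AGE", "AGE5YEAR", "SEX"),
  svar           = c("HQ_AGE", "HQ_AGE", "HQ_SEX"),
  stringsAsFactors = FALSE
)

# Hypothetical matched mnemonics with their chosen base sample.
matches <- data.frame(
  new_sample  = c("bf2018a_nh", "bf2018a_nh"),
  mnemonic    = c("HQ_AGE", "HQ_SEX"),
  base_sample = c("bf2017a_nh", "bf2017a_nh"),
  stringsAsFactors = FALSE
)

# One output row per translation table that references the mnemonic.
updates <- merge(matches, tt_index, by.x = "mnemonic", by.y = "svar")
```

Because "HQ_AGE" feeds two integrated variables in this toy index, it contributes two rows to `updates`, while "HQ_SEX" contributes one.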

Matt Gunther

## Not run: 
# Normal usage:
config_make(
  samples = c("bf2017a_nh", "bf2018a_nh")
)

# Testing with the first 25 unique mnemonics:
config_make(
  samples = c("bf2017a_nh", "bf2018a_nh"),
  test_with = 25
)

# Save time if using repeatedly:
TT <- tts_list()
DD <- dds_list()
config_make(
  samples = c("bf2017a_nh", "bf2018a_nh"),
  test_with = 25,
  tts = TT,
  dds = DD
)

## End(Not run)