View source: R/clean_snapshot.R
This function takes the requested snapshots, imports them, cleans them, and exports them as Rda and/or fst objects for later use by the vrmatch function.
clean_snapshot(
date_df = NULL,
start = "2018-04-26",
end = "2021-01-01",
path = "7z",
pattern = "^(?=.*Cntywd_)(?!.*Hist)",
file_type = ".txt",
path_clean = "clean_df",
clean_prefix = "df_cleaned_",
clean_suffix = "",
save_type = c("rda", "fst"),
format = "%m%d%y",
recursive = FALSE,
period = 1,
file_prefix = "Cntywd_",
varnames = NULL,
date = NULL,
date_order = "mdy",
num = NULL,
first = "szNameFirst",
voter_prefix = "sVoterTitle",
gender = "sGender",
email = "szEmailAddress",
email_exc = c("abc@example.com"),
phone = "szPhone",
phone_exc = "___-____",
...
)
date_df |
List of snapshots. Defaults to NULL, in which case the function will detect all snapshots available. |
start |
The start date of the first snapshot. Defaults to April 26, 2018. |
end |
The end date of the last snapshot. Defaults to Jan 1, 2021. |
path |
Path where all snapshots are stored. Defaults to subfolder 7z. |
pattern |
Regular expression of the file pattern to find. Defaults to a particular pattern of OCROV files. |
file_type |
File type. Defaults to .txt. |
path_clean |
Path where cleaned snapshots would be stored. Defaults to "clean_df". |
clean_prefix |
File prefixes for cleaned snapshots. This replaces the existing file prefix. Defaults to "df_cleaned_". |
clean_suffix |
File suffixes for cleaned snapshots. Defaults to empty string. |
save_type |
How to export the cleaned dataframe. Defaults to Rda and fst. |
format |
Format of the date in the snapshot file names. Defaults to "%m%d%y". |
recursive |
Whether to find files recursively. Defaults to FALSE. |
period |
Period/interval between snapshots (daily, weekly, and so on). Defaults to 1 (equivalent to "day"). Any valid value for the by argument of base seq.Date is allowed. |
file_prefix |
File name prefix. Defaults to Cntywd_. |
varnames |
All variables to be cleaned. Defaults to NULL. |
date |
Date variables. Defaults to NULL. |
date_order |
Order of the date components in the date variables, if they are stored as strings. Defaults to "mdy". |
num |
Numeric variables. Defaults to NULL. |
first |
Variable containing first names. Defaults to "szNameFirst". |
voter_prefix |
Variable containing self-reported personal prefixes. Defaults to "sVoterTitle". |
gender |
Variable containing original gender entry. Defaults to "sGender". |
email |
Name of the email address field. Defaults to "szEmailAddress". |
email_exc |
Email addresses that are to be cleaned. Defaults to a single-element vector, "abc@example.com". |
phone |
Name of the phone number field. Defaults to "szPhone". |
phone_exc |
Phone numbers that are to be cleaned. Defaults to "___-____". |
... |
Other arguments to be passed to snapshot_import. |
Output dataframe with cleaned contacts.
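The way the start, end, period, format, file_prefix, file_type, and pattern defaults relate to one another can be sketched with base R alone. This is only an illustration under stated assumptions: it does not call clean_snapshot, the file names are made up, and the layout "file_prefix + date stamp + file_type" is a guess at the naming scheme rather than something this page confirms. In base R, the lookaheads in the default pattern require perl = TRUE; whether the package matches files this way internally is not documented here.

```r
# Hypothetical illustration using base R only; clean_snapshot() is not called.

# Candidate snapshot dates, as seq.Date generates them with by = 1 (daily).
dates <- seq(as.Date("2018-04-26"), as.Date("2018-04-29"), by = 1)

# Date stamps as they would look under format = "%m%d%y".
stamps <- format(dates, "%m%d%y")

# Made-up file names (assumed layout: file_prefix + stamp + file_type),
# plus one "Hist" file that the default pattern is meant to skip.
files <- c(paste0("Cntywd_", stamps, ".txt"), "Cntywd_Hist_042618.txt")

# The default pattern relies on lookaheads, so base R needs perl = TRUE.
pattern <- "^(?=.*Cntywd_)(?!.*Hist)"
matched <- files[grepl(pattern, files, perl = TRUE)]

# Recover the snapshot dates from the matched file names.
parsed <- as.Date(sub("^Cntywd_([0-9]{6})\\.txt$", "\\1", matched),
                  format = "%m%d%y")
```

Here matched keeps only the names that contain Cntywd_ and do not contain Hist, and parsed holds the dates read back from those names.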