View source: R/05-clean-keywords.R
Once keywords (values) have been unnested, they need substantial cleaning with the same Shiny app used for news desks. At this step, your renamed values should be in the "renamed_values" folder. This step joins in the replacement keywords. You must also ensure that every keyword has exactly one name (category). Possible names are "subject", "persons", "glocations", "organizations", or "creative_works". Keywords with more than one name are somewhat rare, but resolving them simplifies network analysis. This step writes a file of those keywords to the "multi_named_values" folder. Before running the next step, open this file in a spreadsheet and manually choose one name (category) for each value (keyword). Save the result in the "single_named_values" folder as "single_named_keywords.csv".
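As a rough illustration of what "more than one name" means, the base R sketch below flags values that appear under several names. It is not the package's implementation. The data frame and its column names (`value`, `name`) are assumptions made for the example.

```r
# Hypothetical unnested keywords: one row per keyword occurrence.
# The column names `value` and `name` are assumptions for illustration.
keywords <- data.frame(
  value = c("Amazon", "Amazon", "Paris", "Obama, Barack"),
  name  = c("organizations", "glocations", "glocations", "persons"),
  stringsAsFactors = FALSE
)

# Reduce to distinct (value, name) pairs, then count names per value.
pairs <- unique(keywords[, c("value", "name")])
name_counts <- table(pairs$value)

# Keep only the values that carry more than one name.
multi_named <- pairs[pairs$value %in% names(name_counts[name_counts > 1]), ]
multi_named <- multi_named[order(multi_named$value, multi_named$name), ]
print(multi_named)
```

Rows flagged this way are the ones that would need a single name chosen by hand in the spreadsheet.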
nyt_bind_values_lookups(values_output_folder = "renamed_values")

nyt_clean_keywords(
  unnested_df,
  values_output_folder = "renamed_values",
  multi_names_input_folder = "multi_named_values",
  multi_names_output_folder = "single_named_values"
)
values_output_folder | folder to find keyword values post-cleaning
unnested_df | output of
multi_names_input_folder | folder to find keyword values with more than 1 name (category)
multi_names_output_folder | folder to save corrected keyword values csv
nyt_clean_keywords() returns an unnested df with replaced keyword values and writes out a file of all keywords with more than one name.
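The manual correction could be applied back to the keywords with a left join, as in this minimal sketch. It is not the package's code. The data frames and the columns `article_id`, `value`, `name`, and `chosen_name` are hypothetical, and the hand-edited lookup is built inline here rather than read from "single_named_keywords.csv".

```r
# Hypothetical keywords in which "Amazon" appears under two names.
keywords <- data.frame(
  article_id = c(1, 1, 2),
  value = c("Amazon", "Paris", "Amazon"),
  name  = c("organizations", "glocations", "glocations"),
  stringsAsFactors = FALSE
)

# Hypothetical hand-edited lookup: one chosen name per ambiguous value.
single_named <- data.frame(
  value = "Amazon",
  chosen_name = "organizations",
  stringsAsFactors = FALSE
)

# Left join: every keyword row is kept; matched rows gain a chosen_name.
resolved <- merge(keywords, single_named, by = "value", all.x = TRUE)

# Use the chosen name where one exists, otherwise keep the original name.
resolved$name <- ifelse(is.na(resolved$chosen_name),
                        resolved$name,
                        resolved$chosen_name)
resolved$chosen_name <- NULL
print(resolved)
```

After the join, each value maps to a single name, which is the property the description says network analysis depends on.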
## Not run:
consolidated_unnested_df <- nyt_clean_keywords(unnested_df)
## End(Not run)