fix_match: Fix title match (one source-match pair).

Description

Uses a single source-match pair to update values throughout a dataframe.

Usage

fix_match(source_term, match_term, term_df, clean_col, flat_col,
  start_reg = "(?<=^|----)", end_reg = "(?=----|$)", split_reg = "----")

Arguments

source_term

A string containing the flat term that will be retained and used to replace the match term in the target columns.

match_term

A string containing the flat term to be replaced.

term_df

The dataframe to be updated.

clean_col

String with the name of the column with "clean" term values.

flat_col

String with the name of the column with "flat" term values.

start_reg, end_reg

Strings providing the regular expressions that can be used to identify the start and end of a term.

split_reg

String with regex identifying the pattern to split on when fields are complex.

Details

This takes a single source-match pair and replaces the match term with the source term in every "flat" and "clean" field that contains it. It is designed to work with complex fields whose elements are delimited by the provided regexes. In such fields, only the matching element is replaced and the rest of the field is left untouched.

NOTE: "flat" refers to the version of the term used during the matching process. "clean" refers to the reader-friendly version of the term.
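
The element-wise replacement described above can be illustrated with a minimal sketch in base R. This is not the package's implementation: it only assumes that the match term is wrapped between start_reg and end_reg and substituted with a Perl-compatible regex. The field value and terms below are hypothetical, and the sketch assumes the match term contains no regex metacharacters.

```r
# Illustrative sketch only; the real fix_match() may work differently.
# Default delimiters taken from the Usage section.
start_reg <- "(?<=^|----)"
end_reg   <- "(?=----|$)"

# Hypothetical source-match pair and a hypothetical complex "flat" field.
source_term <- "zelda 2"
match_term  <- "zelda ii"
flat_field  <- "metroid----zelda ii----zelda iii"

# Anchor the match term between the start and end patterns so that only a
# whole element of the field can match (not the prefix of "zelda iii").
pattern <- paste0(start_reg, match_term, end_reg)

# perl = TRUE is required for the lookbehind/lookahead syntax.
updated <- gsub(pattern, source_term, flat_field, perl = TRUE)

print(updated)
# [1] "metroid----zelda 2----zelda iii"
```

Because the lookarounds consume no characters, the "----" separators stay in place and neighbouring elements are not modified.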

Value

Returns an updated version of the passed-in dataframe.


datavores/vgsample documentation built on May 14, 2019, 8:59 p.m.