fix_match_set: Fix match (source and all matches).

Description

Uses a source term and the list of its matches to update values throughout a dataframe.

Usage

fix_match_set(match_set, term_df, clean_col, flat_col,
  start_reg = "(?<=^|----)", end_reg = "(?=----|$)", split_reg = "----")

Arguments

match_set

A list whose name is the flat term to be retained; that term replaces the list's elements (the match terms) in the target columns.

term_df

The dataframe to be updated.

clean_col

String with the name of the column with "clean" term values.

flat_col

String with the name of the column with "flat" term values.

start_reg, end_reg

Strings providing the regular expressions that can be used to identify the start and end of a term.

split_reg

String with regex identifying the pattern to split on when fields are complex.

Details

This takes a single source term and a list of its matches and replaces the match terms with the source term in every "flat" and "clean" field that contains them. It is designed to work with complex fields, in which individual elements are identified by the provided regular expressions. In these cases, only the matching element of the field is replaced; the rest of the field is left untouched.
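
The element-wise replacement described above can be sketched in base R. This is a minimal illustration, not the package's implementation. The field value and the terms are made up, and it assumes the default start_reg and end_reg are used as PCRE lookarounds (perl = TRUE) wrapped around the match term.

```r
# Illustrative sketch only; the values below are hypothetical and this is
# not necessarily how fix_match_set is implemented.
start_reg <- "(?<=^|----)"   # default shown in Usage
end_reg   <- "(?=----|$)"    # default shown in Usage

field       <- "red oak----oak----oak tree"  # complex field with three elements
match_term  <- "oak"                          # term to be replaced
source_term <- "quercus"                      # term to be retained

# Wrap the match term in the boundary expressions so that only a whole
# element can match. Assumes match_term contains no regex metacharacters.
pattern <- paste0(start_reg, match_term, end_reg)
updated <- gsub(pattern, source_term, field, perl = TRUE)

print(updated)
```

Only the middle element is a whole-element match, so the result is "red oak----quercus----oak tree". The elements "red oak" and "oak tree" are left untouched because the boundary expressions do not match around the "oak" inside them.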

NOTE: "flat" refers to the version of the term used during the matching process. "clean" refers to the reader-friendly version of the term.

Value

Returns an updated version of the passed-in dataframe.


datavores/vgsample documentation built on May 14, 2019, 8:59 p.m.