update_data: Update a revised publication table with previously extracted...


Description

Takes a new set of publication records (empty) and fills it in with previously recorded (often manually acquired) data (populated).

Usage

update_data(empty, populated, match_cols, replace_cols, approx_match = FALSE,
  string_dist = 1, min_length = 20, simplify_match = TRUE)

Arguments

empty

The new set of publication records

populated

Previously recorded publication data

match_cols

Column(s) that will be used to match empty and populated

replace_cols

Column(s) to replace in empty

approx_match

Whether to match records using string distances (TRUE) or exact values (FALSE, the default).

string_dist

When using approximate matching, the string distance cutoff at which records will be matched.

min_length

The minimum string length for match_cols at which a record will be considered when matching records.

simplify_match

Whether to perform matching on strings composed from match_cols with non-alphanumeric characters removed.
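
The matching options above can be illustrated with base R alone. The sketch below is not the package's implementation. It assumes that the match key is the pasted values of match_cols, that simplification is a regular-expression substitution, and that the string distance is the Levenshtein edit distance computed by utils::adist. The helper make_key and the sample titles are hypothetical.

```r
# Hypothetical helper: build one match key per row from the chosen columns.
make_key <- function(df, cols, simplify = TRUE) {
  key <- do.call(paste0, unname(as.list(df[cols])))
  if (simplify) {
    # Analogue of simplify_match: drop everything that is not a letter or digit
    key <- gsub("[^[:alnum:]]", "", key)
  }
  key
}

new_recs <- data.frame(title = c("A trial of drug X: results", "Short"),
                       stringsAsFactors = FALSE)
old_recs <- data.frame(title = c("A trial of drug X - result", "Short"),
                       stringsAsFactors = FALSE)

new_key <- make_key(new_recs, "title")
old_key <- make_key(old_recs, "title")

# Edit distance between every new key (rows) and every old key (columns)
d <- utils::adist(new_key, old_key)

# Analogue of min_length = 20: skip keys too short to match reliably
long_enough <- nchar(new_key) >= 20

# Analogue of string_dist = 1: accept pairs at most one edit apart
is_match <- d <= 1 & long_enough
```

Under these assumptions the two long titles are treated as the same record even though their punctuation and final letter differ, while the identical but very short "Short" keys are never matched because they fall below the length threshold.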

Value

A copy of empty in which values have been updated wherever a record matches one in populated (based on match_cols).
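
For the exact-matching case, the returned value can be pictured with a small base R stand-in. This is only a sketch under assumptions: the function fill_from is hypothetical, it copies replace_cols from the first matching row of populated, and it may differ from what update_data actually does in edge cases such as duplicate keys.

```r
# Hypothetical stand-in for the exact-match case (approx_match = FALSE).
fill_from <- function(empty, populated, match_cols, replace_cols) {
  # One key per row; "|" is an arbitrary separator assumed absent from the data
  key_e <- do.call(paste, c(unname(as.list(empty[match_cols])), sep = "|"))
  key_p <- do.call(paste, c(unname(as.list(populated[match_cols])), sep = "|"))

  idx <- match(key_e, key_p)  # row of populated for each row of empty, NA if none
  hit <- !is.na(idx)

  # Overwrite only the matched rows; unmatched rows are returned unchanged
  empty[hit, replace_cols] <- populated[idx[hit], replace_cols]
  empty
}

new_recs <- data.frame(a = c("Apples", "Bananas"),
                       b = c("Granny", "Chiquita"),
                       c = c("", ""),
                       stringsAsFactors = FALSE)
old_recs <- data.frame(a = "Apples", b = "Granny", c = "Red",
                       stringsAsFactors = FALSE)

res <- fill_from(new_recs, old_recs, match_cols = c("a", "b"), replace_cols = "c")
```

The result has the same rows and columns as new_recs; only the "Apples" row picks up a value in column c.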

Examples

## Not run: 
empty <- data.frame(a = c("Apples", "Oranges", "Bananas"),
                    b = c("Granny", "Florida", "Chiquita"),
                    c = c("", "", ""),
                    d = c("", "", ""),
                    stringsAsFactors = FALSE)

update <- data.frame(a = c("Apples", "Oranges"),
                     b = c("Granny", "Florida"),
                     c = c("Red", ""),
                     d = c("Green", ""),
                     stringsAsFactors = FALSE)

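# replace_cols has no default in the signature above, so it is supplied explicitly
update_data(empty, update, match_cols = c("a", "b"),
            replace_cols = c("c", "d"), min_length = 5)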

## End(Not run)

graggsd/sysreviewR documentation built on May 16, 2019, 2:52 a.m.