cust_dup_identify: Create a new (deduplicated) customer ID

View source: R/finalize.R

Description

Many states require some additional customer ID deduplication. This function replaces cust_id with a deduplicated version and stores the original value in cust_id_raw. For each set of duplicate customer records, the lowest customer ID in the set becomes the output cust_id.
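The behavior described above can be sketched in base R. This is only an illustration, not the code in R/finalize.R: the helper name dedup_sketch, its vars argument, and the columns first, last, and dob are all assumptions made for the example.

```r
# Hypothetical helper (not the salicprep implementation): rows that share
# the same values in `vars` are treated as one customer.
dedup_sketch <- function(cust, vars) {
  # keep the original ID before overwriting it
  cust$cust_id_raw <- cust$cust_id
  # build one grouping key from the deduplication variables
  key <- interaction(cust[vars], drop = TRUE)
  # within each group, use the lowest original ID as the new cust_id
  cust$cust_id <- ave(cust$cust_id_raw, key, FUN = min)
  cust
}

# Made-up example data: customers 1 and 3 match on every variable
cust <- data.frame(
  cust_id = c(1, 2, 3, 4),
  first = c("ann", "bob", "ann", "cat"),
  last = c("lee", "ray", "lee", "fox"),
  dob = c("1980-01-01", "1975-05-05", "1980-01-01", "1990-09-09"),
  stringsAsFactors = FALSE
)
deduped <- dedup_sketch(cust, c("first", "last", "dob"))
```

In this sketch, customer 3 takes cust_id 1 while its cust_id_raw stays 3; customers without a duplicate keep their original ID in both columns.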

Usage

cust_dup_identify(cust, ...)

Arguments

cust

input customer table

...

set of variables to be used for deduplication

Details

In theory, fuzzy matching could give better customer deduplication. It would be more computationally expensive, though, and would need a preprocessing step to limit the potential matches. It might also be overkill for our needs here.
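For illustration only, the fuzzy-matching idea could look something like the sketch below. It is not part of this package. It assumes a preprocessing ("blocking") step that only compares customers sharing the same dob, then uses edit distance on names within each block. The helper name fuzzy_pairs_sketch, the max_dist threshold, and the columns first, last, and dob are hypothetical.

```r
# Hypothetical sketch: block on dob, then compare names by edit distance.
fuzzy_pairs_sketch <- function(cust, max_dist = 1) {
  # blocking step: only customers with the same dob are compared
  blocks <- split(cust, cust$dob)
  pairs <- lapply(blocks, function(b) {
    if (nrow(b) < 2) return(NULL)
    name <- paste(b$first, b$last)
    # pairwise edit distances between names within the block
    d <- utils::adist(name)
    # keep each close pair once (upper triangle only)
    idx <- which(d <= max_dist & upper.tri(d), arr.ind = TRUE)
    if (nrow(idx) == 0) return(NULL)
    data.frame(
      id_a = b$cust_id[idx[, "row"]],
      id_b = b$cust_id[idx[, "col"]]
    )
  })
  # NULL is returned if no candidate pairs are found
  do.call(rbind, pairs)
}

# Made-up example data: "ann lee" and "anne lee" share a dob
cust_fuzzy <- data.frame(
  cust_id = c(1, 2, 3, 4),
  first = c("ann", "bob", "anne", "cat"),
  last = c("lee", "ray", "lee", "fox"),
  dob = c("1980-01-01", "1975-05-05", "1980-01-01", "1990-09-09"),
  stringsAsFactors = FALSE
)
candidates <- fuzzy_pairs_sketch(cust_fuzzy)
```

Blocking keeps the number of pairwise comparisons manageable, since the distance matrix is computed per block rather than across the whole customer table.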

See Also

Other finalize production data: cust_dup_demo_plot, cust_dup_demo, cust_dup_pct, cust_dup_pull, cust_dup_year, res_id


southwick-associates/salicprep documentation built on Dec. 18, 2019, 6:45 a.m.