RLdata500: RLdata500 dataset from the RecordLinkage package

RLdata500    R Documentation

RLdata500 dataset from the RecordLinkage package

Description

This dataset is taken from the RecordLinkage R package, developed by Murat Sariyar and Andreas Borg. The package is licensed under GPL-3.

The RLdata500 table contains artificial personal data. Some records have been duplicated with randomly generated errors; of the 500 records, fifty are duplicates.

Usage

RLdata500

Format

A data.table with 500 records. Each row represents one record, with the following columns:

  • fname_c1 – first name, first component,

  • fname_c2 – first name, second component,

  • lname_c1 – last name, first component,

  • lname_c2 – last name, second component,

  • by – year of birth,

  • bm – month of birth,

  • bd – day of birth,

  • rec_id – record id,

  • ent_id – entity id.

References

Sariyar M., Borg A. (2022). RecordLinkage: Record Linkage Functions for Linking and Deduplicating Data Sets. R package version 0.4-12.4, https://CRAN.R-project.org/package=RecordLinkage

Examples


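library(blocking)   # package that provides this copy of RLdata500
data("RLdata500")   # load the dataset into the workspace
head(RLdata500)     # show the first six records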
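
The rec_id and ent_id columns described under Format can be used to check the duplicate structure: records that share an ent_id refer to the same entity. The following is a minimal sketch, assuming the blocking and data.table packages are installed and that each duplicate is an extra record for an existing entity. The variable names (n_records, n_entities, n_duplicates, dup_records) are illustrative and not part of the package.

```r
library(blocking)    # assumed source of this copy of RLdata500
library(data.table)  # needed for uniqueN() and the by-group syntax

data("RLdata500")

# Total number of records and number of distinct entities
n_records  <- nrow(RLdata500)
n_entities <- uniqueN(RLdata500$ent_id)

# Records beyond the first one for each entity, i.e. the duplicates
n_duplicates <- n_records - n_entities

# All records that belong to an entity appearing more than once
dup_records <- RLdata500[, if (.N > 1) .SD, by = ent_id]

print(c(records = n_records, entities = n_entities, duplicates = n_duplicates))
head(dup_records)
```

Comparing the rows within one ent_id group in dup_records shows the kind of errors that were introduced into the duplicated records.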


blocking documentation built on June 18, 2025, 9:16 a.m.