View source: R/match-statistics.R
match_statistics | R Documentation |
Counts how well the records of a parent table and a child table match on a set of join columns: how many parent records have at least one child record, how many child records have a parent record, and how many records in each table have a missing value in a join column.
match_statistics(d_parent, d_child, join_columns)
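d_parent | A data.frame of the parent table. |
d_child | A data.frame of the child table. |
join_columns | The character vector of column names used to join the parent and child tables. |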
If a nonexistent column is passed to join_columns, an error is thrown that names the offending column.
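The column check described above can be pictured with a small base-R sketch. The helper name check_join_columns and the wording of its error message are hypothetical, chosen only for illustration; the package's own validation may be implemented differently.

```r
# Hypothetical helper (not part of the package): stop with an error that
# names every requested join column that is absent from the data.frame.
check_join_columns <- function(d, join_columns) {
  missing_columns <- setdiff(join_columns, colnames(d))
  if (length(missing_columns) > 0L) {
    stop(
      "The data.frame is missing the column(s): ",
      paste(missing_columns, collapse = ", ")
    )
  }
  invisible(TRUE)
}

d <- data.frame(parent_id = 1:3)

# 'letter' does not exist in d, so the error message names it.
msg <- tryCatch(
  check_join_columns(d, c("parent_id", "letter")),
  error = function(e) conditionMessage(e)
)
```

Failing early with the column name tends to be easier to debug than letting the join itself fail with a less specific message.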
More information about the 'parent' and 'child' terminology and concepts can be found in the Hierarchical Database Model Wikipedia entry, among many other sources.
A numeric array of the following elements:
parent_in_child
The count of parent records found in the child table.
parent_not_in_child
The count of parent records not found in the child table.
parent_na_any
The count of parent records with an NA in at least one of the join columns.
deadbeat_proportion
The proportion of parent records not found in the child table.
child_in_parent
The count of child records found in the parent table.
child_not_in_parent
The count of child records not found in the parent table.
child_na_any
The count of child records with an NA in at least one of the join columns.
orphan_proportion
The proportion of child records not found in the parent table.
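As an illustration of the two "na_any" elements, the following base-R sketch counts rows that have an NA in at least one join column. The helper name count_na_any is hypothetical and is not claimed to be how the function computes these values.

```r
# Hypothetical helper (not part of the package): count the rows with an NA
# in at least one of the join columns.
count_na_any <- function(d, join_columns) {
  # drop = FALSE keeps a data.frame even when only one column is selected.
  sum(!stats::complete.cases(d[, join_columns, drop = FALSE]))
}

d <- data.frame(
  letter = c("a", NA, "c", NA),
  index  = c(1L, 2L, NA, NA),
  stringsAsFactors = FALSE
)

# Rows 2, 3, and 4 each have at least one NA among the two join columns.
n_na_two_columns <- count_na_any(d, c("letter", "index"))

# Only rows 2 and 4 have an NA in 'letter'.
n_na_one_column <- count_na_any(d, "letter")
```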
The join_columns parameter is passed directly to dplyr::semi_join() and dplyr::anti_join().
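To make the role of the two joins concrete, here is a minimal sketch of how the parent-side counts could be derived from them. It assumes dplyr is installed. The helper name sketch_parent_counts is hypothetical, and the sketch makes no claim about how the function treats NA values in the join columns.

```r
library(dplyr)

# Hypothetical helper (not part of the package): derive the parent-side
# counts from a semi join and an anti join on the same columns.
sketch_parent_counts <- function(d_parent, d_child, join_columns) {
  # semi_join() keeps the parent rows with at least one match in the child.
  in_child <- nrow(dplyr::semi_join(d_parent, d_child, by = join_columns))
  # anti_join() keeps the parent rows with no match in the child.
  not_in_child <- nrow(dplyr::anti_join(d_parent, d_child, by = join_columns))

  c(
    parent_in_child     = in_child,
    parent_not_in_child = not_in_child,
    deadbeat_proportion = not_in_child / nrow(d_parent)
  )
}

d_p <- data.frame(id = 1:4)
d_c <- data.frame(id = c(3L, 4L, 4L, 5L))

# Parents 3 and 4 have children; parents 1 and 2 do not.
result <- sketch_parent_counts(d_p, d_c, join_columns = "id")
```

Because a semi join never duplicates parent rows, a parent with several children (id 4 here) is still counted once. The child-side elements would follow by swapping the roles of the two tables.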
Will Beasley
ds_parent <- data.frame(
parent_id = 1L:10L,
letter = rep(letters[1:5], each=2),
index = rep(1:2, times=5),
dv = runif(10),
stringsAsFactors = FALSE
)
ds_child <- data.frame(
child_id = 101:140,
parent_id = c(4, 5, rep(6L:14L, each=4), 15, 16),
letter = rep(letters[3:12], each=4),
index = rep(1:2, each=2, length.out=40),
dv = runif(40),
stringsAsFactors = FALSE
)
#Match on one column:
match_statistics(ds_parent, ds_child, join_columns="parent_id")
#Match on two columns:
match_statistics(ds_parent, ds_child, join_columns=c("letter", "index"))
## Produce better format for humans to read
match_statistics_display(ds_parent, ds_child, join_columns="parent_id")
match_statistics_display(ds_parent, ds_child, join_columns=c("letter", "index"))