View source: R/schemaAlignment.R
Identifies similarities between two datasets and generates a new one that follows the schema of the first dataset as closely as possible while keeping the information available in the second dataset.
schemaAlignment(x, y)
x | a data frame.
y | a similar data frame that might include additions like new rows, new columns, or changed column names and/or positions.
This function takes two data frames as input and tries to make the second one, y, fit the schema of the first one, x. Ideally, y should be an updated, messy version of x.
The function tries to find matches by name first, then by content, by running an iterative match search. This second-level search looks for matches between columns in x and y, and only columns that did not match by name are considered for it. If there is a match by content, the column in data frame y is renamed according to data frame x. All non-matching fields are then silently dropped, and the columns in data frame y are ordered as closely as possible to the original column arrangement in x.
Best used for small datasets, around 10K rows at most. Currently there is no option to avoid dropping non-matching columns.
This function is verbose, as it tries to give the final user as much visibility as possible into what is being done under the hood. It outputs some debugging flags as well as the names of the columns being dropped from each dataset. Still a work in progress.
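The name-then-content matching described above can be sketched in base R. This is only an illustration, not the package's implementation: the helper name align_schema_sketch and the content rule (a column of y matches a column of x when every value of the x column also appears in it) are assumptions made for this example, and the real iterative search may work differently.

```r
# Hypothetical helper for illustration; NOT the code behind schemaAlignment().
align_schema_sketch <- function(x, y) {
  stopifnot(is.data.frame(x), is.data.frame(y))

  # Step 1: columns of y that already match a column of x by name.
  by_name <- intersect(names(x), names(y))

  # Step 2: only the leftovers are compared by content.
  x_left <- setdiff(names(x), by_name)
  y_left <- setdiff(names(y), by_name)
  renamed <- character(0)  # names = old y column, values = new name from x
  for (xc in x_left) {
    for (yc in setdiff(y_left, names(renamed))) {
      # Assumed content rule: all values of x[[xc]] are present in y[[yc]].
      if (all(as.character(x[[xc]]) %in% as.character(y[[yc]]))) {
        renamed[yc] <- xc
        break
      }
    }
  }

  # Report and drop the y columns that matched neither by name nor by content.
  dropped <- setdiff(y_left, names(renamed))
  if (length(dropped) > 0) {
    message("Dropping from y: ", paste(dropped, collapse = ", "))
  }

  # Keep the matched columns, rename content matches, follow x's column order.
  out <- y[, c(by_name, names(renamed)), drop = FALSE]
  names(out) <- c(by_name, unname(renamed))
  out[, intersect(names(x), names(out)), drop = FALSE]
}

x <- data.frame(id = 1:3, city = c("Lima", "Oslo", "Rome"),
                stringsAsFactors = FALSE)
y <- data.frame(town = c("Lima", "Oslo", "Rome", "Kyiv"),
                note = c("a", "b", "c", "d"),
                id = 1:4, stringsAsFactors = FALSE)
z <- align_schema_sketch(x, y)
names(z)
```

In this sketch, town is renamed to city because it contains every value of x$city, note is reported and dropped, and the remaining columns are put back in the order used by x. The extra row that exists only in y is kept.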
A new data frame that will include all columns that match 100
## Not run:
z <- schemaAlignment(x, y)
## End(Not run)
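The example above does not define x and y. A pair of inputs of the kind the arguments describe could look like the following. The data are invented for illustration, and the exists() guard is an assumption added so that the snippet also runs when the package providing schemaAlignment is not loaded.

```r
# Illustrative data only; not taken from the package.
x <- data.frame(
  emp_id = c(101L, 102L, 103L),
  name   = c("Ana", "Ben", "Caz"),
  dept   = c("ops", "dev", "ops"),
  stringsAsFactors = FALSE
)

# y: one extra row, one new column ("started"), "name" renamed to "full_name",
# and the columns in a different order.
y <- data.frame(
  dept      = c("ops", "dev", "ops", "hr"),
  started   = c(2015L, 2018L, 2020L, 2021L),
  full_name = c("Ana", "Ben", "Caz", "Dee"),
  emp_id    = c(101L, 102L, 103L, 104L),
  stringsAsFactors = FALSE
)

# Only call the function when the package providing it is loaded.
if (exists("schemaAlignment", mode = "function")) {
  z <- schemaAlignment(x, y)
  str(z)
}
```

If the function behaves as described in the details, emp_id and dept would match by name, full_name would be a candidate for a content match with name, and started would be dropped.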