miranska/restore: Regression testing for dataset migration

This is a package of regression testing for dataset migration. Given a scenario where a legacy dataset will be replaced by a target dataset, we will analyze the difference between them based on following tests: 1. Distribution test: Kolmogorov-Smirnov test; 2. Correlation tests: Pearson correlation coefficient and Spearman's correlation; 3. Different variables and records; 4. Magnitude comparison; 5. Mean relative errors; 6. The difference between two hierarchical pairs in Spearman's test; 7. Features that have NA values; 8. Hybrid tests, which shows features that appear in Kolmogorov-Smirnov test, mean relative error test, and correlation tests; 9. Ranking, which shows the ranking of variables that appear in Kolmogorov-Smirnov test, mean relative error test, and correlation tests. The final report will be written into a user-specified xlsx file or/and an object (which is stored in an RData file). Users can choose the test results produced in the final report.

Getting started

Package details

AuthorSean Howard, Krittika Mahajan, Andriy Miranskyy, Tom Montpool, Jessica Moore, Lei Zhang
MaintainerAndriy Miranskyy <avm@ryerson.ca> and Lei Zhang <leizhang@ryerson.ca>
LicenseGPL-3
Version0.1.0
Package repositoryView on GitHub
Installation Install the latest version of this package by entering the following in R:
install.packages("remotes")
remotes::install_github("miranska/restore")
miranska/restore documentation built on May 8, 2019, 1:21 p.m.