editrules: Parsing, Applying, and Manipulating Data Cleaning Rules

Facilitates reading and manipulating (multivariate) data restrictions (edit rules) on numerical and categorical data. Rules can be defined with common R syntax and parsed to an internal (matrix-like) format. Rules can be manipulated with variable elimination and value substitution methods, allowing for feasibility checks and more. Data can be tested against the rules, and erroneous fields can be found based on Fellegi and Holt's generalized principle. Rule dependencies can be visualized using the igraph package.

Install the latest version of this package by entering the following in R:
install.packages("editrules")
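
As a rough illustration of the workflow described above, the sketch below defines a few linear rules in R syntax, checks a small data frame against them, manipulates the rules, and localizes errors. The rules and the data are invented for this example. The function names are those listed in the man pages on this page, and the exact printed output may differ between package versions.

```r
library(editrules)

# Define linear edit rules using ordinary R syntax (illustrative rules)
E <- editmatrix(c(
  "x + y == z",
  "x >= 0",
  "y >= 0"
))

# Made-up data: the second record breaks the balance rule x + y == z
dat <- data.frame(x = c(1, 2), y = c(2, 3), z = c(3, 10))

# Logical array with one row per record and one column per edit
v <- violatedEdits(E, dat)

# Check that the rule set itself is not contradictory
feasible <- isFeasible(E)

# Manipulate the rules: eliminate z, or substitute the value 1 for x
E_noz <- eliminate(E, "z")
E_x1 <- substValue(E, "x", 1)

# Error localization following the generalized Fellegi-Holt principle;
# loc$adapt flags, per record, the fields proposed for adaptation
loc <- localizeErrors(E, dat)
loc$adapt
```

When several fields are equally good candidates for adaptation, as in the second record here, the field that gets flagged is not fixed in advance, so no expected output is shown.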
Author: Edwin de Jonge, Mark van der Loo
Date of publication: 2015-06-11 11:16:28
Maintainer: Edwin de Jonge <edwindjonge@gmail.com>
License: GPL-3
Version: 2.9.0
https://github.com/data-cleaning/editrules


Man pages

adddummies: Add dummy variable to the data.frames, these are needed for...

adjacency: Derive adjacency matrix from collection of edits

as.character.cateditmatrix: Coerce a cateditmatrix to a 'character' vector

as.editmatrix: Coerce a matrix to an edit matrix.

as.editset: Coerce x to an editset

asLevels: Transform a found solution into a categorical record

as.lp.mip: Coerces a 'mip' object into an lpsolve object

as.mip: Write an editset into a mip representation

backtracker: Backtracker: a flexible and generic binary search program

blocks: Decompose a matrix or edits into independent blocks

cateditmatrix: Create an editmatrix with categorical variables

checkDatamodel: Check data against a datamodel

condition: Get condition matrix from an editset.

contains: Determine which edits contain which variable(s)

contains.boolmat: Determine if a boolean matrix contains 'var'

datamodel: Summarize data model of an editarray in a data.frame

disjunct: Decouple a set of conditional edits

duplicated.editarray: Check for duplicate edit rules

duplicated.editmatrix: Check for duplicate edit rules

echelon: Bring an (edit) matrix to reduced row echelon form.

editarray: convert to matrix

editfile: Read edits from free-form textfile

editmatrix: convert to matrix

editnames: Names of edits

editrules-package: An overview of the function of package 'editrules'

editrules.plotting: Graphical representation of edits

edits: Example editrules, used in vignette

editset: Read general edits

editType: Determine edittypes in editset based on 'contains(E)'

eliminate: Eliminate a variable from a set of edit rules

errorLocalizer: Create a backtracker object for error localization

errorLocalizer_mip: Localize errors using a MIP approach.

errorLocation: The errorLocation object

expandEdits: Expand an edit expression

fcf.env: Field code forest algorithm

generateEdits: Derive all essentially new implicit edits

getA: Returns the coefficient matrix 'A' of linear (in)equalities

getAb: Returns augmented matrix representation of edit set.

getArr: Get named logical array from editarray

getb: Returns the constant part 'b' of a linear (in)equality

geth: Returns the derivation history of an edit matrix or array

getInd: get index list from editmatrix

getlevels: retrieve level names from editarray

getnames: retrieve edit names from editarray

getOps: Returns the operator part of a linear (in)equality...

getSep: get separator used to separate variables from levels in...

getUpperBounds: Get upperbounds of edits, given the boundaries of all...

getVars: get names of variables in a set of edits

getVars.cateditmatrix: Returns the variable names of an (in)equality 'editmatrix' E

getVars.editarray: get variable names in editarray

getVars.editlist: get variable names

getVars.editmatrix: Returns the variable names of an (in)equality 'editmatrix' E

impliedValues: Retrieve values strictly implied by rules

ind2char: Derive textual representation from (partial) indices

indFromArray: Compute index from array part of editarray

is.editrules: Check object class

isFeasible: Check consistency of set of edits

isNormalized: Check if an editmatrix is normalized

isObviouslyInfeasible: Check for obvious contradictions in a set of edits

isObviouslyRedundant: Find obvious redundancies in set of edits

isSubset: Check which edits are dominated by other ones.

localize: Workhorse function for localizeErrors

localizeErrors: Localize errors on records in a data.frame.

nedits: Number of edits. Count the number of edits in a collection of...

neweditarray: editarray: logical array where every column corresponds to...

neweditmatrix: Create an 'editmatrix' object from its constituting...

newerrorlocation: Generate new errorlocation object

normalize: Normalizes an editmatrix

parseCat: Parse a categorical edit expression

parseCatEdit: parse categorical edit

parseEdits: Parse a character vector of edits

parseMix: Parse a mixed edit

parseNum: Parse a numerical edit expression

print.backtracker: print a backtracker

print.cateditmatrix: print cateditmatrix

print.editarray: print editarray

print.editlist: print editset

print.editmatrix: print editmatrix

print.editset: print editset

print.editsummary: summary

print.errorLocation: Print object of class errorLocation

print.locationsummary: summary

print.violatedEdits: Print violatedEdits

reduce: Remove redundant variables and edits.

removeRedundantDummies: Remove redundant dummy variables

separate: Separate an editset into its disconnected blocks and simplify

simplify: Simplify logical mixed edits in an editset

softEdits: Derive editmatrix with soft constraints based on boundaries...

softEdits.cateditmatrix: Derive editmatrix with soft constraints. This is a utility...

softEdits.editarray: Derive editmatrix with soft constraints based on boundaries...

softEdits.editmatrix: Derive editmatrix with soft constraints based on boundaries...

subsetting: Row index operator for 'editmatrix'

substValue: Replace a variable by a value in a set of edits.

violatedEdits: Check data against constraints

writeELAsMip: Rewrite an editset and reported values into the components...

Functions

adddummies Man page
adjacency Man page
adjacency.editarray Man page
adjacency.editmatrix Man page
adjacency.editset Man page
as.character.cateditmatrix Man page
as.character.editarray Man page
as.character.editmatrix Man page
as.character.editset Man page
as.data.frame.editarray Man page
as.data.frame.editmatrix Man page
as.data.frame.editset Man page
as.data.frame.violatedEdits Man page
as.editmatrix Man page
as.editset Man page
as.expression.editarray Man page
as.expression.editmatrix Man page
as.igraph.editarray Man page
as.igraph.editmatrix Man page
as.igraph.editset Man page
asLevels Man page
as.lp.mip Man page
as.matrix.editarray Man page
as.matrix.editmatrix Man page
as.mip Man page
backtracker Man page
blockIndex Man page
blocks Man page
cateditmatrix Man page
[.cateditmatrix Man page
c.editarray Man page
c.editmatrix Man page
c.editset Man page
checkDatamodel Man page
choicepoint Man page
condition Man page
contains Man page
contains.boolmat Man page
contains.cateditmatrix Man page
contains.editarray Man page
contains.editmatrix Man page
contains.editset Man page
contains.matrix Man page
datamodel Man page
disjunct Man page
duplicated.editarray Man page
duplicated.editmatrix Man page
echelon Man page
echelon.editmatrix Man page
echelon.editset Man page
echelon.matrix Man page
editarray Man page
[.editarray Man page
editfile Man page
[.editlist Man page
editmatrix Man page
[.editmatrix Man page
editnames Man page
editrules-package Man page
editrules.plotting Man page
edits Man page
editset Man page
[.editset Man page
editType Man page
eliminate Man page
eliminate.editarray Man page
eliminate.editlist Man page
eliminate.editmatrix Man page
eliminate.editset Man page
errorLocalizer Man page
errorLocalizer.editarray Man page
errorLocalizer.editlist Man page
errorLocalizer.editmatrix Man page
errorLocalizer.editset Man page
errorLocalizer_mip Man page
errorLocation Man page
expandEdits Man page
fcf.env Man page
generateEdits Man page
getA Man page
getAb Man page
getArr Man page
getb Man page
geth Man page
getH Man page
getInd Man page
getlevels Man page
getnames Man page
getOps Man page
getSep Man page
getUpperBounds Man page
getVars Man page
getVars.cateditmatrix Man page
getVars.editarray Man page
getVars.editlist Man page
getVars.editmatrix Man page
getVars.editset Man page
getVars.NULL Man page
impliedValues Man page
impliedValues.editmatrix Man page
ind2char Man page
indFromArray Man page
is.editarray Man page
is.editmatrix Man page
is.editrules Man page
is.editset Man page
isFeasible Man page
isNormalized Man page
isObviouslyInfeasible Man page
isObviouslyInfeasible.editarray Man page
isObviouslyInfeasible.editenv Man page
isObviouslyInfeasible.editlist Man page
isObviouslyInfeasible.editmatrix Man page
isObviouslyInfeasible.editset Man page
isObviouslyRedundant Man page
isObviouslyRedundant.editarray Man page
isObviouslyRedundant.editenv Man page
isObviouslyRedundant.editlist Man page
isObviouslyRedundant.editmatrix Man page
isObviouslyRedundant.editset Man page
isSubset Man page
localize Man page
localizeErrors Man page
nedits Man page
neweditarray Man page
neweditmatrix Man page
newerrorlocation Man page
normalize Man page
parseCat Man page
parseCatEdit Man page
parseEdits Man page
parseMix Man page
parseNum Man page
plot.editarray Man page
plot.editmatrix Man page
plot.editset Man page
plot.errorLocation Man page
plot.violatedEdits Man page
print.backtracker Man page
print.cateditmatrix Man page
print.editarray Man page
print.editlist Man page
print.editmatrix Man page
print.editset Man page
print.editsummary Man page
print.errorLocation Man page
print.locationsummary Man page
print.violatedEdits Man page
reduce Man page
reduce.editarray Man page
reduce.editmatrix Man page
reduce.editset Man page
removeRedundantDummies Man page
separate Man page
simplify Man page
softEdits Man page
softEdits.cateditmatrix Man page
softEdits.editarray Man page
softEdits.editmatrix Man page
str.editmatrix Man page
substValue Man page
substValue.editarray Man page
substValue.editenv Man page
substValue.editlist Man page
substValue.editmatrix Man page
substValue.editset Man page
summary.editarray Man page
summary.editmatrix Man page
summary.editset Man page
summary.errorLocation Man page
summary.violatedEdits Man page
violatedEdits Man page
violatedEdits.character Man page
violatedEdits.editarray Man page
violatedEdits.editmatrix Man page
violatedEdits.editset Man page
writeELAsMip Man page

Files

inst
inst/script
inst/script/edits
inst/script/edits/mixedits.R
inst/script/edits/myedits.txt
inst/script/bench
inst/script/bench/benchmip_mixed.R inst/script/bench/randomEdits.R inst/script/bench/edits.R inst/script/bench/benchmip_categorical.R inst/script/bench/benchAB.R inst/script/bench/benchMIP.R inst/script/bench/eliminator.R inst/script/bench/benchmip_mixed2.R inst/script/bench/benchmip_balance.R
inst/doc
inst/doc/index.html
inst/doc/DeJongeVanderLoo2011-2.pdf
inst/doc/DeJongeVanderLoo2011.pdf
inst/doc/editrules-vignette.Rnw
inst/doc/editrules-vignette.pdf
tests
tests/test_all.R
tests/testthat
tests/testthat/testeditmatrix.R tests/testthat/testlocalizeErrors_mip.R tests/testthat/testCheckDatamodel.R tests/testthat/testViolatedEdits.R tests/testthat/testFourierMotzkin.R tests/testthat/testSubstValue.R tests/testthat/testeditmatrixAttr.R
tests/testthat/edit_test_1.txt
tests/testthat/testParseEdits.R tests/testthat/testIsFeasible.R tests/testthat/testIsObviouslyRedundant.R tests/testthat/testCheck.R tests/testthat/testechelon.R tests/testthat/testgetVars.R tests/testthat/testdatamodel.R tests/testthat/testIsObviouslyInfeasible.R tests/testthat/testErrorLocalizer.R tests/testthat/testEditset.R tests/testthat/testeditarray.R tests/testthat/testDuplicated.R tests/testthat/testBlocks.R tests/testthat/testc.R tests/testthat/testContains.R tests/testthat/testEditRow.R tests/testthat/testLocalizeErrors.R
NAMESPACE
NEWS
data
data/edits.RData
R
R/editrules-data.R R/list2env.R R/echelon.R R/violatedEdits.R R/isSubset.R R/parseMix.R R/subsetting.R R/cateditmatrix.R R/duplicated.R R/errorLocation.R R/editmatrixAttr.R R/perturbWeights.R R/mip.R R/eliminate.R R/plot.R R/pkg.R R/reduce.R R/as.matrix.R R/c.R R/editset.R R/isObviouslyRedundant.R R/contains.R R/editmatrix.R R/disjunct.R R/editAttr.R R/getH.R R/softEdits.R R/parseNum.R R/removeRedundant.R R/plot_errorLocation.R R/editfile.R R/is.R R/checkRows.R R/print.R R/localizeErrors.R R/getUpperBounds.R R/as.igraph.R R/getVars.R R/expandEdits.R R/editarrayAttr.R R/isObviouslyInfeasible.R R/parseCat.R R/checkDatamodel.R R/backtracker.R R/summary.R R/str.R R/errorLocalizer_mip.R R/blocks.R R/isFeasible.R R/writeELAsMip.R R/errorLocalizer.R R/zzz.R R/substValue.R R/parseEdits.R R/adjacency.R R/editarray.R R/generateEdits.R
vignettes
vignettes/editrules-vignette.Rnw
MD5
build
build/vignette.rds
DESCRIPTION
man
man/getSep.Rd man/editfile.Rd man/substValue.Rd man/as.character.cateditmatrix.Rd man/adjacency.Rd man/isFeasible.Rd man/adddummies.Rd man/isObviouslyInfeasible.Rd man/editnames.Rd man/impliedValues.Rd man/as.editset.Rd man/getVars.Rd man/neweditarray.Rd man/subsetting.Rd man/editmatrix.Rd man/nedits.Rd man/cateditmatrix.Rd man/expandEdits.Rd man/print.violatedEdits.Rd man/as.editmatrix.Rd man/errorLocalizer.Rd man/removeRedundantDummies.Rd man/parseEdits.Rd man/parseNum.Rd man/errorLocalizer_mip.Rd man/errorLocation.Rd man/is.editrules.Rd man/edits.Rd man/parseCat.Rd man/getInd.Rd man/datamodel.Rd man/editrules-package.Rd man/parseCatEdit.Rd man/print.editmatrix.Rd man/localize.Rd man/newerrorlocation.Rd man/softEdits.editmatrix.Rd man/fcf.env.Rd man/isNormalized.Rd man/as.lp.mip.Rd man/softEdits.Rd man/asLevels.Rd man/localizeErrors.Rd man/print.locationsummary.Rd man/getArr.Rd man/print.backtracker.Rd man/violatedEdits.Rd man/separate.Rd man/disjunct.Rd man/getVars.cateditmatrix.Rd man/parseMix.Rd man/indFromArray.Rd man/getA.Rd man/generateEdits.Rd man/editrules.plotting.Rd man/checkDatamodel.Rd man/editarray.Rd man/getOps.Rd man/ind2char.Rd man/writeELAsMip.Rd man/neweditmatrix.Rd man/contains.boolmat.Rd man/softEdits.cateditmatrix.Rd man/softEdits.editarray.Rd man/geth.Rd man/getVars.editmatrix.Rd man/eliminate.Rd man/getVars.editlist.Rd man/editset.Rd man/editType.Rd man/blocks.Rd man/print.editsummary.Rd man/getnames.Rd man/print.editset.Rd man/normalize.Rd man/simplify.Rd man/contains.Rd man/print.errorLocation.Rd man/reduce.Rd man/getlevels.Rd man/print.editlist.Rd man/duplicated.editarray.Rd man/duplicated.editmatrix.Rd man/getAb.Rd man/condition.Rd man/as.mip.Rd man/getUpperBounds.Rd man/getb.Rd man/isSubset.Rd man/echelon.Rd man/backtracker.Rd man/getVars.editarray.Rd man/print.cateditmatrix.Rd man/isObviouslyRedundant.Rd man/print.editarray.Rd
