editrules: Parsing, Applying, and Manipulating Data Cleaning Rules

Facilitates reading and manipulating (multivariate) data restrictions (edit rules) on numerical and categorical data. Rules can be defined with common R syntax and parsed to an internal (matrix-like) format. Rules can be manipulated with variable elimination and value substitution methods, allowing for feasibility checks and more. Data can be tested against the rules, and erroneous fields can be found based on Fellegi and Holt's generalized principle. Rule dependencies can be visualized using the igraph package.

Author: Edwin de Jonge, Mark van der Loo
Date of publication: 2015-06-11 11:16:28
Maintainer: Edwin de Jonge <edwindjonge@gmail.com>
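
The workflow named in the description (define rules, check feasibility, test data, eliminate or substitute variables, localize errors) can be sketched roughly as follows. This is a minimal illustration, not taken from the package documentation: it assumes the editrules package is installed, and the variable names and data values are hypothetical.

```r
# Minimal sketch; assumes the editrules package is installed.
library(editrules)

# Linear edit rules written in ordinary R syntax, parsed to an editmatrix.
E <- editmatrix(c(
  "turnover == profit + cost",
  "cost >= 0",
  "turnover >= 0"
))

# Hypothetical data: the second record breaks the balance rule
# and has a negative cost.
dat <- data.frame(
  turnover = c(100, 100),
  profit   = c(30, 30),
  cost     = c(70, -10)
)

feasible  <- isFeasible(E)               # can any record satisfy all rules?
v         <- violatedEdits(E, dat)       # logical matrix: records x edits
E_no_cost <- eliminate(E, "cost")        # rules implied for the other variables
E_fixed   <- substValue(E, "profit", 30) # rules after fixing profit at 30

# Error localization based on the Fellegi-Holt principle:
# find a smallest set of fields to change so that all rules can hold.
loc <- localizeErrors(E, dat)
loc$adapt                                # TRUE marks fields to adapt
```

Printing `v` and `loc$adapt` shows, per record, which rules are violated and which fields the localizer suggests changing.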

adddummies: Add dummy variable to the data.frames, these are needed for...

adjacency: Derive adjacency matrix from collection of edits

as.character.cateditmatrix: Coerce a cateditmatrix to a 'character' vector

as.editmatrix: Coerce a matrix to an edit matrix.

as.editset: Coerce x to an editset

asLevels: Transform a found solution into a categorical record

as.lp.mip: Coerces a 'mip' object into an lpsolve object

as.mip: Write an editset into a mip representation

backtracker: Backtracker: a flexible and generic binary search program

blocks: Decompose a matrix or edits into independent blocks

cateditmatrix: Create an editmatrix with categorical variables

checkDatamodel: Check data against a datamodel

condition: Get condition matrix from an editset.

contains: Determine which edits contain which variable(s)

contains.boolmat: Determine if a boolean matrix contains 'var'

datamodel: Summarize data model of an editarray in a data.frame

disjunct: Decouple a set of conditional edits

duplicated.editarray: Check for duplicate edit rules

duplicated.editmatrix: Check for duplicate edit rules

echelon: Bring an (edit) matrix to reduced row echelon form.

editarray: convert to matrix

editfile: Read edits from free-form textfile

editmatrix: convert to matrix

editnames: Names of edits

editrules-package: An overview of the function of package 'editrules'

editrules.plotting: Graphical representation of edits

edits: Example editrules, used in vignette

editset: Read general edits

editType: Determine edittypes in editset based on 'contains(E)'

eliminate: Eliminate a variable from a set of edit rules

errorLocalizer: Create a backtracker object for error localization

errorLocalizer_mip: Localize errors using a MIP approach.

errorLocation: The errorLocation object

expandEdits: Expand an edit expression

fcf.env: Field code forest algorithm

generateEdits: Derive all essentially new implicit edits

getA: Returns the coefficient matrix 'A' of linear (in)equalities

getAb: Returns augmented matrix representation of edit set.

getArr: Get named logical array from editarray

getb: Returns the constant part 'b' of a linear (in)equality

geth: Returns the derivation history of an edit matrix or array

getInd: get index list from editmatrix

getlevels: retrieve level names from editarray

getnames: retrieve edit names from editarray

getOps: Returns the operator part of a linear (in)equality...

getSep: get separator used to separate variables from levels in...

getUpperBounds: Get upperbounds of edits, given the boundaries of all...

getVars: get names of variables in a set of edits

getVars.cateditmatrix: Returns the variable names of an (in)equality 'editmatrix' E

getVars.editarray: get variable names in editarray

getVars.editlist: get variable names

getVars.editmatrix: Returns the variable names of an (in)equality 'editmatrix' E

impliedValues: Retrieve values strictly implied by rules

ind2char: Derive textual representation from (partial) indices

indFromArray: Compute index from array part of editarray

is.editrules: Check object class

isFeasible: Check consistency of set of edits

isNormalized: Check if an editmatrix is normalized

isObviouslyInfeasible: Check for obvious contradictions in a set of edits

isObviouslyRedundant: Find obvious redundancies in set of edits

isSubset: Check which edits are dominated by other ones.

localize: Workhorse function for localizeErrors

localizeErrors: Localize errors on records in a data.frame.

nedits: Number of edits. Count the number of edits in a collection of...

neweditarray: editarray: logical array where every column corresponds to...

neweditmatrix: Create an 'editmatrix' object from its constituting...

newerrorlocation: Generate new errorlocation object

normalize: Normalizes an editmatrix

parseCat: Parse a categorical edit expression

parseCatEdit: parse categorical edit

parseEdits: Parse a character vector of edits

parseMix: Parse a mixed edit

parseNum: Parse a numerical edit expression

print.backtracker: print a backtracker

print.cateditmatrix: print cateditmatrix

print.editarray: print editarray

print.editlist: print editset

print.editmatrix: print editmatrix

print.editset: print editset

print.editsummary: summary

print.errorLocation: Print object of class errorLocation

print.locationsummary: summary

print.violatedEdits: Print violatedEdits

reduce: Remove redundant variables and edits.

removeRedundantDummies: Remove redundant dummy variables

separate: Separate an editset into its disconnected blocks and simplify

simplify: Simplify logical mixed edits in an editset

softEdits: Derive editmatrix with soft constraints based on boundaries...

softEdits.cateditmatrix: Derive editmatrix with soft constraints. This is a utility...

softEdits.editarray: Derive editmatrix with soft constraints based on boundaries...

softEdits.editmatrix: Derive editmatrix with soft constraints based on boundaries...

subsetting: Row index operator for 'editmatrix'

substValue: Replace a variable by a value in a set of edits.

violatedEdits: Check data against constraints

writeELAsMip: Rewrite an editset and reported values into the components...


inst/script/bench/benchmip_mixed.R inst/script/bench/randomEdits.R inst/script/bench/edits.R inst/script/bench/benchmip_categorical.R inst/script/bench/benchAB.R inst/script/bench/benchMIP.R inst/script/bench/eliminator.R inst/script/bench/benchmip_mixed2.R inst/script/bench/benchmip_balance.R
tests/testthat/testeditmatrix.R tests/testthat/testlocalizeErrors_mip.R tests/testthat/testCheckDatamodel.R tests/testthat/testViolatedEdits.R tests/testthat/testFourierMotzkin.R tests/testthat/testSubstValue.R tests/testthat/testeditmatrixAttr.R
tests/testthat/testParseEdits.R tests/testthat/testIsFeasible.R tests/testthat/testIsObviouslyRedundant.R tests/testthat/testCheck.R tests/testthat/testechelon.R tests/testthat/testgetVars.R tests/testthat/testdatamodel.R tests/testthat/testIsObviouslyInfeasible.R tests/testthat/testErrorLocalizer.R tests/testthat/testEditset.R tests/testthat/testeditarray.R tests/testthat/testDuplicated.R tests/testthat/testBlocks.R tests/testthat/testc.R tests/testthat/testContains.R tests/testthat/testEditRow.R tests/testthat/testLocalizeErrors.R
R/editrules-data.R R/list2env.R R/echelon.R R/violatedEdits.R R/isSubset.R R/parseMix.R R/subsetting.R R/cateditmatrix.R R/duplicated.R R/errorLocation.R R/editmatrixAttr.R R/perturbWeights.R R/mip.R R/eliminate.R R/plot.R R/pkg.R R/reduce.R R/as.matrix.R R/c.R R/editset.R R/isObviouslyRedundant.R R/contains.R R/editmatrix.R R/disjunct.R R/editAttr.R R/getH.R R/softEdits.R R/parseNum.R R/removeRedundant.R R/plot_errorLocation.R R/editfile.R R/is.R R/checkRows.R R/print.R R/localizeErrors.R R/getUpperBounds.R R/as.igraph.R R/getVars.R R/expandEdits.R R/editarrayAttr.R R/isObviouslyInfeasible.R R/parseCat.R R/checkDatamodel.R R/backtracker.R R/summary.R R/str.R R/errorLocalizer_mip.R R/blocks.R R/isFeasible.R R/writeELAsMip.R R/errorLocalizer.R R/zzz.R R/substValue.R R/parseEdits.R R/adjacency.R R/editarray.R R/generateEdits.R
man/getSep.Rd man/editfile.Rd man/substValue.Rd man/as.character.cateditmatrix.Rd man/adjacency.Rd man/isFeasible.Rd man/adddummies.Rd man/isObviouslyInfeasible.Rd man/editnames.Rd man/impliedValues.Rd man/as.editset.Rd man/getVars.Rd man/neweditarray.Rd man/subsetting.Rd man/editmatrix.Rd man/nedits.Rd man/cateditmatrix.Rd man/expandEdits.Rd man/print.violatedEdits.Rd man/as.editmatrix.Rd man/errorLocalizer.Rd man/removeRedundantDummies.Rd man/parseEdits.Rd man/parseNum.Rd man/errorLocalizer_mip.Rd man/errorLocation.Rd man/is.editrules.Rd man/edits.Rd man/parseCat.Rd man/getInd.Rd man/datamodel.Rd man/editrules-package.Rd man/parseCatEdit.Rd man/print.editmatrix.Rd man/localize.Rd man/newerrorlocation.Rd man/softEdits.editmatrix.Rd man/fcf.env.Rd man/isNormalized.Rd man/as.lp.mip.Rd man/softEdits.Rd man/asLevels.Rd man/localizeErrors.Rd man/print.locationsummary.Rd man/getArr.Rd man/print.backtracker.Rd man/violatedEdits.Rd man/separate.Rd man/disjunct.Rd man/getVars.cateditmatrix.Rd man/parseMix.Rd man/indFromArray.Rd man/getA.Rd man/generateEdits.Rd man/editrules.plotting.Rd man/checkDatamodel.Rd man/editarray.Rd man/getOps.Rd man/ind2char.Rd man/writeELAsMip.Rd man/neweditmatrix.Rd man/contains.boolmat.Rd man/softEdits.cateditmatrix.Rd man/softEdits.editarray.Rd man/geth.Rd man/getVars.editmatrix.Rd man/eliminate.Rd man/getVars.editlist.Rd man/editset.Rd man/editType.Rd man/blocks.Rd man/print.editsummary.Rd man/getnames.Rd man/print.editset.Rd man/normalize.Rd man/simplify.Rd man/contains.Rd man/print.errorLocation.Rd man/reduce.Rd man/getlevels.Rd man/print.editlist.Rd man/duplicated.editarray.Rd man/duplicated.editmatrix.Rd man/getAb.Rd man/condition.Rd man/as.mip.Rd man/getUpperBounds.Rd man/getb.Rd man/isSubset.Rd man/echelon.Rd man/backtracker.Rd man/getVars.editarray.Rd man/print.cateditmatrix.Rd man/isObviouslyRedundant.Rd man/print.editarray.Rd

