adjacency: Derive adjacency matrix from collection of edits In editrules: Parsing, Applying, and Manipulating Data Cleaning Rules

Description

A set of edits can be represented as a graph where every vertex is an edit. Two vertices are connected if they have at least one variable in `vars` in common.
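The connection rule can be illustrated without the package. The sketch below is only an illustration in base R: the edit names, the variable names and the helper `connected` are hypothetical and are not part of editrules.

```r
# Hypothetical edits, each described only by the variables it mentions.
edit_vars <- list(
  e1 = c("x", "y"),
  e2 = c("y", "z"),
  e3 = c("z", "w")
)

# Hypothetical helper: TRUE when edits `a` and `b` share at least one
# variable that is also listed in `vars`.
connected <- function(a, b, vars) {
  common <- intersect(edit_vars[[a]], edit_vars[[b]])
  length(intersect(common, vars)) > 0
}

all_vars <- c("x", "y", "z", "w")
connected("e1", "e2", vars = all_vars)  # e1 and e2 both mention y
connected("e1", "e3", vars = all_vars)  # no variable in common
connected("e1", "e2", vars = "x")       # y is not in `vars`, so not connected
```

Restricting `vars` therefore removes every connection that exists only because of a variable outside the selection.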

Usage

```
adjacency(E, nodetype = c("all", "rules", "vars"),
  rules = rownames(E), vars = getVars(E), ...)

## S3 method for class 'editmatrix'
adjacency(E, nodetype = c("all", "rules", "vars"),
  rules = rownames(E), vars = getVars(E), ...)

## S3 method for class 'editarray'
adjacency(E, nodetype = c("all", "rules", "vars"),
  rules = rownames(E), vars = getVars(E), ...)

## S3 method for class 'editset'
adjacency(E, nodetype = c("all", "rules", "vars"),
  rules = c(rownames(E$num), rownames(E$mixcat)),
  vars = getVars(E), ...)

## S3 method for class 'editmatrix'
as.igraph(x, nodetype = c("all", "rules", "vars"),
  rules = editnames(x), vars = getVars(x), weighted = TRUE, ...)

## S3 method for class 'editarray'
as.igraph(x, nodetype = c("all", "rules", "vars"),
  rules = editnames(x), vars = getVars(x), weighted = TRUE, ...)

## S3 method for class 'editset'
as.igraph(x, nodetype = c("all", "rules", "vars"),
  rules = editnames(x), vars = getVars(x), weighted = TRUE, ...)
```

Arguments

| Argument | Description |
|---|---|
| `E` | `editmatrix`, `editarray` or `editset` |
| `nodetype` | adjacency between rules, vars or both? |
| `rules` | selection of edits |
| `vars` | selection of variables |
| `...` | arguments to be passed to or from other methods |
| `x` | An object of class `editmatrix`, `editarray` or `editset` |
| `weighted` | see `graph.adjacency` |

Details

`adjacency` returns the adjacency matrix. Each element of the matrix counts the number of variables shared by the edits named in the corresponding row and column. The adjacency matrix can be converted to an igraph object with `graph.adjacency` from the `igraph` package.
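The counting described above can be sketched in base R with a matrix product. This is a minimal illustration under assumptions, not the package's implementation: the incidence matrix `inc` and its edit and variable names are made up, and setting the diagonal to zero is a choice made here so that an edit is not counted as adjacent to itself.

```r
# Hypothetical incidence matrix: rows are edits, columns are variables;
# 1 means the variable occurs in the edit.
inc <- rbind(
  e1 = c(x = 1, y = 1, z = 0),
  e2 = c(x = 0, y = 1, z = 1),
  e3 = c(x = 1, y = 1, z = 1)
)

# Edit-by-edit counts of shared variables: inc %*% t(inc).
shared <- tcrossprod(inc)
diag(shared) <- 0  # assumption of this sketch: no self-adjacency
shared

# Restricting the variables (the role of `vars`) changes the counts.
shared_y <- tcrossprod(inc[, "y", drop = FALSE])
diag(shared_y) <- 0
shared_y
```

In this sketch `e1` and `e3` share two variables (`x` and `y`), while restricting to `y` leaves every pair of edits with exactly one shared variable.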

`as.igraph` converts a set of edits to an `igraph` object directly.

Value

The adjacency matrix of the edits in `E` with respect to the variables in `vars`.

See Also

`plot.editmatrix`, `plot.editarray`, `plot.editset`

Examples

```
library(editrules)

## Examples with linear (in)equality edits

# load predefined edits from package
data(edits)
edits

# convert to editmatrix
E <- editmatrix(edits)

## Not run:
# (Note to reader: the Not run directive only prevents the example commands
# from running when the package is built)

# Total edit graph
plot(E)

# Graph with dependent edits
plot(E, nodetype = "rules")

# Graph with dependent variables
plot(E, nodetype = "vars")

# Total edit graph, but with curved lines (option from igraph package)
plot(E, edge.curved = TRUE)

# graph, plotting just the connections caused by variable 't'
plot(E, vars = 't')

## End(Not run)

# here's an example with a broken record.
r <- c(ct = 100, ch = 30, cp = 70, p = 30, t = 130)
violatedEdits(E, r)
errorLocalizer(E, r)$searchBest()$adapt

# we color the violated edits and the variables that have to be adapted

## Not run:
set.seed(1) # (for reproducibility)
plot(E,
     adapt = errorLocalizer(E, r)$searchBest()$adapt,
     violated = violatedEdits(E, r))

## End(Not run)

# extract total graph (as igraph object)
as.igraph(E)

# extract graph with edges related to variables 't' and 'ch'
as.igraph(E, vars = c('t', 'ch'))

# extract total adjacency matrix
adjacency(E)

# extract adjacency matrix related to variables 't' and 'ch'
adjacency(E, vars = c('t', 'ch'))

## Examples with categorical edits

# generate an editarray:
E <- editarray(expression(
    age %in% c('<15', '16-65', '>65'),
    employment %in% c('unemployed', 'employed', 'retired'),
    salary %in% c('none', 'low', 'medium', 'high'),
    if (age == '<15') employment == 'unemployed',
    if (salary != 'none') employment != 'unemployed',
    if (employment == 'unemployed') salary == 'none'))

## Not run:
# plot total edit graph
plot(E)

# plot with a different layout
plot(E, layout = layout.circle)

# plot edit graph, just the connections caused by 'salary'
plot(E, vars = 'salary')

## End(Not run)

# extract edit graph
as.igraph(E)

# extract edit graph, just the connections caused by 'salary'
as.igraph(E, vars = 'salary')

# extract adjacency matrix
adjacency(E)

# extract adjacency matrix, only caused by 'employment'
adjacency(E, vars = 'employment')
```