steptile: stepwise reordering (in extracat: Categorical Data Analysis and Visualization)

Description

Starts with the first `k+1` variables and applies optile to the corresponding subtable. After that, one additional variable at a time is reordered using the subtable defined by this variable and the `k` variables preceding it. Only the current variable is reordered; the others stay fixed because they were already reordered in previous steps.
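The sequence of subtables described above can be sketched in a few lines of base R. This is only an illustration of which variables enter each step, not the package's implementation; the helper name `steptile_windows` and its argument `p` (the number of variables) are hypothetical.

```r
# Hypothetical helper (not part of extracat): list the variables
# that make up the subtable used in each step.
steptile_windows <- function(p, k = 1) {
  stopifnot(k >= 1, p >= k + 1)
  # Step 1: the first k + 1 variables are reordered together.
  windows <- list(seq_len(k + 1))
  # Later steps: variable i plus its k predecessors; only i is reordered.
  for (i in seq_len(p)[-seq_len(k + 1)]) {
    windows[[length(windows) + 1]] <- (i - k):i
  }
  windows
}

w <- steptile_windows(p = 6, k = 3)
print(w[[1]])          # variables in the initial subtable
print(w[[length(w)]])  # variables used when variable 6 is reordered
```

With `p = 6` and `k = 3` the last window consists of variables 3 to 6, of which only variable 6 would be reordered.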

Usage

```
steptile(x, k = 1, cpcp = FALSE, ...)
```

Arguments

`x` The data, either as a `data.frame` (preferable for high-dimensional data) or as a `table`.

`k` The number of preceding variables used for the reordering. E.g. if `k = 3` then variable `6` is reordered using the variables `3, 4, 5, 6`.

`cpcp` If `TRUE`, a special version of the algorithm which minimizes crossings in CPCP plots (e.g. scpcp) is used. This modification works with aggregations of the last `k` variables and is much faster than the standard procedure if `k > 1`.

`...` Arguments passed to optile.

Details

The optile function also offers stepwise reordering via the argument `method = "sw"`, but it always starts with the first pair of variables and then considers the complete past: for the reordering of variable `i` all variables `1...(i-1)` are considered.

The stepwise algorithms are applicable to high-dimensional problems with a large number of variables where the multivariate techniques fail. Even if `k` is high (i.e. the subtables are also high-dimensional) the procedure is very fast since it can use the following trick: instead of applying optile to the multidimensional table, it is applied to a 2D table with one dimension defined by the variable that is reordered and the other dimension defined by the (ordered) combinations of all other variables. This way only combinations which appear at least once in the dataset matter, and all empty entries (the majority in high-dimensional tables) can be left aside. The maximum possible size of such a table is therefore N * max(n_i), where N is the number of observations and n_i is the number of categories in dimension `i`.
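The size argument behind the 2D table can be illustrated with base R alone. The following is a minimal sketch on simulated data, not the code used inside extracat: the toy data frame `d`, the choice of six variables with five levels each, and the use of `interaction()` to form the combinations are all assumptions made for the example, and the row order produced here is not the ordering from previous reordering steps.

```r
set.seed(1)

# Toy data: 200 observations of 6 categorical variables with 5 levels each.
d <- lapply(1:6, function(j) {
  factor(sample(letters[1:5], 200, replace = TRUE), levels = letters[1:5])
})
names(d) <- paste0("V", 1:6)
d <- as.data.frame(d)

target <- 6    # variable that would be reordered
others <- 1:5  # variables that stay fixed

# Full multidimensional table: one cell per possible level combination.
full_cells <- prod(sapply(d[c(others, target)], nlevels))

# 2D alternative: rows are the combinations observed at least once,
# columns are the categories of the target variable.
combos <- interaction(d[others], drop = TRUE)
tab2d <- table(combos, d[[target]])

print(full_cells)  # 5^6 possible cells in the full table
print(dim(tab2d))  # at most nrow(d) rows, and 5 columns
```

Because every row of the 2D table corresponds to at least one observation, its row count can never exceed the number of observations, which gives the N * max(n_i) bound regardless of how many variables are combined.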

Value

The reordered data either as a `table` or `data.frame` depending on the input type.

Author(s)

Alexander Pilhoefer

Examples

```
## Not run: 

# scaled numeric variables from the olives data
# and 20 k-means solutions
so <- scale(olives[,3:10])
rr <- replicate(20,{
	kmeans(so,8)$cluster
})

# par(mfrow=c(3,1))

# initial cluster orders
x <- as.data.frame(cbind(olives[,1:2],rr))
require(scales)
scpcp(x, sel = "data[,1]", sel.palette="rgb", col.opt=list(alpha=0.5))

# reordering using steptile.
# optile does not work for the complete table since it has 9*3*2^60 > 3E19 entries
# colors by the first unordered example:
x2 <- steptile(x, k = 4)
scpcp(x2, sel = "match(data[,1],levels(.GlobalEnv$x[,1]))",
	sel.palette="rgb", col.opt=list(alpha=0.5))

# additionally reordering the variables ... cmat takes about 20-30 seconds
CM <- cmat(x[,3:22])
require(seriation)
sM <- get_order(seriate(1-CM))

x3 <- steptile(x2[, c(1,2,2+sM,23)], k = 4)
scpcp(x3, sel = "match(data[,1],levels(.GlobalEnv$x[,1]))",
	sel.palette="rgb", col.opt=list(alpha=0.5))

## End(Not run)
```