# Transformations: Functions for Data Transformation In coin: Conditional Inference Procedures in a Permutation Test Framework

## Description

Transformations for factors and numeric variables.

## Usage

```r
id_trafo(x)
rank_trafo(x, ties.method = c("mid-ranks", "random"))
normal_trafo(x, ties.method = c("mid-ranks", "average-scores"))
median_trafo(x, mid.score = c("0", "0.5", "1"))
savage_trafo(x, ties.method = c("mid-ranks", "average-scores"))
consal_trafo(x, ties.method = c("mid-ranks", "average-scores"), a = 5)
koziol_trafo(x, ties.method = c("mid-ranks", "average-scores"), j = 1)
klotz_trafo(x, ties.method = c("mid-ranks", "average-scores"))
mood_trafo(x, ties.method = c("mid-ranks", "average-scores"))
ansari_trafo(x, ties.method = c("mid-ranks", "average-scores"))
fligner_trafo(x, ties.method = c("mid-ranks", "average-scores"))
logrank_trafo(x, ties.method = c("mid-ranks", "Hothorn-Lausen",
                                 "average-scores"),
              weight = logrank_weight, ...)
logrank_weight(time, n.risk, n.event,
               type = c("logrank", "Gehan-Breslow", "Tarone-Ware",
                        "Peto-Peto", "Prentice", "Prentice-Marek",
                        "Andersen-Borgan-Gill-Keiding",
                        "Fleming-Harrington", "Gaugler-Kim-Liao", "Self"),
               rho = NULL, gamma = NULL)
f_trafo(x)
of_trafo(x, scores = NULL)
zheng_trafo(x, increment = 0.1)
maxstat_trafo(x, minprob = 0.1, maxprob = 1 - minprob)
fmaxstat_trafo(x, minprob = 0.1, maxprob = 1 - minprob)
ofmaxstat_trafo(x, minprob = 0.1, maxprob = 1 - minprob)
trafo(data, numeric_trafo = id_trafo, factor_trafo = f_trafo,
      ordered_trafo = of_trafo, surv_trafo = logrank_trafo,
      var_trafo = NULL, block = NULL)
mcp_trafo(...)
```

## Arguments

| Argument | Description |
|---|---|
| `x` | an object of class `"numeric"`, `"factor"`, `"ordered"` or `"Surv"`. |
| `ties.method` | a character, the method used to handle ties. The score generating function either uses the mid-ranks (`"mid-ranks"`, default) or, in the case of `rank_trafo`, randomly broken ties (`"random"`). Alternatively, the average of the scores resulting from applying the score generating function to randomly broken ties is used (`"average-scores"`). See `logrank_test` for a detailed description of the methods used in `logrank_trafo`. |
| `mid.score` | a character, the score assigned to observations exactly equal to the median: either 0 (`"0"`, default), 0.5 (`"0.5"`) or 1 (`"1"`); see `median_test`. |
| `a` | a numeric vector, the values taken as the constant a in the Conover-Salsburg scores. Defaults to `5`. |
| `j` | a numeric, the value taken as the constant j in the Koziol-Nemec scores. Defaults to `1`. |
| `weight` | a function where the first three arguments must correspond to `time`, `n.risk`, and `n.event` given below. Defaults to `logrank_weight`. |
| `time` | a numeric vector, the ordered distinct time points. |
| `n.risk` | a numeric vector, the number of subjects at risk at each time point specified in `time`. |
| `n.event` | a numeric vector, the number of events at each time point specified in `time`. |
| `type` | a character, one of `"logrank"` (default), `"Gehan-Breslow"`, `"Tarone-Ware"`, `"Peto-Peto"`, `"Prentice"`, `"Prentice-Marek"`, `"Andersen-Borgan-Gill-Keiding"`, `"Fleming-Harrington"`, `"Gaugler-Kim-Liao"` or `"Self"`; see `logrank_test`. |
| `rho` | a numeric vector, the ρ constant when `type` is `"Tarone-Ware"`, `"Fleming-Harrington"`, `"Gaugler-Kim-Liao"` or `"Self"`; see `logrank_test`. Defaults to `NULL`, implying `0.5` for `type = "Tarone-Ware"` and `0` otherwise. |
| `gamma` | a numeric vector, the γ constant when `type` is `"Fleming-Harrington"`, `"Gaugler-Kim-Liao"` or `"Self"`; see `logrank_test`. Defaults to `NULL`, implying `0`. |
| `scores` | a numeric vector or list, the scores corresponding to each level of an ordered factor. Defaults to `NULL`, implying `1:nlevels(x)`. |
| `increment` | a numeric, the score increment between the order-restricted sets of scores. A fraction greater than 0, but smaller than or equal to 1. Defaults to `0.1`. |
| `minprob` | a numeric, a fraction between 0 and 0.5; see `maxstat_test`. Defaults to `0.1`. |
| `maxprob` | a numeric, a fraction between 0.5 and 1; see `maxstat_test`. Defaults to `1 - minprob`. |
| `data` | an object of class `"data.frame"`. |
| `numeric_trafo` | a function to be applied to elements of class `"numeric"` in `data`, returning a matrix with `nrow(data)` rows and an arbitrary number of columns. Defaults to `id_trafo`. |
| `factor_trafo` | a function to be applied to elements of class `"factor"` in `data`, returning a matrix with `nrow(data)` rows and an arbitrary number of columns. Defaults to `f_trafo`. |
| `ordered_trafo` | a function to be applied to elements of class `"ordered"` in `data`, returning a matrix with `nrow(data)` rows and an arbitrary number of columns. Defaults to `of_trafo`. |
| `surv_trafo` | a function to be applied to elements of class `"Surv"` in `data`, returning a matrix with `nrow(data)` rows and an arbitrary number of columns. Defaults to `logrank_trafo`. |
| `var_trafo` | an optional named list of functions to be applied to the corresponding variables in `data`. Defaults to `NULL`. |
| `block` | an optional factor whose levels are interpreted as blocks. `trafo` is applied to each level of `block` separately. Defaults to `NULL`. |
| `...` | `logrank_trafo()`: further arguments to be passed to `weight`. `mcp_trafo()`: factor name and contrast matrix (as matrix or character) in a tag = value format for multiple comparisons based on a single unordered factor; see `mcp` in package multcomp. |

## Details

The utility functions documented here are used to define specialized test procedures.

`id_trafo` is the identity transformation.

`rank_trafo`, `normal_trafo`, `median_trafo`, `savage_trafo`, `consal_trafo` and `koziol_trafo` compute rank scores, normal scores, median scores, Savage scores, Conover-Salsburg scores (see `neuropathy`) and Koziol-Nemec scores, respectively, for location problems.
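As a rough illustration of two of these location scores, the sketch below compares `rank_trafo` and `normal_trafo` with base-R expressions. The closed form `qnorm(rank / (n + 1))` for normal scores is an assumption inferred from the example output on this page, not a statement about the package internals, and the vectors `x_ties` and `y` are made-up toy data.

```r
library(coin)

## Mid-ranks: tied observations share the average of their ranks
x_ties <- c(10, 20, 20, 30)
r <- as.numeric(rank_trafo(x_ties))
r

## Normal scores for untied data, compared with the assumed closed form
y <- c(0.73, 0.74, 0.61, 0.81, 0.71, 0.95)
ns <- as.numeric(normal_trafo(y))
ns_manual <- qnorm(rank(y) / (length(y) + 1))
all.equal(ns, ns_manual)
```

Because rank-based scores depend only on the ordering of the data, any untied sample of size six with the same ordering yields the same normal scores.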

`klotz_trafo`, `mood_trafo`, `ansari_trafo` and `fligner_trafo` compute Klotz scores, Mood scores, Ansari-Bradley scores and Fligner-Killeen scores, respectively, for scale problems.
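The scale scores can be checked against their textbook definitions in the same spirit. The formulas below (squared centered ranks for Mood, squared normal scores for Klotz) are the usual definitions and are assumed, not confirmed by this page, to match what the package computes for untied data; `y` is a hypothetical sample.

```r
library(coin)

y <- c(2.1, 0.4, 3.3, 1.7, 5.0)
n <- length(y)
r <- rank(y)

## Mood scores: squared distance of each rank from the mean rank (assumed form)
mood_manual <- (r - (n + 1) / 2)^2

## Klotz scores: squared normal scores (assumed form)
klotz_manual <- qnorm(r / (n + 1))^2

mood_ok  <- all.equal(as.numeric(mood_trafo(y)),  mood_manual)
klotz_ok <- all.equal(as.numeric(klotz_trafo(y)), klotz_manual)
c(mood = isTRUE(mood_ok), klotz = isTRUE(klotz_ok))
```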

`logrank_trafo` computes weighted logrank scores for right-censored data, allowing for a user-defined weight function through the `weight` argument (see `GTSG`).
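A minimal sketch of how logrank scores might be obtained for a small right-censored sample follows. The data are hypothetical, and the second call relies on the documented behaviour that extra arguments in `...` are forwarded to the weight function (here the default `logrank_weight`).

```r
library(coin)   # also attaches survival, which provides Surv()

## Hypothetical right-censored sample: event = 1 (observed), 0 (censored)
time  <- c(5, 8, 12, 15, 21, 30)
event <- c(1, 1, 0, 1, 0, 1)
s <- Surv(time, event)

## Default (unweighted) logrank scores
lr <- logrank_trafo(s)

## Gehan-Breslow weights; `type` is passed on to logrank_weight() via `...`
gb <- logrank_trafo(s, type = "Gehan-Breslow")

cbind(time, event, logrank = as.numeric(lr), gehan = as.numeric(gb))
```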

`f_trafo` computes dummy matrices for factors and `of_trafo` assigns scores to ordered factors. For ordered factors with two levels, the scores are normalized to the [0, 1] range. `zheng_trafo` computes a finite collection of order-restricted scores for ordered factors (see `jobsatisfaction`, `malformations` and `vision`).

`maxstat_trafo`, `fmaxstat_trafo` and `ofmaxstat_trafo` compute scores for cutpoint problems (see `maxstat_test`).
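To make the cutpoint coding concrete, the sketch below builds the indicator matrix for a numeric variable by hand and sets it beside `maxstat_trafo`. The hand-rolled version is a simplification that treats every distinct value except the maximum as a candidate cutpoint and ignores the `minprob`/`maxprob` restriction; for this small made-up sample and the default settings the two are expected to agree.

```r
library(coin)

y <- c(0.73, 0.74, 0.61, 0.81, 0.71, 0.95)

## One column per candidate cutpoint, coded 1 if the observation is <= cutpoint
mt <- maxstat_trafo(y)
colnames(mt)
dim(mt)

## Simplified base-R sketch of the same coding (no minprob/maxprob handling)
cuts <- head(sort(unique(y)), -1)
manual <- sapply(cuts, function(cp) as.integer(y <= cp))
manual
```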

`trafo` applies its arguments to the elements of `data` according to the classes of the elements. A `trafo` function with modified default arguments is usually supplied to `independence_test` via the `xtrafo` or `ytrafo` arguments. Fine tuning, i.e., different transformations for different variables, is possible by supplying a named list of functions to the `var_trafo` argument.
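The following sketch shows one way a modified `trafo` might be handed to `independence_test` through `ytrafo`. Rank-transforming the response in a two-sample problem should reproduce the standardized statistic of `wilcox_test` up to sign; the simulated data and the seed are arbitrary choices for illustration.

```r
library(coin)

set.seed(290875)
dta <- data.frame(y = rnorm(20), x = gl(2, 10))

## Rank-transform the response; the factor keeps its default dummy coding
it <- independence_test(y ~ x, data = dta,
                        ytrafo = function(data)
                            trafo(data, numeric_trafo = rank_trafo))

## The dedicated Wilcoxon-Mann-Whitney procedure, for comparison
wt <- wilcox_test(y ~ x, data = dta)

c(independence_test = abs(as.numeric(statistic(it))),
  wilcox_test       = abs(as.numeric(statistic(wt))))
```

Wrapping `trafo` in an anonymous function is what lets the defaults be changed while still giving `independence_test` a function of a single `data` argument.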

`mcp_trafo` computes contrast matrices for factors.

## Value

A numeric vector or matrix with `nrow(x)` rows and an arbitrary number of columns. For `trafo`, a named matrix with `nrow(data)` rows and an arbitrary number of columns.

## Note

As of coin version 1.1-0, all transformation functions pass missing values (i.e., `NA`s) through. Furthermore, `median_trafo` and `logrank_trafo` are now increasing functions (in conformity with most other transformations in this package).
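A short sketch of the missing-value behaviour described in this note, using a made-up vector: the `NA` is expected to stay in place while the remaining observations are ranked among themselves.

```r
library(coin)

x <- c(3, NA, 1, 2)

## The missing value is kept; the other entries are ranked among themselves
r <- rank_trafo(x)
r
```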

## Examples

```r
library(coin)

## Dummy matrix, two-sample problem (only one column)
f_trafo(gl(2, 3))

## Dummy matrix, K-sample problem (K columns)
x <- gl(3, 2)
f_trafo(x)

## Score matrix
ox <- as.ordered(x)
of_trafo(ox)
of_trafo(ox, scores = c(1, 3:4))
of_trafo(ox, scores = list(s1 = 1:3, s2 = c(1, 3:4)))
zheng_trafo(ox, increment = 1/3)

## Normal scores
y <- runif(6)
normal_trafo(y)

## All together now
trafo(data.frame(x = x, ox = ox, y = y), numeric_trafo = normal_trafo)

## The same, but allows for fine-tuning
trafo(data.frame(x = x, ox = ox, y = y), var_trafo = list(y = normal_trafo))

## Transformations for maximally selected statistics
maxstat_trafo(y)
fmaxstat_trafo(x)
ofmaxstat_trafo(ox)

## Apply transformation blockwise (as in the Friedman test)
trafo(data.frame(y = 1:20), numeric_trafo = rank_trafo, block = gl(4, 5))

## Multiple comparisons
dta <- data.frame(x)
mcp_trafo(x = "Tukey")(dta)

## The same, but useful when specific contrasts are desired
K <- rbind("2 - 1" = c(-1,  1, 0),
           "3 - 1" = c(-1,  0, 1),
           "3 - 2" = c( 0, -1, 1))
mcp_trafo(x = K)(dta)
```

### Example output

```
Loading required package: survival
  1
1 1
2 1
3 1
4 0
5 0
6 0
  1 2 3
1 1 0 0
2 1 0 0
3 0 1 0
4 0 1 0
5 0 0 1
6 0 0 1
attr(,"assign")
[1] 1 1 1
attr(,"contrasts")
attr(,"contrasts")$x
[1] "contr.treatment"

  [,1]
1    1
2    1
3    2
4    2
5    3
6    3
  [,1]
1    1
2    1
3    3
4    3
5    4
6    4
  s1 s2
1  1  1
2  1  1
3  2  3
4  2  3
5  3  4
6  3  4
  gamma = (0.0000, 0.0000, 1.0000) gamma = (0.0000, 0.3333, 1.0000)
1                                0                        0.0000000
2                                0                        0.0000000
3                                0                        0.3333333
4                                0                        0.3333333
5                                1                        1.0000000
6                                1                        1.0000000
  gamma = (0.0000, 0.6667, 1.0000) gamma = (0.0000, 1.0000, 1.0000)
1                        0.0000000                                0
2                        0.0000000                                0
3                        0.6666667                                1
4                        0.6666667                                1
5                        1.0000000                                1
6                        1.0000000                                1
[1] -0.1800124  0.1800124 -1.0675705  0.5659488 -0.5659488  1.0675705
  x.1 x.2 x.3 ox          y
1   1   0   0  1 -0.1800124
2   1   0   0  1  0.1800124
3   0   1   0  2 -1.0675705
4   0   1   0  2  0.5659488
5   0   0   1  3 -0.5659488
6   0   0   1  3  1.0675705
attr(,"assign")
[1] 1 1 1 2 3
  x.1 x.2 x.3 ox          y
1   1   0   0  1 -0.1800124
2   1   0   0  1  0.1800124
3   0   1   0  2 -1.0675705
4   0   1   0  2  0.5659488
5   0   0   1  3 -0.5659488
6   0   0   1  3  1.0675705
attr(,"assign")
[1] 1 1 1 2 3
  x <= 0.611 x <= 0.707 x <= 0.73 x <= 0.744 x <= 0.808
1          0          0         1          1          1
2          0          0         0          1          1
3          1          1         1          1          1
4          0          0         0          0          1
5          0          1         1          1          1
6          0          0         0          0          0
  {1} vs. {2, 3} {1, 2} vs. {3} {1, 3} vs. {2}
1              1              1              1
2              1              1              1
3              0              1              0
4              0              1              0
5              0              0              1
6              0              0              1
  {1} vs. {2, 3} {1, 2} vs. {3}
1              1              1
2              1              1
3              0              1
4              0              1
5              0              0
6              0              0

 [1,] 1
 [2,] 2
 [3,] 3
 [4,] 4
 [5,] 5
 [6,] 1
 [7,] 2
 [8,] 3
 [9,] 4
[10,] 5
[11,] 1
[12,] 2
[13,] 3
[14,] 4
[15,] 5
[16,] 1
[17,] 2
[18,] 3
[19,] 4
[20,] 5
attr(,"assign")
[1] 1
  2 - 1 3 - 1 3 - 2
1    -1    -1     0
2    -1    -1     0
3     1     0    -1
4     1     0    -1
5     0     1     1
6     0     1     1
attr(,"assign")
[1] 1 1 1
attr(,"contrast")

Multiple Comparisons of Means: Tukey Contrasts

       1  2 3
2 - 1 -1  1 0
3 - 1 -1  0 1
3 - 2  0 -1 1
  2 - 1 3 - 1 3 - 2
1    -1    -1     0
2    -1    -1     0
3     1     0    -1
4     1     0    -1
5     0     1     1
6     0     1     1
attr(,"assign")
[1] 1 1 1
attr(,"contrast")

Multiple Comparisons of Means: User-defined Contrasts

       1  2 3
2 - 1 -1  1 0
3 - 1 -1  0 1
3 - 2  0 -1 1
```

coin documentation built on Feb. 8, 2021, 5:06 p.m.