Description Usage Arguments Value Missing values Author(s) Examples
A long DataFrame can be reduced by merging certain rows into a
single one. The resulting variables are constructed as a SimpleList
containing all the original values. Invariant columns, i.e. columns
that have the same value along all the rows that need to be
merged, can be shrunk into new variables containing that
invariant value (rather than into list columns). The grouping of
rows, i.e. the rows that need to be shrunk together as one, is
defined by a vector.
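The idea can be illustrated without the package. The following is a minimal base R sketch, not the implementation of reduceDataFrame: the helper name reduce_sketch is hypothetical, and it uses a plain data.frame with list columns in place of a DataFrame with SimpleList columns.

```r
## Hypothetical helper (not part of any package): collapse the rows of a
## data.frame according to the grouping vector k.
reduce_sketch <- function(x, k) {
    stopifnot(length(k) == nrow(x))
    groups <- split(seq_len(nrow(x)), k)   # row indices of each group
    res <- data.frame(row.names = names(groups))
    for (nm in names(x)) {
        vals <- lapply(groups, function(i) x[[nm]][i])
        ## A column is treated as invariant if every group holds a
        ## single distinct value (assumption made for this sketch).
        invariant <- all(lengths(lapply(vals, unique)) == 1L)
        if (invariant) {
            ## keep one value per group in an ordinary column
            res[[nm]] <- unlist(lapply(vals, function(v) v[1L]),
                                use.names = FALSE)
        } else {
            ## keep all original values in a list column
            res[[nm]] <- I(unname(vals))
        }
    }
    res
}

d <- data.frame(k = rep(1:3, each = 3),
                x = letters[1:9],
                y = rep(letters[1:3], each = 3),
                stringsAsFactors = FALSE)
r <- reduce_sketch(d, d$k)
r
```

Here k and y are invariant within each group and stay ordinary columns, while x becomes a list column holding three values per row.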
The opposite operation is expand. Note, however, that for a
DataFrame to be expanded back, it must not have been simplified.
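As a rough illustration of expansion, the base R sketch below unrolls list columns back into one row per original value. The helper name expand_sketch is hypothetical and this is not the implementation of expandDataFrame. The sketch adopts the same constraint as stated above by requiring every column to be a list column. The source does not state why the real function needs this, so the check here is only an assumption.

```r
## Hypothetical helper (not part of any package): expand a reduced
## data.frame whose columns are all list columns.
expand_sketch <- function(x) {
    ## mirror the "not simplified" requirement: list columns only
    stopifnot(all(vapply(x, is.list, logical(1))))
    as.data.frame(lapply(x, unlist, use.names = FALSE),
                  stringsAsFactors = FALSE)
}

## A small reduced table with two groups of sizes 2 and 3
red <- data.frame(row.names = 1:2)
red$k <- I(list(c(1, 1), c(2, 2, 2)))
red$x <- I(list(c("a", "b"), c("c", "d", "e")))

long <- expand_sketch(red)
long
```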
reduceDataFrame(x, k, count = FALSE, simplify = TRUE, drop = FALSE)
expandDataFrame(x, k = NULL)
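x | The DataFrame to be reduced or expanded.
k | A ‘vector’ of length nrow(x) defining the grouping of the rows.
count | A logical specifying whether a tally of the number of rows in each group should be added. Default is FALSE.
simplify | A logical defining whether invariant columns should be simplified. Default is TRUE.
drop | A logical specifying whether non-invariant columns should be dropped. Default is FALSE.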
An expanded (reduced) DataFrame.
Missing values have an important effect on reduce. Unless all
values to be reduced are missing, they will result in a
non-invariant column, which will be dropped with drop = TRUE. See
the example below.
The presence of missing values can have side effects in higher
level functions that rely on reduction of DataFrame objects.
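The effect of a single missing value can be reproduced in base R. The sketch below assumes that a group counts as invariant when it holds exactly one distinct value; the helper name is_invariant is hypothetical and the source does not show the actual check used by the package.

```r
## Hypothetical invariance check: exactly one distinct value in the group
is_invariant <- function(v) length(unique(v)) == 1L

is_invariant(c("a", "a", "a"))   # TRUE: a single distinct value
is_invariant(c(NA, "a", "a"))    # FALSE: NA counts as a second value
is_invariant(c(NA, NA, NA))      # TRUE: all values are missing
```

Under this assumption, a group mixing NA with observed values is no longer invariant, which matches the behaviour described above for drop = TRUE.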
Laurent Gatto
library("IRanges")
k <- sample(100, 1e3, replace = TRUE)
df <- DataFrame(k = k,
                x = round(rnorm(length(k)), 2),
                y = seq_len(length(k)),
                z = sample(LETTERS, length(k), replace = TRUE),
                ir = IRanges(seq_along(k), width = 10),
                r = Rle(sample(5, length(k), replace = TRUE)),
                invar = k + 1)
df
## Shrinks the DataFrame
df2 <- reduceDataFrame(df, df$k)
df2
## With a tally of the number of members in each group
reduceDataFrame(df, df$k, count = TRUE)
## Much faster, but more crowded result
df3 <- reduceDataFrame(df, df$k, simplify = FALSE)
df3
## Drop all non-invariant columns
reduceDataFrame(df, df$k, drop = TRUE)
## Missing values
d <- DataFrame(k = rep(1:3, each = 3),
               x = letters[1:9],
               y = rep(letters[1:3], each = 3),
               y2 = rep(letters[1:3], each = 3))
d
## y is invariant and can be simplified
reduceDataFrame(d, d$k)
## y is not dropped
reduceDataFrame(d, d$k, drop = TRUE)
## BUT with a missing value
d[1, "y"] <- NA
d
## y isn't invariant/simplified anymore
reduceDataFrame(d, d$k)
## y now gets dropped
reduceDataFrame(d, d$k, drop = TRUE)