reduceDataFrame  R Documentation 
Reduces and expands a DataFrame
A long DataFrame can be reduced by merging certain rows into a
single one. The merged variables are constructed as a SimpleList
containing all the original values. Invariant columns, i.e. columns
that have the same value along all the rows that need to be
merged, can be shrunk into new variables containing that
invariant value (rather than into list columns). The grouping of
rows, i.e. the rows that need to be shrunk together as one, is
defined by a vector.
The opposite operation is expand. Note, however, that for a DataFrame
to be expanded back, it must not have been simplified.
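The reduction described above can be sketched in base R. This is only an illustration of the grouping idea on a plain data.frame, not the package's implementation; the toy data and the variable names (groups, invariant, x_reduced, y_reduced) are assumptions made for this example.

```r
## Base R sketch of the grouping idea (illustration only)
df <- data.frame(k = c(1, 1, 2, 2),
                 x = c("a", "b", "c", "d"),
                 y = c("u", "u", "v", "v"),
                 stringsAsFactors = FALSE)

## Split every column by the grouping vector k
groups <- lapply(df, split, f = df$k)

## A column is invariant if every group holds a single distinct value
invariant <- vapply(groups,
                    function(g) all(lengths(lapply(g, unique)) == 1L),
                    logical(1))
invariant

## The invariant column y collapses to one value per group
y_reduced <- vapply(groups$y, unique, character(1))

## The non-invariant column x stays a list holding all original values
x_reduced <- groups$x
```

Here k and y are invariant within each group, while x is not and is therefore kept as a list with one element per group.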
reduceDataFrame(x, k, count = FALSE, simplify = TRUE, drop = FALSE)
expandDataFrame(x, k = NULL)
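x
The DataFrame to be reduced or expanded.
k
A 'vector' of length nrow(x) defining the grouping of the rows of x.
count
A logical(1) specifying whether a tally of the number of rows merged into each new row should be added. Default is FALSE.
simplify
A logical(1) specifying whether invariant columns should be simplified to their single invariant value. Default is TRUE.
drop
A logical(1) specifying whether non-invariant columns should be dropped altogether. Default is FALSE.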
An expanded (reduced) DataFrame.
Missing values have an important effect on reduce. Unless all
values to be reduced are missing, they will result in a
non-invariant column, which will be dropped with drop = TRUE. See
the example below.
The presence of missing values can have side effects in higher
level functions that rely on reduction of DataFrame objects.
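The effect of a missing value on invariance can be illustrated in base R. The sketch assumes that a group is treated as invariant when it holds a single distinct value, with NA counted as a value of its own, which matches the behaviour described above; is_invariant is a hypothetical helper written for this example, not a function of the package.

```r
## Hypothetical helper: TRUE if a group holds a single distinct value
is_invariant <- function(g) length(unique(g)) == 1L

## One group where all values agree except for a missing one
g <- c(NA, "a", "a")
unique(g)             # NA is kept as a distinct value
is_invariant(g)       # FALSE: the group is no longer invariant

## A group where all values are missing
all_na <- c(NA, NA, NA)
is_invariant(all_na)  # TRUE: only one distinct value, NA
```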
Laurent Gatto
library("IRanges")
## Assumption: reduceDataFrame() and expandDataFrame() come from QFeatures
library("QFeatures")
k <- sample(100, 1e3, replace = TRUE)
df <- DataFrame(k = k,
                x = round(rnorm(length(k)), 2),
                y = seq_len(length(k)),
                z = sample(LETTERS, length(k), replace = TRUE),
                ir = IRanges(seq_along(k), width = 10),
                r = Rle(sample(5, length(k), replace = TRUE)),
                invar = k + 1)
df
## Shrinks the DataFrame
df2 <- reduceDataFrame(df, df$k)
df2
## With a tally of the number of members in each group
reduceDataFrame(df, df$k, count = TRUE)
## Much faster, but more crowded result
df3 <- reduceDataFrame(df, df$k, simplify = FALSE)
df3
## Drop all noninvariant columns
reduceDataFrame(df, df$k, drop = TRUE)
## Missing values
d <- DataFrame(k = rep(1:3, each = 3),
               x = letters[1:9],
               y = rep(letters[1:3], each = 3),
               y2 = rep(letters[1:3], each = 3))
d
## y is invariant and can be simplified
reduceDataFrame(d, d$k)
## y is invariant, so it is not dropped
reduceDataFrame(d, d$k, drop = TRUE)
## BUT with a missing value
d[1, "y"] <- NA
d
## y isn't invariant/simplified anymore
reduceDataFrame(d, d$k)
## y now gets dropped
reduceDataFrame(d, d$k, drop = TRUE)