sparseby | R Documentation |
Function sparseby is a modified version of by, the wrapper for tapply applied to data frames. It always returns a new data frame rather than a multi-way array.
sparseby(data, INDICES = list(), FUN, ..., GROUPNAMES = TRUE)
data | an R object, normally a data frame, possibly a matrix.
INDICES | a variable or list of variables indicating the subgroups of data.
FUN | a function to be applied to data frame subsets of data.
... | further arguments to FUN.
GROUPNAMES | a logical variable indicating whether the group names should be bound to the result.
A data frame or matrix is split by row into data frames or matrices, respectively, according to the values of one or more factors, and the function FUN is applied to each subset in turn.
sparseby is much faster and more memory efficient than by or tapply in the situation where the combinations of INDICES present in the data form a sparse subset of all possible combinations.
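To make the sparse case concrete, the following base R sketch shows how tapply allocates a cell for every possible combination of the indices, even when few of them occur. The data frame d and its columns g1, g2 and value are made up for illustration and do not come from the package.

```r
# Two grouping variables with 100 levels each, but only 100 of the
# 10,000 possible combinations actually occur in the data.
d <- data.frame(g1 = factor(1:100), g2 = factor(1:100), value = 1:100)

# tapply() builds a multi-way array with one cell per possible combination.
full <- tapply(d$value, list(d$g1, d$g2), sum)
dim(full)          # 100 rows by 100 columns
sum(!is.na(full))  # only 100 of the cells hold a result

# Number of index combinations actually present in the data.
present <- nrow(unique(d[, c("g1", "g2")]))
```

Here 9,900 of the 10,000 cells are empty, which is the kind of situation the paragraph above describes.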
A data frame or matrix containing the results of FUN applied to each subgroup of data. The result depends on what is returned from FUN:
If FUN returns NULL on any subsets, those are dropped.
If it returns a single value or a vector of values, the length must be consistent across all subgroups. These will be returned as values in rows of the resulting data frame or matrix.
If it returns data frames or matrices, they must all have the same number of columns, and they will be bound with rbind into a single data frame or matrix.
Names for the columns will be taken from the names in the list of INDICES or from the results of FUN, as appropriate.
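The return rules above can be approximated in base R. The helper below, sparseby_sketch, is a hypothetical name introduced here for illustration. It is a minimal sketch of the described behaviour, not the package's implementation, and it does not reproduce the GROUPNAMES handling or the column naming of the real function.

```r
# Hypothetical helper, NOT the package's implementation of sparseby().
sparseby_sketch <- function(data, INDICES, FUN, ...) {
  if (!is.list(INDICES)) INDICES <- list(INDICES)
  # drop = TRUE keeps only the index combinations present in the data.
  groups <- interaction(INDICES, drop = TRUE)
  rows <- split(seq_len(nrow(data)), groups)
  results <- lapply(rows, function(i) FUN(data[i, , drop = FALSE], ...))
  # Subsets for which FUN returned NULL are dropped.
  results <- results[!vapply(results, is.null, logical(1))]
  # Vectors become rows; data frames and matrices are stacked by row.
  do.call(rbind, results)
}

x <- data.frame(index = c(rep(1, 4), rep(2, 3)), value = 1:7)

# One row per group, holding the number of rows in that group.
counts <- sparseby_sketch(x, x$index, nrow)

# FUN returns NULL for groups with fewer than 4 rows, so group 2 is dropped.
means <- sparseby_sketch(x, x$index, function(s) {
  if (nrow(s) < 4) NULL else c(mean = mean(s$value))
})
```

Splitting on interaction(..., drop = TRUE) means that work is done only for the combinations that occur, which is the idea behind the efficiency claim. The real function may achieve this differently.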
Duncan Murdoch
tapply, by
x <- data.frame(index=c(rep(1,4),rep(2,3)),value=c(1:7))
x
sparseby(x,x$index,nrow)
# The version below works entirely in matrices
x <- as.matrix(x)
sparseby(x,list(group = x[,"index"]), function(subset) c(mean=mean(subset[,2])))