
Calculates groupwise summary statistics, much like the SUMMARY procedure in SAS.

```
summary_by(
  data,
  formula,
  id = NULL,
  FUN = mean,
  keep.names = FALSE,
  p2d = FALSE,
  order = TRUE,
  full.dimension = FALSE,
  var.names = NULL,
  fun.names = NULL,
  ...
)

summaryBy(
  formula,
  data = parent.frame(),
  id = NULL,
  FUN = mean,
  keep.names = FALSE,
  p2d = FALSE,
  order = TRUE,
  full.dimension = FALSE,
  var.names = NULL,
  fun.names = NULL,
  ...
)
```

| Argument | Description |
| --- | --- |
| `data` | A data frame. |
| `formula` | A formula object; see the examples below. |
| `id` | A formula specifying variables that the data are not grouped by but that should appear in the output. See the examples below. |
| `FUN` | A function or a list of functions to be applied; see the examples below. |
| `keep.names` | If `TRUE` and there is only one function in `FUN`, the variables in the output have the same names as the variables in the input; see the examples. |
| `p2d` | Should parentheses in output variable names be replaced by dots? |
| `order` | Should the resulting data frame be ordered by the variables on the right-hand side of the formula (using `orderBy`)? |
| `full.dimension` | If `TRUE`, rows of summary statistics are repeated so that the result has the same number of rows as the input dataset. |
| `var.names` | Lets the user specify the names of the variables on the left-hand side. |
| `fun.names` | Lets the user specify names for the functions applied to the variables on the left-hand side. |
| `...` | Additional arguments passed to `FUN`, for example arguments that control NA handling. |

Extra arguments (`...`) are passed on to the functions in `FUN`. Care must therefore be taken that every function in `FUN` accepts these arguments; alternatively, one can write a wrapper function that gets around the problem. This is a particular issue when handling NAs. See the examples below. Some code for this function was suggested by Jim Robison-Cox. Thanks.
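The point about `...` can be illustrated with a minimal sketch. It assumes the doBy package is installed for the last step; `n_nonmissing` is a hypothetical helper introduced here for illustration and is not part of doBy. The base R part shows why `length` cannot receive `na.rm`, and the wrapper shows one way to let a counting function sit alongside `mean` and `var`.

```r
# length() takes exactly one argument, so forwarding na.rm = TRUE to it fails.
x <- c(4, NA, 7, 9)
length_result <- tryCatch(
  length(x, na.rm = TRUE),
  error = function(e) "error"
)

# Hypothetical wrapper: accepts and ignores '...', counts non-missing values.
n_nonmissing <- function(x, ...) sum(!is.na(x))

# With the wrapper, na.rm = TRUE can be passed once for all functions in FUN.
# Guarded so the sketch still runs where doBy is not installed.
if (requireNamespace("doBy", quietly = TRUE)) {
  out <- doBy::summaryBy(Ozone ~ Month, data = airquality,
                         FUN = c(mean, var, n_nonmissing), na.rm = TRUE)
  print(out)
}
```

The wrapper ignores the forwarded arguments rather than interpreting them, which is a design choice for this sketch; a stricter helper could inspect `na.rm` and behave accordingly.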

A data frame.
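To make the shape of the returned data frame more concrete, the following sketch compares the default call with `keep.names = TRUE` and `full.dimension = TRUE`. It assumes the doBy package is installed and uses the `warpbreaks` data from base R; the object names `by_default`, `by_keep` and `by_full` are chosen here for illustration only.

```r
# Guarded so the sketch is skipped where doBy is not installed.
if (requireNamespace("doBy", quietly = TRUE)) {
  library(doBy)
  data(warpbreaks)

  # Default call: one row per combination of wool and tension
  by_default <- summaryBy(breaks ~ wool + tension, data = warpbreaks, FUN = mean)
  print(names(by_default))
  print(nrow(by_default))

  # keep.names = TRUE with a single function: the summary column is named 'breaks'
  by_keep <- summaryBy(breaks ~ wool + tension, data = warpbreaks,
                       FUN = mean, keep.names = TRUE)
  print(names(by_keep))

  # full.dimension = TRUE: group summaries are repeated for every input row
  by_full <- summaryBy(breaks ~ wool + tension, data = warpbreaks,
                       FUN = mean, full.dimension = TRUE)
  print(nrow(by_full) == nrow(warpbreaks))
}
```

Comparing `names()` and `nrow()` across the three results is a quick way to check which variant suits a later merge or plotting step.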

Søren Højsgaard, sorenh@math.aau.dk

`ave`, `descStat`, `orderBy`, `splitBy`, `transformBy`

```
library(doBy)

data(dietox)
dietox12 <- subset(dietox, Time == 12)

## Summary function returning several named statistics
fun <- function(x) {
  c(m = mean(x), v = var(x), n = length(x))
}

summaryBy(cbind(Weight, Feed) ~ Evit + Cu, data = dietox12,
          FUN = fun)

## The formula can also be given as a list of character vectors
summaryBy(list(c("Weight", "Feed"), c("Evit", "Cu")), data = dietox12,
          FUN = fun)

## Computations on several variables are done using cbind()
summaryBy(cbind(Weight, Feed) ~ Evit + Cu, data = subset(dietox, Time > 1),
          FUN = fun)

## Calculations on transformed data are possible using cbind(), but
## the transformed variables must be named
summaryBy(cbind(lw = log(Weight), Feed) ~ Evit + Cu, data = dietox12, FUN = mean)

## There are missing values in the 'airquality' data, so we remove them
## with 'na.rm=TRUE' when calculating the mean and variance. However, the
## length function does not accept any such argument. We get around this
## by defining our own summary function in which length is not supplied
## with this argument while mean and var are:
sumfun <- function(x, ...) {
  c(m = mean(x, na.rm = TRUE, ...), v = var(x, na.rm = TRUE, ...), l = length(x))
}
summaryBy(cbind(Ozone, Solar.R) ~ Month, data = airquality, FUN = sumfun)

## Using '.' on the right-hand side of a formula means to stratify by
## all variables not used elsewhere:
data(warpbreaks)
summaryBy(breaks ~ wool + tension, warpbreaks, FUN = mean)
summaryBy(breaks ~ ., warpbreaks, FUN = mean)
summaryBy(. ~ wool + tension, warpbreaks, FUN = mean)
```
