# summarizeData: Summarization of data

In scmamp: Statistical Comparison of Multiple Algorithms in Multiple Problems

## Description

This function applies a summarization function to the columns of a matrix or data frame, optionally grouping the rows by the values in selected columns.

## Usage

```r
summarizeData(data, fun = mean, group.by = NULL, ignore = NULL, ...)
```

## Arguments

| Argument | Description |
|---|---|
| `data` | A matrix or data frame to be summarized. |
| `fun` | Function to be used in the summarization. It can be any function that takes a numeric vector as its first argument and outputs a numeric value. Typical examples are `mean`, `median`, `min`, `max` or `sd`. |
| `group.by` | A vector of either column names or column indices according to which the data will be grouped before being summarized. |
| `ignore` | A vector of either column names or column indices of the columns to be removed from the output. |
| `...` | Additional parameters passed to the summarization function (`fun`). For example, `na.rm=TRUE` to indicate that missing values should be ignored. |

## Value

A data frame with one row for each combination of values in the columns indicated by `group.by`. Each remaining column (except those in `ignore`) contains the result of applying `fun` to the values of the original data that share that combination.

## See Also

`filterData`, `writeTabular` and the vignette `vignette(topic="Data_loading_and_manipulation", package="scmamp")`

## Examples

```r
library(scmamp)
data(data_blum_2015)

# Group by size and radius. Get the mean and standard deviation of only
# the last two columns (columns 3 to 8 are dropped via `ignore`).
summarizeData(data.blum.2015, group.by=c("Radius","Size"), ignore=3:8,
              fun=mean, na.rm=TRUE)
summarizeData(data.blum.2015, group.by=c("Radius","Size"), ignore=3:8,
              fun=sd, na.rm=TRUE)
```
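To make the grouping behaviour concrete without loading the package, the following is a minimal sketch of the same idea using base R's `aggregate`. It is an analogue written for illustration, not the implementation used by `summarizeData`, and the toy data frame and its column names (`AlgA`, `AlgB`) are assumptions made up for this example.

```r
# Toy data: hypothetical columns, invented only for this illustration.
toy <- data.frame(
  Size   = c(100, 100, 200, 200),
  Radius = c(0.1, 0.1, 0.1, 0.1),
  AlgA   = c(1, 3, 5, NA),
  AlgB   = c(2, 4, 6, 8)
)

# Rough base R analogue of grouping by Size and Radius and applying
# fun = mean with na.rm = TRUE to the remaining columns.
summary_toy <- aggregate(
  toy[, c("AlgA", "AlgB")],
  by = toy[, c("Size", "Radius")],
  FUN = mean,
  na.rm = TRUE
)

print(summary_toy)
```

Each distinct (`Size`, `Radius`) pair yields one row, and `na.rm = TRUE` is forwarded to `mean`, in the same way that `...` is described as being forwarded to `fun` above.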

scmamp documentation built on May 29, 2017, 12:57 p.m.