# grouped.data: Grouped data (actuar: Actuarial Functions and Heavy Tailed Distributions)

 grouped.data R Documentation

## Grouped data

### Description

Creation of grouped data objects, from either a provided set of group boundaries and group frequencies, or from individual data using automatic or specified breakpoints.

### Usage

``````
grouped.data(..., breaks = "Sturges", include.lowest = TRUE,
             right = TRUE, nclass = NULL, group = FALSE,
             row.names = NULL, check.rows = FALSE,
             check.names = TRUE)
``````

### Arguments

| Argument | Description |
|---|---|
| `...` | arguments of the form `value` or `tag = value`; see Details. |
| `breaks` | same as for `hist`, namely one of: a vector giving the breakpoints between groups; a function to compute the vector of breakpoints; a single number giving the number of groups; a character string naming an algorithm to compute the number of groups (see `hist`); a function to compute the number of groups. In the last three cases the number is a suggestion only; the breakpoints will be set to `pretty` values. If `breaks` is a function, the first element in `...` is supplied to it as the only argument. |
| `include.lowest` | logical; if `TRUE`, a data point equal to the `breaks` value will be included in the first (or last, for `right = FALSE`) group. Used only for individual data; see Details. |
| `right` | logical; indicating if the intervals should be closed on the right (and open on the left) or vice versa. |
| `nclass` | numeric (integer); equivalent to `breaks` for a scalar or character argument. |
| `group` | logical; an alternative way to force grouping of individual data. |
| `row.names, check.rows, check.names` | arguments identical to those of `data.frame`. |
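Since `breaks`, `include.lowest` and `right` are documented as behaving like their `hist` counterparts, their effect can be explored with base R alone. The sketch below is only an illustration on made-up data; it calls `hist` directly and does not show the internals of `grouped.data`.

```r
# Base R only; the data vector is made up for illustration.
x <- c(3, 12, 27, 27, 41, 58, 63, 77, 89, 100)

# A single number is a suggestion only: the breakpoints actually used
# are 'pretty' values covering the range of the data.
h <- hist(x, breaks = 7, plot = FALSE)
h$breaks
length(h$counts)

# An explicit vector of breakpoints is used as given.
h2 <- hist(x, breaks = c(0, 25, 75, 100), plot = FALSE)
h2$counts                                   # 2 5 3

# right = TRUE: intervals are (a, b]; include.lowest = TRUE also puts a
# value equal to the lowest breakpoint in the first group.
edge <- c(0, 25, 100)
r1 <- hist(edge, breaks = c(0, 25, 75, 100), plot = FALSE)$counts
r1                                          # 2 0 1

# right = FALSE: intervals are [a, b); include.lowest = TRUE then puts a
# value equal to the highest breakpoint in the last group.
r2 <- hist(edge, breaks = c(0, 25, 75, 100), right = FALSE,
           plot = FALSE)$counts
r2                                          # 1 1 1
```

The same values of `breaks`, `include.lowest` and `right` would be passed to `grouped.data` when grouping individual data.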

### Details

A grouped data object is a special form of data frame consisting of one column of contiguous group boundaries and one or more columns of frequencies within each group.

The function can create a grouped data object from two types of arguments.

1. Group boundaries and frequencies. This is the default mode of operation if the call has at least two elements in `...`.

The first argument will then be taken as the vector of group boundaries. This vector must be exactly one element longer than the other arguments, which will be taken as vectors of group frequencies. All arguments are coerced to data frames.

2. Individual data. This mode of operation is active if there is a single argument in `...`, or if either `breaks` or `nclass` is specified or `group` is `TRUE`.

Arguments of `...` are first grouped using `hist`. If needed, breakpoints are set using the first argument.
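The second mode can be approximated with base R to see why the first argument matters. This is a rough sketch under the assumption that the breakpoints computed from the first data set are simply reused for the others; the data and the variable names (`brks`, `nx`, `ny`) are made up, and the actual implementation in actuar may differ.

```r
# Made-up individual data; the range of y lies inside the range of x.
x <- c(5, 12, 30, 44, 58, 61, 79, 93, 100)
y <- c(8, 15, 22, 37, 49, 66, 70, 88, 95)

# Breakpoints are computed from the first data set only
# (default "Sturges" rule, rounded to 'pretty' values).
brks <- hist(x, plot = FALSE)$breaks

# The same breakpoints are then reused for every data set. hist() stops
# with an error if the breakpoints do not span the data, which is why
# the first data set must cover the ranges of all the others.
nx <- hist(x, breaks = brks, plot = FALSE)$counts
ny <- hist(y, breaks = brks, plot = FALSE)$counts

data.frame(lower = head(brks, -1), upper = tail(brks, -1),
           x = nx, y = ny)
```

Passing explicit breakpoints that cover every data set, as in `breaks = c(0, 25, 75, 100)`, avoids the dependence on the first argument.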

Missing (`NA`) frequencies are replaced by zeros, with a warning.
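The rules for the first mode (boundaries one element longer than the frequencies, missing frequencies replaced by zeros with a warning) can be mimicked in a few lines of base R. The helper `as_grouped_sketch` below is hypothetical and is not part of actuar; it only restates the documented checks and returns a plain data frame.

```r
# Hypothetical helper, NOT part of actuar: restates the documented rules.
as_grouped_sketch <- function(cj, ...) {
    freqs <- data.frame(...)              # one column per frequency vector
    if (length(cj) != nrow(freqs) + 1L)
        stop("group boundaries must be one element longer than frequencies")
    if (any(is.na(freqs))) {
        warning("missing frequencies replaced by zeros")
        freqs[is.na(freqs)] <- 0
    }
    data.frame(lower = head(cj, -1), upper = tail(cj, -1), freqs)
}

cj <- c(0, 25, 50, 100, 250, 500, 1000)   # 7 boundaries
nj <- c(30, 31, 57, NA, 45, 10)           # 6 frequencies, one missing

res <- as_grouped_sketch(cj, Frequency = nj)   # emits a warning
res

# A frequency vector of the wrong length is rejected.
bad <- tryCatch(as_grouped_sketch(cj, Frequency = 1:3),
                error = function(e) "error")
```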

Extraction and replacement methods exist for `grouped.data` objects, but working on non-adjacent groups will most likely yield useless results.

### Value

An object of `class` `c("grouped.data", "data.frame")` with an environment containing the vector `cj` of group boundaries.
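To picture what "a data frame with an environment containing `cj`" can look like, the following base R sketch builds an object with the same class vector and attaches an environment to it. It is an assumption-laden mock-up of the documented structure, not the code used by actuar.

```r
# Mock-up of the documented structure using base R only.
cj <- c(0, 25, 50, 100)                    # group boundaries
x <- data.frame(Frequency = c(30, 31, 57)) # group frequencies
class(x) <- c("grouped.data", "data.frame")

# For a non-function object, `environment<-` stores the environment
# in the ".Environment" attribute.
e <- new.env()
assign("cj", cj, envir = e)
environment(x) <- e

class(x)
get("cj", envir = environment(x))          # 0 25 50 100
```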

### Author(s)

Vincent Goulet vincent.goulet@act.ulaval.ca, Mathieu Pigeon and Louis-Philippe Pouliot

### References

Klugman, S. A., Panjer, H. H. and Willmot, G. E. (1998), Loss Models, From Data to Decisions, Wiley.

### See Also

`[.grouped.data` for extraction and replacement methods.

`data.frame` for usual data frame creation and manipulation.

`hist` for details on the calculation of breakpoints.

### Examples

``````
## Most common usage using a predetermined set of group
## boundaries and group frequencies.
cj <- c(0, 25, 50, 100, 250, 500, 1000)
nj <- c(30, 31, 57, 42, 45, 10)
(x <- grouped.data(Group = cj, Frequency = nj))
class(x)

x[, 1] # group boundaries
x[, 2] # group frequencies

## Multiple frequency columns are supported
x <- sample(1:100, 9)
y <- sample(1:100, 9)
grouped.data(cj = 1:10, nj.1 = x, nj.2 = y)

## Alternative usage with grouping of individual data.
grouped.data(x)                         # automatic breakpoints
grouped.data(x, breaks = 7)             # forced number of groups
grouped.data(x, breaks = c(0,25,75,100))    # specified groups
grouped.data(x, y, breaks = c(0,25,75,100)) # multiple data sets

## Not run:
## Providing two or more data sets and automatic breakpoints is
## very error-prone since the range of the first data set has to
## include the ranges of all the other data sets.
range(x)
range(y)
grouped.data(x, y, group = TRUE)
## End(Not run)
``````

actuar documentation built on Nov. 8, 2023, 9:06 a.m.