aggstat: Complement to Aggregate


Description

The function computes a statistic (sum, mean, standard deviation, etc.) on one or several continuous variables, according to levels of one or several categorical variables.

Usage

aggstat(formula, data, FUN, digits = NULL, ...)

## S3 method for class 'aggstat'
print(x, ...)

Arguments

formula

A formula with the numerical variable(s) on the left-hand side and the categorical variable(s) on the right-hand side. Several numerical variables must be combined with cbind (e.g., cbind(y1, y2, y3)). Several categorical variables must be separated by +; interactions (denoted by : or *) are not allowed. When the left-hand side is empty, the function counts the rows of the data frame for each level of the categorical variable(s); in that case the argument FUN is not needed and the resulting variable is named n.aggstat.

data

A data frame containing the variables indicated in the formula.

FUN

A function returning a scalar (e.g., mean, sd or median).

digits

A scalar giving the number of decimal places to keep when rounding the computed statistics. Defaults to NULL (no rounding).

...

Further arguments passed to the function defined in FUN.

x

An object of class “aggstat”.
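
The formula conventions described for the formula argument have the same shape as those accepted by base R's aggregate (listed under See Also). The sketch below uses only base R, not aggstat itself, to illustrate the cbind(...) ~ f1 + f2 form and a row count per level. The data frame d and the column name n are hypothetical, and whether aggstat delegates to aggregate internally is an assumption, not something this page states.

```r
# Hypothetical data, for illustration only (no missing values)
d <- data.frame(
    y1 = c(1, 2, 3, 4, 5, 6),
    y2 = c(10, 20, 30, 40, 50, 60),
    f1 = c("a", "a", "a", "b", "b", "b"),
    f2 = c("c", "c", "d", "d", "d", "d")
)

# Several numerical variables combined with cbind(),
# several categorical variables separated by +
res <- aggregate(cbind(y1, y2) ~ f1 + f2, data = d, FUN = mean)
res

# Base R analogue of an empty left-hand side: number of rows per level of f1
counts <- as.data.frame(table(f1 = d$f1), responseName = "n")
counts
```

Combinations of levels that do not occur in the data (here f1 = "b" with f2 = "c") are absent from the aggregate result.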

Value

A list with components CALL, tab, nbfact and digits. Component tab is the data frame built according to the formula and function used in the call.

See Also

aggregate, xtabs, table

Examples

tmp <- data.frame(
    y1 = c(NA, rnorm(n = 8, mean = 10, sd = 5), NA),
    y2 = c(rep(NA, 2), rnorm(n = 6, mean = 10, sd = 5), rep(NA, 2)),
    y3 = rnorm(n = 10, mean = 10, sd = 5),
    y4 = rnorm(n = 10, mean = 10, sd = 5),
    f1 = rep(c("a", NA, "b"), times = c(3, 1, 6)),
    f2 = rep(c("c", "d", NA), times = c(5, 3, 2)),
    f3 = rep(c("e", "f", "g"), times = c(3, 3, 4))
    )
tmp

# Mean of y1 by level of f1, without and with rounding
aggstat(formula = y1 ~ f1, data = tmp, FUN = mean)
aggstat(formula = y1 ~ f1, data = tmp, FUN = mean, digits = 1)

# Two categorical variables
aggstat(formula = y1 ~ f1 + f2, data = tmp, FUN = mean)

# Intercept-only right-hand side (no categorical variable)
aggstat(formula = y1 ~ 1, data = tmp, FUN = mean)

# Several numerical variables; extra arguments (here probs) are passed to FUN
aggstat(formula = cbind(y1, y2, y3) ~ f1 + f2, data = tmp, FUN = median)
aggstat(formula = cbind(y1, y2, y3) ~ f1 + f2, data = tmp, FUN = quantile,
        probs = 0.5, digits = 1)
tab <- aggstat(formula = cbind(y1, y2, y3) ~ f1 + f2, data = tmp, FUN = quantile,
               probs = 0.5)
# Extract the resulting data frame
tab$tab

# Number of rows in the data frame (empty left-hand side, no FUN needed)
aggstat(formula = ~ 1, data = tmp)
aggstat(formula = ~ f1, data = tmp)
aggstat(formula = ~ f2, data = tmp)
aggstat(formula = ~ f1 + f2, data = tmp)

  

tdisplay documentation built on May 2, 2019, 4:46 p.m.