aggregating: Aggregating functions

Description

The mosaic package makes several summary statistic functions (such as mean and sd) formula-aware.

Usage

mean_(x, ..., data = NULL, groups = NULL, na.rm = getOption("na.rm", FALSE))

mean(x, ...)

median(x, ..., data = NULL, groups = NULL, na.rm = getOption("na.rm", FALSE))

range(x, ..., data = NULL, groups = NULL, na.rm = getOption("na.rm", FALSE))

sd(x, ..., data = NULL, groups = NULL, na.rm = getOption("na.rm", FALSE))

max(x, ..., data = NULL, groups = NULL, na.rm = getOption("na.rm", FALSE))

min(x, ..., data = NULL, groups = NULL, na.rm = getOption("na.rm", FALSE))

sum(x, ..., data = NULL, groups = NULL, na.rm = getOption("na.rm", FALSE))

IQR(x, ..., data = NULL, groups = NULL, na.rm = getOption("na.rm", FALSE))

fivenum(x, ..., data = NULL, groups = NULL, na.rm = getOption("na.rm", FALSE))

iqr(x, ..., data = NULL, groups = NULL, na.rm = getOption("na.rm", FALSE))

prod(x, ..., data = NULL, groups = NULL, na.rm = getOption("na.rm", FALSE))

favstats(x, ..., data = NULL, groups = NULL, na.rm = TRUE)

quantile(x, ..., data = NULL, groups = NULL, na.rm = getOption("na.rm", FALSE))

var(x, y = NULL, na.rm = getOption("na.rm", FALSE), ..., data = NULL)

cor(x, y = NULL, ..., data = NULL)

cov(x, y = NULL, ..., data = NULL)

Arguments

x

a numeric vector or a formula

...

additional arguments

data

a data frame in which to evaluate formulas (or bare names). Note that the default is data = parent.frame(). This makes it convenient to use these functions interactively, because the working environment is treated as if it were a data frame, but it may not be appropriate for programming. When programming, it is best to supply an explicit data argument, ideally a data frame that contains the variables mentioned.

groups

a grouping variable, typically a name of a variable in data

na.rm

a logical indicating whether NAs should be removed before computing

y

a numeric vector or a formula
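
The default `na.rm = getOption("na.rm", FALSE)` shown in Usage means that a session option named `na.rm` is consulted, with `FALSE` as the fallback when that option is unset. The following is a minimal base R sketch of that lookup; the helper name `default_na_rm` is hypothetical and is not part of mosaic.

```r
# Hypothetical helper mirroring the default shown in Usage; not part of mosaic.
default_na_rm <- function() getOption("na.rm", FALSE)

old <- options(na.rm = NULL)      # make sure the option starts out unset
unset_value <- default_na_rm()    # falls back to the supplied default

options(na.rm = TRUE)             # opt in for the rest of the session
set_value <- default_na_rm()      # the option now takes precedence

x <- c(1, 2, NA)
mean_with_option <- base::mean(x, na.rm = default_na_rm())

options(old)                      # restore the previous setting
```

Setting the option once per session avoids repeating `na.rm = TRUE` in every call, at the cost of making results depend on session state, so scripts meant to be shared may be safer passing `na.rm` explicitly.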

Details

Many of these functions mask core R functions to provide an additional formula interface. Existing behavior should be unchanged, but if the first argument is a formula, that formula, together with data, is used to generate the numeric vector(s) to be summarized. Formulas of the shape x ~ a or ~ x | a produce summaries of x for each subset defined by a. Two-way aggregation can be achieved using formulas of the form x ~ a + b or x ~ a | b. See the examples.
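
The grouping described above can be pictured with base R alone. The sketch below uses the built-in `mtcars` data set (an assumption made purely for illustration; it is not used elsewhere on this page), and each comment names the mosaic formula call that the base R line is meant to approximate. The naming and ordering of mosaic's results may differ from what `tapply()` returns.

```r
# No grouping: approximates mean(~ mpg, data = mtcars).
overall <- mean(mtcars$mpg)

# One-way aggregation: approximates mean(mpg ~ cyl, data = mtcars).
# One summary of mpg per level of cyl.
one_way <- tapply(mtcars$mpg, mtcars$cyl, mean)

# Two-way aggregation: approximates mean(mpg ~ cyl + am, data = mtcars).
# interaction() builds one group per (cyl, am) combination.
two_way <- tapply(mtcars$mpg, interaction(mtcars$cyl, mtcars$am), mean)

overall
one_way
two_way
```

The point of the formula interface is that the variable, the grouping, and the data source are stated in a single call, whereas the base R version has to reference the data frame repeatedly.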

Note

Earlier versions of these functions supported a "bare name + data frame" interface. This functionality has been removed since it was (a) ambiguous in some cases, (b) unnecessary, and (c) difficult to maintain.

Examples

mean(HELPrct$age)
mean( ~ age, data = HELPrct)
mean( ~ drugrisk, na.rm = TRUE, data = HELPrct)
mean(age ~ shuffle(sex), data = HELPrct)
mean(age ~ shuffle(sex), data = HELPrct, .format = "table")
# wrap in data.frame() to auto-convert awkward variable names
data.frame(mean(age ~ shuffle(sex), data = HELPrct, .format = "table"))
mean(age ~ sex + substance, data = HELPrct)
mean( ~ age | sex + substance, data = HELPrct)
mean( ~ sqrt(age), data = HELPrct)
sum( ~ age, data = HELPrct)
sd(HELPrct$age)
sd( ~ age, data = HELPrct)
sd(age ~ sex + substance, data = HELPrct)
var(HELPrct$age)
var( ~ age, data = HELPrct)
var(age ~ sex + substance, data = HELPrct)
IQR(width ~ sex, data = KidsFeet)
iqr(width ~ sex, data = KidsFeet)
favstats(width ~ sex, data = KidsFeet)

cor(length ~ width, data = KidsFeet)
cov(length ~ width, data = KidsFeet)
tally(is.na(mcs) ~ is.na(pcs), data = HELPmiss)
cov(mcs ~ pcs, data = HELPmiss)             # NA because of missing data
cov(mcs ~ pcs, data = HELPmiss, use = "complete")  # ignore missing data
# alternative approach using filter explicitly
cov(mcs ~ pcs, data = HELPmiss %>% filter(!is.na(mcs) & !is.na(pcs)))    

Example output

[1] 35.65342
[1] 35.65342
[1] 1.887168
  female     male 
35.71028 35.63584 
  shuffle(sex)     mean
1       female 35.94393
2         male 35.56358
  shuffle.sex.     mean
1       female 35.28972
2         male  35.7659
female.alcohol   male.alcohol female.cocaine   male.cocaine  female.heroin 
      39.16667       37.95035       34.85366       34.36036       34.66667 
   male.heroin 
      33.05319 
female.alcohol   male.alcohol female.cocaine   male.cocaine  female.heroin 
      39.16667       37.95035       34.85366       34.36036       34.66667 
   male.heroin 
      33.05319 
[1] 5.936703
[1] 16151
[1] 7.710266
[1] 7.710266
female.alcohol   male.alcohol female.cocaine   male.cocaine  female.heroin 
      7.980333       7.575644       6.195002       6.889772       8.035839 
   male.heroin 
      7.973568 
[1] 59.4482
[1] 59.4482
female.alcohol   male.alcohol female.cocaine   male.cocaine  female.heroin 
      63.68571       57.39037       38.37805       47.46896       64.57471 
   male.heroin 
      63.57779 
   B    G 
0.75 0.60 
   B    G 
0.75 0.60 
  sex min    Q1 median    Q3 max     mean        sd  n missing
1   B 8.4 8.875   9.15 9.625 9.8 9.190000 0.4517801 20       0
2   G 7.9 8.550   8.80 9.150 9.5 8.784211 0.4935846 19       0
[1] 0.6410961
[1] 0.4304453
          is.na(pcs)
is.na(mcs) TRUE FALSE
     TRUE     2     0
     FALSE    0   468
[1] NA
[1] 13.46433
[1] 13.46433

mosaic documentation built on Jan. 18, 2021, 5:09 p.m.