pct_routine: Calculate percentage by group.


View source: R/pct_routine.R

Description

pct_routine works like count, except that it returns group percentages instead of counts. The percentages are expressed as proportions between 0 and 1, as the example output shows. tally_pct is the underlying utility function and corresponds to tally; as the name implies, it also returns percentages.
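To make the relationship to count concrete, here is a minimal base R sketch of the arithmetic that the example output for pct_routine(esoph, agegp, alcgp) appears to reflect: rows are counted per agegp/alcgp combination and each count is divided by the total of its agegp. This is an illustration inferred from the example output, not the package's implementation, and the object names counts and pct are made up for the sketch.

```r
# Count rows for every agegp x alcgp combination in the built-in esoph data.
counts <- table(esoph$agegp, esoph$alcgp)

# Divide each count by its row (agegp) total, so that the shares
# sum to 1 within every age group.
pct <- prop.table(counts, margin = 1)

# Shares of the alcohol groups within the youngest age group.
round(pct["25-34", ], 4)
```

The result is a matrix rather than a grouped tibble, but each row holds the same kind of within-group shares that pct_routine returns in its pct column.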

Usage

pct_routine(
  data,
  ...,
  wt = NULL,
  ret_name = "pct",
  rebase = FALSE,
  ungroup = FALSE
)

pct_routine_(
  data,
  vars,
  wt = NULL,
  ret_name = "pct",
  rebase = FALSE,
  ungroup = FALSE
)

tally_pct(data, wt = NULL, ret_name = "pct", rebase = FALSE)

tally_pct_(data, wt = NULL, ret_name = "pct", rebase = FALSE)

Arguments

data

A data.frame or tbl.

...

Variables to group by, see group_by.

wt

Column name of the weights. With the default NULL, every row counts once.

ret_name

Name of the returned percentage column, as a character string.

rebase

Whether to exclude missing values from the percentage base, i.e. rebase the percentages so that NAs in the last grouping variable are left out.

ungroup

Whether to ungroup the returned table.

vars

A character vector of variable names to group by.
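As a rough illustration of what wt changes, the sketch below sums a weight column per group instead of counting rows, then converts the sums to shares within agegp. It uses base R only and mirrors the numbers shown in the example output for pct_routine(esoph, agegp, alcgp, wt = ncases); it is an assumption about the arithmetic, not the package's own code, and weighted and weighted_pct are hypothetical names.

```r
# Sum the weight column (ncases) for every agegp x alcgp combination
# instead of counting rows.
weighted <- xtabs(ncases ~ agegp + alcgp, data = esoph)

# Convert the weighted sums to shares within each age group.
weighted_pct <- prop.table(weighted, margin = 1)

# In the youngest age group the only recorded case falls in "120+",
# so that cell carries the whole share.
weighted_pct["25-34", ]
```

With wt left at NULL the same calculation reduces to plain row counts, which is why the unweighted and weighted results in the example output differ.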

Functions

Examples

data(esoph)
esoph
pct_routine(esoph, agegp, alcgp)
pct_routine(esoph, agegp, alcgp, wt = ncases)
# Create new grouping variables
pct_routine(esoph, agegp, low_alcgp = alcgp %in% c("0-39g/day", "40-79"))


# This example shows how rebase works
if (require(dplyr)) {
  iris %>%
    mutate(random_missing = ifelse(rnorm(n()) > 0, NA, round(Sepal.Length))) %>%
    group_by(Species, random_missing) %>%
    tally_pct(wt = Sepal.Width, rebase = TRUE)
}
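Because the rebase example draws random missing values, its output changes from run to run. The deterministic base R sketch below shows the same idea on a small made-up data frame (toy, group and answer are hypothetical names). It assumes that without rebasing NA is kept as a category of its own, and that rebasing drops the NA rows of the last grouping variable before the shares are computed, which is consistent with the example output but not confirmed against the package source.

```r
# Toy data: the last grouping variable (answer) contains missing values.
toy <- data.frame(
  group  = c("a", "a", "a", "a", "b", "b"),
  answer = c("yes", "no", NA, NA, "yes", NA)
)

# Without rebasing (assumed behaviour): NA is treated as its own category,
# so it takes up part of each group's total.
with_na <- prop.table(table(toy$group, toy$answer, useNA = "ifany"), margin = 1)

# With rebasing: table() drops NA by default, so the shares sum to 1
# over the non-missing answers only.
rebased <- prop.table(table(toy$group, toy$answer), margin = 1)

with_na
rebased
```

In group "a", the share of "yes" is one quarter when NA is kept and one half after rebasing, since only two of the four rows have a non-missing answer.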

Example output

   agegp     alcgp    tobgp ncases ncontrols
1  25-34 0-39g/day 0-9g/day      0        40
2  25-34 0-39g/day    10-19      0        10
3  25-34 0-39g/day    20-29      0         6
4  25-34 0-39g/day      30+      0         5
5  25-34     40-79 0-9g/day      0        27
6  25-34     40-79    10-19      0         7
7  25-34     40-79    20-29      0         4
8  25-34     40-79      30+      0         7
9  25-34    80-119 0-9g/day      0         2
10 25-34    80-119    10-19      0         1
11 25-34    80-119      30+      0         2
12 25-34      120+ 0-9g/day      0         1
13 25-34      120+    10-19      1         1
14 25-34      120+    20-29      0         1
15 25-34      120+      30+      0         2
16 35-44 0-39g/day 0-9g/day      0        60
17 35-44 0-39g/day    10-19      1        14
18 35-44 0-39g/day    20-29      0         7
19 35-44 0-39g/day      30+      0         8
20 35-44     40-79 0-9g/day      0        35
21 35-44     40-79    10-19      3        23
22 35-44     40-79    20-29      1        14
23 35-44     40-79      30+      0         8
24 35-44    80-119 0-9g/day      0        11
25 35-44    80-119    10-19      0         6
26 35-44    80-119    20-29      0         2
27 35-44    80-119      30+      0         1
28 35-44      120+ 0-9g/day      2         3
29 35-44      120+    10-19      0         3
30 35-44      120+    20-29      2         4
31 45-54 0-39g/day 0-9g/day      1        46
32 45-54 0-39g/day    10-19      0        18
33 45-54 0-39g/day    20-29      0        10
34 45-54 0-39g/day      30+      0         4
35 45-54     40-79 0-9g/day      6        38
36 45-54     40-79    10-19      4        21
37 45-54     40-79    20-29      5        15
38 45-54     40-79      30+      5         7
39 45-54    80-119 0-9g/day      3        16
40 45-54    80-119    10-19      6        14
41 45-54    80-119    20-29      1         5
42 45-54    80-119      30+      2         4
43 45-54      120+ 0-9g/day      4         4
44 45-54      120+    10-19      3         4
45 45-54      120+    20-29      2         3
46 45-54      120+      30+      4         4
47 55-64 0-39g/day 0-9g/day      2        49
48 55-64 0-39g/day    10-19      3        22
49 55-64 0-39g/day    20-29      3        12
50 55-64 0-39g/day      30+      4         6
51 55-64     40-79 0-9g/day      9        40
52 55-64     40-79    10-19      6        21
53 55-64     40-79    20-29      4        17
54 55-64     40-79      30+      3         6
55 55-64    80-119 0-9g/day      9        18
56 55-64    80-119    10-19      8        15
57 55-64    80-119    20-29      3         6
58 55-64    80-119      30+      4         4
59 55-64      120+ 0-9g/day      5        10
60 55-64      120+    10-19      6         7
61 55-64      120+    20-29      2         3
62 55-64      120+      30+      5         6
63 65-74 0-39g/day 0-9g/day      5        48
64 65-74 0-39g/day    10-19      4        14
65 65-74 0-39g/day    20-29      2         7
66 65-74 0-39g/day      30+      0         2
67 65-74     40-79 0-9g/day     17        34
68 65-74     40-79    10-19      3        10
69 65-74     40-79    20-29      5         9
70 65-74    80-119 0-9g/day      6        13
71 65-74    80-119    10-19      4        12
72 65-74    80-119    20-29      2         3
73 65-74    80-119      30+      1         1
74 65-74      120+ 0-9g/day      3         4
75 65-74      120+    10-19      1         2
76 65-74      120+    20-29      1         1
77 65-74      120+      30+      1         1
78   75+ 0-39g/day 0-9g/day      1        18
79   75+ 0-39g/day    10-19      2         6
80   75+ 0-39g/day      30+      1         3
81   75+     40-79 0-9g/day      2         5
82   75+     40-79    10-19      1         3
83   75+     40-79    20-29      0         3
84   75+     40-79      30+      1         1
85   75+    80-119 0-9g/day      1         1
86   75+    80-119    10-19      1         1
87   75+      120+ 0-9g/day      2         2
88   75+      120+    10-19      1         1
# A tibble: 24 x 3
# Groups:   agegp [6]
   agegp     alcgp       pct
   <ord>     <ord>     <dbl>
 1 25-34 0-39g/day 0.2666667
 2 25-34     40-79 0.2666667
 3 25-34    80-119 0.2000000
 4 25-34      120+ 0.2666667
 5 35-44 0-39g/day 0.2666667
 6 35-44     40-79 0.2666667
 7 35-44    80-119 0.2666667
 8 35-44      120+ 0.2000000
 9 45-54 0-39g/day 0.2500000
10 45-54     40-79 0.2500000
# ... with 14 more rows
# A tibble: 24 x 3
# Groups:   agegp [6]
   agegp     alcgp        pct
   <ord>     <ord>      <dbl>
 1 25-34 0-39g/day 0.00000000
 2 25-34     40-79 0.00000000
 3 25-34    80-119 0.00000000
 4 25-34      120+ 1.00000000
 5 35-44 0-39g/day 0.11111111
 6 35-44     40-79 0.44444444
 7 35-44    80-119 0.00000000
 8 35-44      120+ 0.44444444
 9 45-54 0-39g/day 0.02173913
10 45-54     40-79 0.43478261
# ... with 14 more rows
# A tibble: 12 x 3
# Groups:   agegp [6]
   agegp low_alcgp       pct
   <ord>     <lgl>     <dbl>
 1 25-34     FALSE 0.4666667
 2 25-34      TRUE 0.5333333
 3 35-44     FALSE 0.4666667
 4 35-44      TRUE 0.5333333
 5 45-54     FALSE 0.5000000
 6 45-54      TRUE 0.5000000
 7 55-64     FALSE 0.5000000
 8 55-64      TRUE 0.5000000
 9 65-74     FALSE 0.5333333
10 65-74      TRUE 0.4666667
11   75+     FALSE 0.3636364
12   75+      TRUE 0.6363636
Loading required package: dplyr

Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

# A tibble: 10 x 3
# Groups:   Species [3]
      Species random_missing        pct
       <fctr>          <dbl>      <dbl>
 1     setosa              4 0.04632588
 2     setosa              5 0.88658147
 3     setosa              6 0.06709265
 4 versicolor              5 0.04244482
 5 versicolor              6 0.85398981
 6 versicolor              7 0.10356537
 7  virginica              5 0.04237288
 8  virginica              6 0.57966102
 9  virginica              7 0.26271186
10  virginica              8 0.11525424

extdplyr documentation built on April 20, 2020, 9:04 a.m.