grouped_summary: Descriptive statistics for multiple variables for all...

Description Usage Arguments Value Examples

View source: R/grouped_summary.R

Description

Descriptive statistics for multiple variables for all grouping variable levels

Usage

1
2
3
4
5
6
7
8
9
grouped_summary(
  data,
  grouping.vars,
  measures = NULL,
  measures.type = "numeric",
  topcount.long = FALSE,
  k = 2L,
  ...
)

Arguments

data

Dataframe from which variables need to be taken.

grouping.vars

A list of grouping variables. Please use unquoted arguments (i.e., use x and not "x").

measures

List variables for which summary needs to computed. If not specified, all variables of type specified in the argument measures.type will be used to calculate summaries. Don't explicitly set measures.type = NULL in function call, which will produce an error because the function will try to find a column in a dataframe named "NULL".

measures.type

A character indicating whether summary for numeric ("numeric") or factor/character ("factor") variables is expected (Default: measures.type = "numeric"). This function can't be used for both numeric and variables simultaneously.

topcount.long

If measures.type = factor, you can get the top counts in long format for plotting purposes. (Default: topcount.long = FALSE).

k

Number of digits.

...

Currently ignored.

Value

Dataframe with descriptive statistics for numeric variables (n, mean, sd, median, min, max).

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
# for reproducibility
set.seed(123)

# another possibility
groupedstats::grouped_summary(
  data = iris,
  grouping.vars = Species,
  measures = Sepal.Length:Petal.Width,
  measures.type = "numeric"
)

# if no measures are chosen, all relevant columns will be summarized
groupedstats::grouped_summary(
  data = ggplot2::msleep,
  grouping.vars = vore,
  measures.type = "factor"
)

# for factors, you can also convert the dataframe to a long format with counts
groupedstats::grouped_summary(
  data = ggplot2::msleep,
  grouping.vars = c(vore),
  measures = c(genus:order),
  measures.type = "factor",
  topcount.long = TRUE
)

Example output

Registered S3 method overwritten by 'broom.mixed':
  method      from 
  tidy.gamlss broom
# A tibble: 12 x 16
   Species skim_type skim_variable missing complete  mean    sd   min   p25
   <fct>   <chr>     <chr>           <int>    <dbl> <dbl> <dbl> <dbl> <dbl>
 1 setosa  numeric   Sepal.Length        0        1 5.01  0.352   4.3  4.8 
 2 setosa  numeric   Sepal.Width         0        1 3.43  0.379   2.3  3.2 
 3 setosa  numeric   Petal.Length        0        1 1.46  0.174   1    1.4 
 4 setosa  numeric   Petal.Width         0        1 0.246 0.105   0.1  0.2 
 5 versicnumeric   Sepal.Length        0        1 5.94  0.516   4.9  5.6 
 6 versicnumeric   Sepal.Width         0        1 2.77  0.314   2    2.52
 7 versicnumeric   Petal.Length        0        1 4.26  0.470   3    4   
 8 versicnumeric   Petal.Width         0        1 1.33  0.198   1    1.2 
 9 virginnumeric   Sepal.Length        0        1 6.59  0.636   4.9  6.22
10 virginnumeric   Sepal.Width         0        1 2.97  0.322   2.2  2.8 
11 virginnumeric   Petal.Length        0        1 5.55  0.552   4.5  5.1 
12 virginnumeric   Petal.Width         0        1 2.03  0.275   1.4  1.8 
# … with 7 more variables: median <dbl>, p75 <dbl>, max <dbl>, n <int>,
#   std.error <dbl>, mean.conf.low <dbl>, mean.conf.high <dbl>
# A tibble: 20 x 9
   vore  skim_type skim_variable missing complete ordered n_unique top_counts
   <fct> <chr>     <chr>           <int>    <dbl> <lgl>      <int> <chr>     
 1 carni factor    name                0    1     FALSE         19 Arc: 1, B2 carni factor    genus               0    1     FALSE         16 Pan: 3, V3 carni factor    order               0    1     FALSE          6 Car: 12,4 carni factor    conservation        5    0.737 FALSE          6 lc: 5, vu5 herbi factor    name                0    1     FALSE         32 Afr: 1, A6 herbi factor    genus               0    1     FALSE         29 Spe: 3, E7 herbi factor    order               0    1     FALSE          9 Rod: 16,8 herbi factor    conservation        6    0.812 FALSE          6 lc: 10, d9 insefactor    name                0    1     FALSE          5 Big: 1, E10 insefactor    genus               0    1     FALSE          5 Ept: 1, M11 insefactor    order               0    1     FALSE          4 Chi: 2, C12 insefactor    conservation        2    0.6   FALSE          2 lc: 2, en13 omni  factor    name                0    1     FALSE         20 Afr: 1, A14 omni  factor    genus               0    1     FALSE         20 Aot: 1, B15 omni  factor    order               0    1     FALSE          8 Pri: 10,16 omni  factor    conservation       11    0.450 FALSE          2 lc: 8, do17 <NA>  factor    name                0    1     FALSE          7 Dee: 1, D18 <NA>  factor    genus               0    1     FALSE          7 Cal: 1, P19 <NA>  factor    order               0    1     FALSE          5 Rod: 3, D20 <NA>  factor    conservation        5    0.286 FALSE          1 lc: 2, cd# … with 1 more variable: n <int>
# A tibble: 40 x 3
   vore  factor.level count
   <fct> <chr>        <int>
 1 carni Pan              3
 2 carni Vul              2
 3 carni Aci              1
 4 carni Cal              1
 5 carni Car             12
 6 carni Cet              3
 7 carni Cin              1
 8 carni Did              1
 9 herbi Spe              3
10 herbi Equ              2
# … with 30 more rows

groupedstats documentation built on May 29, 2021, 1:06 a.m.