View source: R/S05_Statistics.R
stats_by_group | R Documentation |
Computes assorted univariate statistics for a specified variable in a data frame, separately for each combination of the specified grouping factors.
stats_by_group(
  dtf,
  column,
  groupings,
  statistics = c("M", "SD"),
  method = "Student's T",
  categories = 1,
  width = 0.95,
  na.rm = TRUE
)
dtf |
A data frame. |
column |
A character string, the column in |
groupings |
A character vector, the columns in |
statistics |
A character vector, the set of different statistics to compute over groups. |
method |
A character string, the type of method to use
when computing uncertainty intervals. Options include:
|
categories |
An optional vector of elements to match over when computing frequencies, proportions, or percentages. |
width |
A numeric value between 0 and 1, the width for uncertainty intervals. |
na.rm |
A logical value; if |
Possible univariate statistics that can be computed:
'N' = Sample size;
'M' = Mean;
'Md' = Median;
'SD' = Standard deviation;
'SE' = Standard error of the mean;
'C' = Counts/frequencies;
'Pr' = Proportions;
'P' = Percentages.
Additionally, specifying 'UI' in combination with the argument method will compute the lower and upper limits of a desired uncertainty interval. The width of the interval can be controlled by the argument width.
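To make the statistics above concrete, the following base R sketch computes 'N', 'M', 'SD', 'SE', and an interval for each group by hand. It is not the implementation of stats_by_group: it assumes that 'UI' with method = "Student's T" corresponds to the usual t-based confidence interval for a mean, and the helper summarize_group and the column names UI_LB and UI_UB are hypothetical names chosen for illustration.

```r
# Base R sketch only; NOT the package's implementation.
# Assumption: the "Student's T" interval is M +/- t * SE with n - 1 df.
data(iris)

# Hypothetical helper: statistics for one group's values
summarize_group <- function(x, width = 0.95, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  n  <- length(x)          # 'N'  = sample size
  m  <- mean(x)            # 'M'  = mean
  s  <- sd(x)              # 'SD' = standard deviation
  se <- s / sqrt(n)        # 'SE' = standard error of the mean
  # Two-sided critical value from Student's t distribution
  crit <- qt(1 - (1 - width) / 2, df = n - 1)
  c(N = n, M = m, SD = s, SE = se,
    UI_LB = m - crit * se, UI_UB = m + crit * se)
}

# Split the variable by the grouping factor and summarize each piece
by_species <- lapply(split(iris$Sepal.Length, iris$Species), summarize_group)
out <- do.call(rbind, by_species)
out <- data.frame(Species = rownames(out), out, row.names = NULL)
print(out)
```

Changing width in the helper narrows or widens the interval, which mirrors the role the width argument is described as playing.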
A data frame with separate rows for each combination of grouping factors and separate columns for each statistic to compute.
# Example data set
data(iris)
dtf <- iris

# Mean/SD for sepal length by species
dtf |> stats_by_group( 'Sepal.Length', 'Species' )

# Create additional categorical variable
dtf$Long_petal <- c( 'No', 'Yes' )[
  ( dtf$Petal.Length > median( dtf$Petal.Length ) ) + 1
]

# Sample size, mean, and confidence intervals using Student's T
# distribution by species and whether petals are long
dtf |> stats_by_group(
  'Sepal.Length', c( 'Species', 'Long_petal' ), c( 'N', 'M', 'UI' )
)

# Create additional categorical variable
dtf$Long_sepal <- c( 'No', 'Yes' )[
  ( dtf$Sepal.Length > median( dtf$Sepal.Length ) ) + 1
]

# Proportion and confidence intervals based on beta-binomial
# distribution for long sepals by long petals
dtf |> stats_by_group(
  'Long_sepal', c( 'Long_petal' ), c( 'N', 'Pr', 'UI' ),
  categories = 'Yes', method = 'Beta-binomial'
)
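
For comparison with the last example, here is a base R sketch of how counts, proportions, percentages, and an interval for a proportion could be computed per group. It is not the implementation of stats_by_group. In particular, it assumes that the 'Beta-binomial' method can be approximated by quantiles of a Beta(1 + k, 1 + n - k) distribution (a uniform prior on the proportion), which the documentation above does not confirm. The helper prop_interval and the column names LB and UB are hypothetical.

```r
# Base R sketch only; NOT the package's implementation.
data(iris)
dtf <- iris
dtf$Long_petal <- c("No", "Yes")[(dtf$Petal.Length > median(dtf$Petal.Length)) + 1]
dtf$Long_sepal <- c("No", "Yes")[(dtf$Sepal.Length > median(dtf$Sepal.Length)) + 1]

# Hypothetical helper: proportion of values matching 'categories'
prop_interval <- function(x, categories = "Yes", width = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)                 # 'N'  = sample size
  k <- sum(x %in% categories)    # 'C'  = count of matching elements
  alpha <- 1 - width
  # ASSUMPTION: interval limits taken from a Beta(1 + k, 1 + n - k)
  # distribution; the package may define its interval differently.
  c(N = n, C = k, Pr = k / n, P = 100 * k / n,
    LB = qbeta(alpha / 2, 1 + k, 1 + n - k),
    UB = qbeta(1 - alpha / 2, 1 + k, 1 + n - k))
}

# Long sepals ('Yes') summarized within each level of Long_petal
res <- do.call(rbind, lapply(split(dtf$Long_sepal, dtf$Long_petal), prop_interval))
print(res)
```

Matching with %in% lets categories hold more than one value, which is one plausible reading of "a vector of elements to match over" in the argument description.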