Estimates the mean, median, and mode of already grouped data given the interval ranges and the frequencies of each group.
grouped_mean(frequencies, intervals, sep = NULL, trim = NULL)
grouped_mode(frequencies, intervals, sep = NULL, trim = NULL, method = 1)
grouped_median(frequencies, intervals, sep = NULL, trim = NULL)
frequencies
A vector of frequencies, one per class.
intervals
A 2-column matrix of lower and upper class boundaries (one row per class), or a character vector of interval labels to be split using sep and trim.
sep
Optional character that separates lower and upper class boundaries if intervals is a character vector.
trim
Optional leading or trailing characters to trim from the character vector being used for intervals.
method
A single value (1 or 2) determining which method will be used to estimate the grouped mode. See the notes section for the different approaches.
The following formula is used to calculate the grouped mean:
M = sum(f * x) / n
Where:
f = The frequency of each class
x = The midpoint of each class
n = The sum of the frequencies
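As a rough illustration of the grouped mean formula, the calculation can be sketched in base R using the salary data from the examples. This is not the package's implementation; the variable names (lower, upper, freq, midpoint, mean_est) are chosen here for illustration, and x is taken to be the class midpoint, (lower + upper) / 2, which is the usual convention for a grouped mean.

```r
# Illustrative sketch only; names are hypothetical, not part of the package.
lower <- seq(1500, 2400, by = 100)   # lower class boundaries
upper <- lower + 100                 # upper class boundaries
freq  <- c(110, 180, 320, 460, 850, 250, 130, 70, 20, 10)

midpoint <- (lower + upper) / 2      # x: assumed to be the class midpoint
mean_est <- sum(freq * midpoint) / sum(freq)
mean_est  # 1898.75
```

Every observation in a class is treated as if it sat at the class midpoint, so the estimate generally differs somewhat from the mean of the raw data.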
The following formula is used to calculate the grouped median:
M = L + (n/2 - cf)/f * c
Where:
L = The lower boundary of the median class
n = The sum of the frequencies
cf = The cumulative frequency of all classes before the median class
f = The frequency of the median class
c = The length of the median class
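The median formula above can be sketched in base R as follows, again using the salary data from the examples. This is a minimal sketch rather than the package's code: the variable names are hypothetical, and the median class is assumed to be the first class whose cumulative frequency reaches n / 2.

```r
# Illustrative sketch only; names are hypothetical, not part of the package.
lower <- seq(1500, 2400, by = 100)
upper <- lower + 100
freq  <- c(110, 180, 320, 460, 850, 250, 130, 70, 20, 10)

n        <- sum(freq)
cum_freq <- cumsum(freq)
k        <- which(cum_freq >= n / 2)[1]        # index of the median class (assumed rule)
cf       <- if (k > 1) cum_freq[k - 1] else 0  # cumulative frequency before that class
width    <- upper[k] - lower[k]                # c in the formula

median_est <- lower[k] + (n / 2 - cf) / freq[k] * width
median_est  # about 1915.29
```

Here n / 2 = 1200 falls in the 1900-2000 class (cf = 1070, f = 850), so the estimate is 1900 + (130 / 850) * 100.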
The following formula is used to calculate the grouped mode if method = 1:
Z = L + ((f1 - f0) / (2 * f1 - f0 - f2)) * c
Where:
L = The lower boundary of the mode class
f1 = The frequency of the mode class
f0 = The frequency of the class before the mode class
f2 = The frequency of the class after the mode class
c = The length of the mode class
Keep in mind that while it might be easy to say which is the modal group, the mode of the source data may not even be in that group. Additionally, it is possible for data to have more than one mode or conversely, no mode.
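A minimal base R sketch of the method = 1 formula, using the salary data from the examples, might look like this. It is not the package's implementation. The names are hypothetical, and two choices are assumptions made here: which.max() picks the first class in case of tied frequencies, and a missing neighbouring class (when the modal class is first or last) is given a frequency of 0.

```r
# Illustrative sketch only; names are hypothetical, not part of the package.
lower <- seq(1500, 2400, by = 100)
upper <- lower + 100
freq  <- c(110, 180, 320, 460, 850, 250, 130, 70, 20, 10)

k  <- which.max(freq)                               # modal class (first one on ties)
f1 <- freq[k]
f0 <- if (k > 1) freq[k - 1] else 0                 # assumption: 0 if no class before
f2 <- if (k < length(freq)) freq[k + 1] else 0      # assumption: 0 if no class after
width <- upper[k] - lower[k]                        # c in the formula

mode_est <- lower[k] + (f1 - f0) / (2 * f1 - f0 - f2) * width
mode_est  # about 1939.39
```

With f1 = 850, f0 = 460 and f2 = 250, the estimate is 1900 + (390 / 990) * 100. The steeper drop after the modal class than the rise before it pulls the estimate below the class midpoint of 1950.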
The following formula is used to calculate the grouped mode if method = 2:
M = (3 * x) - (2 * y)
Where:
x = The group median
y = The group mean
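The method = 2 relation can be sketched by combining the grouped mean and grouped median estimates. The snippet below recomputes both for the salary data in base R so that it runs on its own; the names are hypothetical, the class midpoint is assumed for the mean, and this is not the package's code.

```r
# Illustrative sketch only; names are hypothetical, not part of the package.
lower <- seq(1500, 2400, by = 100)
upper <- lower + 100
freq  <- c(110, 180, 320, 460, 850, 250, 130, 70, 20, 10)
n     <- sum(freq)

# Grouped mean (y), using class midpoints
mean_est <- sum(freq * (lower + upper) / 2) / n

# Grouped median (x)
cum_freq   <- cumsum(freq)
k          <- which(cum_freq >= n / 2)[1]
cf         <- if (k > 1) cum_freq[k - 1] else 0
median_est <- lower[k] + (n / 2 - cf) / freq[k] * (upper[k] - lower[k])

# Mode estimated as 3 * median - 2 * mean
mode2_est <- 3 * median_est - 2 * mean_est
mode2_est  # about 1948.38
```

For these data the two methods give similar results (about 1939 with method = 1 and about 1948 with method = 2), but they can diverge considerably for skewed or irregular distributions.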
A single numeric value representing the grouped mean, median, or mode, depending on which function was called.
mydf <- structure(list(salary = c("1500-1600", "1600-1700", "1700-1800",
"1800-1900", "1900-2000", "2000-2100", "2100-2200", "2200-2300",
"2300-2400", "2400-2500"), number = c(110L, 180L, 320L, 460L,
850L, 250L, 130L, 70L, 20L, 10L)), .Names = c("salary", "number"),
class = "data.frame", row.names = c(NA, -10L))
mydf
with(mydf, grouped_median(frequencies = number, intervals = salary, sep = "-"))
## Example with intervals manually specified
Freq <- mydf$number
X <- cbind(c(1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400),
c(1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500))
grouped_median(Freq, X)
# Using `cut`
set.seed(1)
x <- sample(100, 100, replace = TRUE)
y <- data.frame(table(cut(x, 10)))
with(y, grouped_mean(Freq, Var1, sep = ",", trim = "cut"))
mean(x)
with(y, grouped_median(Freq, Var1, sep = ",", trim = "cut"))
median(x)
## Note that the mode might be really far off depending on the approach used
with(y, grouped_mode(Freq, Var1, sep = ",", trim = "cut"))
with(y, grouped_mode(Freq, Var1, sep = ",", trim = "cut", method = 2))
tail(sort(table(x)))