fmode    R Documentation
fmode is a generic function and returns the (column-wise) statistical mode, i.e. the most frequent value of x, (optionally) grouped by g and/or weighted by w. The TRA argument can further be used to transform x using its (grouped, weighted) mode. Ties between multiple possible modes can be resolved by taking the minimum, maximum, first (default) or last occurring mode.
fmode(x, ...)
## Default S3 method:
fmode(x, g = NULL, w = NULL, TRA = NULL, na.rm = .op[["na.rm"]],
use.g.names = TRUE, ties = "first", nthreads = .op[["nthreads"]], ...)
## S3 method for class 'matrix'
fmode(x, g = NULL, w = NULL, TRA = NULL, na.rm = .op[["na.rm"]],
use.g.names = TRUE, drop = TRUE, ties = "first", nthreads = .op[["nthreads"]], ...)
## S3 method for class 'data.frame'
fmode(x, g = NULL, w = NULL, TRA = NULL, na.rm = .op[["na.rm"]],
use.g.names = TRUE, drop = TRUE, ties = "first", nthreads = .op[["nthreads"]], ...)
## S3 method for class 'grouped_df'
fmode(x, w = NULL, TRA = NULL, na.rm = .op[["na.rm"]],
use.g.names = FALSE, keep.group_vars = TRUE, keep.w = TRUE, stub = .op[["stub"]],
ties = "first", nthreads = .op[["nthreads"]], ...)
x
a vector, matrix, data frame or grouped data frame (class 'grouped_df').
g
a factor,
w
a numeric vector of (non-negative) weights, may contain missing values.
TRA
an integer or quoted operator indicating the transformation to perform:
0 - "na" | 1 - "fill" | 2 - "replace" | 3 - "-" | 4 - "-+" | 5 - "/" | 6 - "%" | 7 - "+" | 8 - "*" | 9 - "%%" | 10 - "-%%". See
na.rm
logical. Skip missing values in
use.g.names
logical. Make group-names and add to the result as names (default method) or row-names (matrix and data frame methods). No row-names are generated for data.table's.
ties
an integer or character string specifying the method to resolve ties between multiple possible modes i.e. multiple values with the maximum frequency or sum of weights:
Note:
nthreads
integer. The number of threads to utilize. Parallelism is across groups for grouped computations and at the column-level otherwise.
drop
matrix and data.frame method: Logical.
keep.group_vars
grouped_df method: Logical.
keep.w
grouped_df method: Logical. Retain
stub
character. If
...
arguments to be passed to or from other methods. If
fmode implements a fast C-level hashing algorithm, inspired by the kit package, to find the statistical mode.
If na.rm = FALSE, NA is not removed but treated as any other value (i.e. its frequency is counted). If all values are NA, NA is always returned.
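The effect of counting NA as an ordinary value can be illustrated with a small base R sketch. This is not the C-level implementation; it only uses table() to show how the frequencies differ when NA is kept or dropped. The variable names are illustrative.

```r
# Base R illustration only: frequency counts with and without NA.
x <- c(1, 3, 2, 2, 4, 4, 1, 7, NA, NA, NA)

counts_with_na <- table(x, useNA = "ifany")  # NA gets its own cell
counts_without <- table(x)                   # NA is dropped

# With NA counted, NA is the most frequent "value" (3 occurrences).
mode_with_na <- names(counts_with_na)[which.max(counts_with_na)]

# Without NA, the highest frequency among the remaining values is 2.
max_count_without <- max(counts_without)
```

Here mode_with_na is NA, while without NA several values share the top frequency, so a tie rule is needed.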
The weighted mode is computed by summing up the weights for all distinct values and choosing the value with the largest sum. If na.rm = TRUE, missing values will be removed from both x and w, i.e. utilizing only x[complete.cases(x,w)] and w[complete.cases(x,w)].
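The weighted-mode rule described above can be sketched in base R. The helper weighted_mode() is hypothetical and written only for illustration; it always drops incomplete cases (the na.rm = TRUE behavior) and breaks ties by first appearance among the distinct values, which is an assumption of this sketch and not necessarily how fmode resolves ties.

```r
# Hypothetical base R reference sketch, not collapse's implementation.
weighted_mode <- function(x, w) {
  keep <- complete.cases(x, w)          # drop pairs where x or w is missing
  x <- x[keep]
  w <- w[keep]
  if (length(x) == 0L) return(x[NA_integer_])
  ux <- unique(x)                       # distinct values, in order of appearance
  sums <- tapply(w, match(x, ux), sum)  # sum of weights per distinct value
  ux[which.max(sums)]                   # value with the largest weight sum
}

x <- c("a", "b", "a", "c", NA)
w <- c(1, 5, 2, NA, 10)
result <- weighted_mode(x, w)
```

In this made-up example "a" occurs most often, but "b" carries the larger total weight (5 versus 3), so the weighted mode is "b".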
It is possible that multiple values are tied for the mode (i.e. share the maximum frequency or sum of weights). Typical cases are simply when all values are either all the same or all distinct. In such cases, the default option ties = "first" returns the first occurring value in the data reaching the maximum frequency count or sum of weights. For example, in a sample x = c(1, 3, 2, 2, 4, 4, 1, 7), the first mode is 2, as fmode goes through the data from left to right. ties = "last", on the other hand, gives 1. It is also possible to take the minimum or maximum mode, i.e. fmode(x, ties = "min") returns 1, and fmode(x, ties = "max") returns 4. Note that ties = "min" and ties = "max" give unintuitive results for character data (no strict alphabetic sorting, similar to using < and > to compare character values in R). These options are also best avoided if missing values are counted (na.rm = FALSE), since no proper logical comparison with missing values is possible. With numeric data the outcome depends on the data: since in C++ any comparison with NA_real_ evaluates to FALSE, NA_real_ is chosen as the min or max mode only if it is also the first mode, and never otherwise. For integer data, NA_integer_ is stored as the smallest integer in C++, so it will always be chosen as the min mode and never as the max mode. For character data, NA_character_ is stored as the string "NA" in C++, and thus the behavior depends on the other character content.
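The four tie rules can be reproduced on the sample above with a short base R sketch. The helper resolve_ties() is hypothetical; it encodes one reading of the description ("first" and "last" refer to the order in which values reach the maximum count during a left-to-right scan) and assumes x contains no missing values.

```r
# Hypothetical base R sketch of the tie rules; assumes no NA in x.
resolve_ties <- function(x, ties = c("first", "last", "min", "max")) {
  ties <- match.arg(ties)
  # Running occurrence number of each value, scanning left to right.
  running <- ave(seq_along(x), x, FUN = seq_along)
  # Values at the positions where the maximum count is reached, in scan order.
  hits <- x[running == max(running)]
  switch(ties,
         first = hits[1L],
         last  = hits[length(hits)],
         min   = min(hits),
         max   = max(hits))
}

x <- c(1, 3, 2, 2, 4, 4, 1, 7)
res <- vapply(c("first", "last", "min", "max"),
              function(t) resolve_ties(x, t), numeric(1))
```

For this sample the values 2, 4 and 1 reach a count of two at positions 4, 6 and 7 respectively, which is why "first" yields 2 and "last" yields 1.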
fmode preserves all the attributes of the objects it is applied to (apart from names or row-names which are adjusted as necessary in grouped operations). If a data frame is passed to fmode and drop = TRUE (the default), unlist will be called on the result, which might not be sensible depending on the data at hand.
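Why unlist can be a poor fit for mixed-type data is easy to see in base R. The one-row data frame below is made up for illustration and merely stands in for a row of column modes.

```r
# Made-up one-row data frame with character, numeric and logical columns.
one_row <- data.frame(country = "Afghanistan", year = 1960, oecd = FALSE,
                      stringsAsFactors = FALSE)

# unlist() must return a single atomic vector, so every column is
# coerced to the most general type present, here character.
flat <- unlist(one_row)
```

After unlisting, the year and the logical flag are stored as the strings "1960" and "FALSE". Keeping the data frame structure (drop = FALSE) avoids this coercion.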
The (w weighted) statistical mode of x, grouped by g, or (if TRA is used) x transformed by its (grouped, weighted) mode.
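What "transformed by its grouped mode" means can be sketched in base R for the subtraction operator "-". The helpers below are hypothetical and only mimic the idea: expand each group's mode to the full length of x, then combine it with x. They assume no missing values and break ties by first appearance.

```r
# Hypothetical helper: unweighted mode, ties broken by first appearance.
simple_mode <- function(v) {
  u <- unique(v)
  u[which.max(tabulate(match(v, u)))]
}

x <- c(1, 1, 2, 5, 5, 7)
g <- c("a", "a", "a", "b", "b", "b")

# Group modes expanded to the length of x (one value per observation).
expanded <- ave(x, g, FUN = simple_mode)

# Subtracting the group mode from each observation, the idea behind "-".
centered <- x - expanded
```

In this made-up example the group modes are 1 and 5, so centered holds each observation's deviation from its group mode.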
fmean, fmedian, Fast Statistical Functions, Collapse Overview
x <- c(1, 3, 2, 2, 4, 4, 1, 7, NA, NA, NA)
fmode(x) # Default is ties = "first"
fmode(x, ties = "last")
fmode(x, ties = "min")
fmode(x, ties = "max")
fmode(x, na.rm = FALSE) # Here NA is the mode, regardless of ties option
fmode(x[-length(x)], na.rm = FALSE) # Not anymore..
## World Development Data
attach(wlddev)
## default vector method
fmode(PCGDP) # Numeric mode
head(fmode(PCGDP, iso3c)) # Grouped numeric mode
head(fmode(PCGDP, iso3c, LIFEEX)) # Grouped and weighted numeric mode
fmode(region) # Factor mode
fmode(date) # Date mode (defaults to first value since panel is balanced)
fmode(country) # Character mode (also defaults to first value)
fmode(OECD) # Logical mode
# ..all the above can also be performed grouped and weighted
## matrix method
m <- qM(airquality)
fmode(m)
fmode(m, na.rm = FALSE) # NA frequency is also counted
fmode(m, airquality$Month) # Groupwise
fmode(m, w = airquality$Day) # Weighted: Later days in the month are given more weight
fmode(m>50, airquality$Month) # Groupwise logical mode
# etc..
## data.frame method
fmode(wlddev) # Calling unlist -> coerce to character vector
fmode(wlddev, drop = FALSE) # Gives one row
head(fmode(wlddev, iso3c)) # Grouped mode
head(fmode(wlddev, iso3c, LIFEEX)) # Grouped and weighted mode
detach(wlddev)