standardize | R Documentation |
Performs a standardization of data (z-scoring), i.e., centering and scaling,
so that the data is expressed in terms of standard deviation (i.e., mean = 0,
SD = 1) or Median Absolute Deviation (median = 0, MAD = 1). When applied to a
statistical model, this function extracts the dataset, standardizes it, and
refits the model with this standardized version of the dataset. The normalize()
function can also be used to scale all numeric variables within the 0 - 1 range.
For model standardization, see standardize.default().
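The two scalings described above can be illustrated with a small base-R sketch. This is not the package's implementation; the vector `x` is made up for illustration, and the robust variant assumes the MAD is computed with `stats::mad()` and its default scaling constant.

```r
# Illustrative data (hypothetical values, not from the package)
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Classic z-score: center on the mean, scale by the standard deviation
z <- (x - mean(x)) / sd(x)

# Robust variant: center on the median, scale by the MAD
# (stats::mad() applies the constant 1.4826 by default)
z_robust <- (x - median(x)) / mad(x)

c(mean = mean(z), sd = sd(z))                      # approximately 0 and 1
c(median = median(z_robust), mad = mad(z_robust))  # approximately 0 and 1
```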
standardize(x, ...)
standardise(x, ...)
## S3 method for class 'numeric'
standardize(
x,
robust = FALSE,
two_sd = FALSE,
weights = NULL,
reference = NULL,
center = NULL,
scale = NULL,
verbose = TRUE,
...
)
## S3 method for class 'factor'
standardize(
x,
robust = FALSE,
two_sd = FALSE,
weights = NULL,
force = FALSE,
verbose = TRUE,
...
)
## S3 method for class 'data.frame'
standardize(
x,
select = NULL,
exclude = NULL,
robust = FALSE,
two_sd = FALSE,
weights = NULL,
reference = NULL,
center = NULL,
scale = NULL,
remove_na = c("none", "selected", "all"),
force = FALSE,
append = FALSE,
ignore_case = FALSE,
regex = FALSE,
verbose = TRUE,
...
)
unstandardize(x, ...)
unstandardise(x, ...)
## S3 method for class 'numeric'
unstandardize(
x,
center = NULL,
scale = NULL,
reference = NULL,
robust = FALSE,
two_sd = FALSE,
...
)
## S3 method for class 'data.frame'
unstandardize(
x,
center = NULL,
scale = NULL,
reference = NULL,
robust = FALSE,
two_sd = FALSE,
select = NULL,
exclude = NULL,
ignore_case = FALSE,
regex = FALSE,
verbose = TRUE,
...
)
x |
A (grouped) data frame, a vector or a statistical model (for
|
... |
Arguments passed to or from other methods. |
robust |
Logical, if |
two_sd |
If |
weights |
Can be
|
reference |
A data frame or variable from which the centrality and deviation will be computed instead of from the input variable. Useful for standardizing a subset or new data according to another data frame. |
center , scale |
|
verbose |
Toggle warnings and messages on or off. |
force |
Logical, if |
select |
Variables that will be included when performing the required tasks. Can be either
If |
exclude |
See |
remove_na |
How should missing values ( |
append |
Logical or string. If |
ignore_case |
Logical, if |
regex |
Logical, if |
The standardized object (either a standardized data frame or a statistical model fitted on standardized data).
The select argument
For most functions that have a select argument (including this function),
the complete input data frame is returned, even when select only selects
a range of variables. That is, the function is only applied to those variables
that have a match in select, while all other variables remain unchanged.
In other words: for this function, select will not omit any non-included
variables, so that the returned data frame will include all variables
from the input data frame.
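The behaviour described above can be mimicked in base R. The sketch below is only an illustration of the idea (transform the selected columns, leave the rest untouched), not the function's actual code; the helper name `selected` is made up.

```r
d <- iris[1:4, ]
selected <- c("Sepal.Length", "Sepal.Width")  # hypothetical selection

# Standardize only the selected columns; all other columns are carried over
out <- d
out[selected] <- lapply(d[selected], function(col) (col - mean(col)) / sd(col))

names(out)                                      # same columns as the input
identical(out$Petal.Length, d$Petal.Length)     # unselected column is unchanged
```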
When x is a vector or a data frame with remove_na = "none",
missing values are preserved, so the return value has the same length /
number of rows as the original input.
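A base-R sketch of what "missing values are preserved" means for a vector follows. It assumes the center and scale are computed from the non-missing values only; the input vector is made up for illustration.

```r
x <- c(1, 2, NA, 4, 5)  # hypothetical input with one missing value

# Compute mean and SD from the observed values, then transform every element;
# the NA stays NA and keeps its position
z <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

length(z) == length(x)  # same length as the input
is.na(z)                # NA is still in position 3
```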
See center() for grand-mean centering of variables, and
makepredictcall.dw_transformer() for use in model formulas.
Other transform utilities: normalize(), ranktransform(), rescale(), reverse()
Other standardize: standardize.default()
d <- iris[1:4, ]
# vectors
standardise(d$Petal.Length)
# Data frames
# overwrite
standardise(d, select = c("Sepal.Length", "Sepal.Width"))
# append
standardise(d, select = c("Sepal.Length", "Sepal.Width"), append = TRUE)
# append, suffix
standardise(d, select = c("Sepal.Length", "Sepal.Width"), append = "_std")
# standardizing with reference center and scale
d <- data.frame(
a = c(-2, -1, 0, 1, 2),
b = c(3, 4, 5, 6, 7)
)
# default standardization, based on mean and sd of each variable
standardize(d) # means are 0 and 5, sd ~ 1.581139
# standardization, based on mean and sd set to the same values
standardize(d, center = c(0, 5), scale = c(1.581, 1.581))
# standardization, mean and sd for each variable newly defined
standardize(d, center = c(3, 4), scale = c(2, 4))
# standardization, taking same mean and sd for each variable
standardize(d, center = 1, scale = 3)
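The usage section also lists unstandardize(), which has no example here. As a hedged base-R sketch (not the package's code), the round trip can be written out by hand, assuming the linear form (x - center) / scale implied by the comments in the examples; the names `center_vals` and `scale_vals` are made up.

```r
d <- data.frame(
  a = c(-2, -1, 0, 1, 2),
  b = c(3, 4, 5, 6, 7)
)
center_vals <- c(a = 3, b = 4)  # hypothetical names; values from the example
scale_vals  <- c(a = 2, b = 4)

# Forward: (value - center) / scale, column by column
z <- as.data.frame(Map(function(col, m, s) (col - m) / s, d, center_vals, scale_vals))

# Back: value * scale + center recovers the original data
back <- as.data.frame(Map(function(col, m, s) col * s + m, z, center_vals, scale_vals))

all.equal(back, d)
```

Supplying the same center and scale in both directions is what makes the transformation reversible.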