Gaussianize | R Documentation |
Gaussianize is probably the most useful function in this package. It works the same way as scale, but instead of just centering and scaling the data, it actually Gaussianizes the data (works well for unimodal data). See Goerg (2011, 2016) and Examples.
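To see why centering and scaling alone are not enough, the sketch below uses only base R (no LambertW calls) to check that scale() leaves the shape of a heavy-tailed sample unchanged. The helper excess_kurtosis is a hypothetical name introduced here for illustration, and the t-distributed sample is an arbitrary choice.

```r
# Hypothetical helper (not part of any package): sample excess kurtosis.
excess_kurtosis <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^4) - 3
}

set.seed(1)
y <- rt(1000, df = 5)      # heavy-tailed sample
z <- as.numeric(scale(y))  # centered and scaled copy of y

# scale() fixes location and spread ...
c(mean = mean(z), sd = sd(z))
# ... but the tail behaviour is the same as before.
c(before = excess_kurtosis(y), after = excess_kurtosis(z))
```

The two kurtosis values agree up to floating-point error, which is the gap a Gaussianizing transformation is meant to close.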
Important: For multivariate input X it performs a column-wise Gaussianization (by simply calling apply(X, 2, Gaussianize)), which is only a marginal Gaussianization. This does not mean (and is in general definitely not the case) that the transformed data is then jointly Gaussian.
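The gap between marginal and joint Gaussianity can be illustrated with a standard textbook construction in base R. This is a sketch that does not call Gaussianize; the variable names are illustrative only.

```r
set.seed(42)
n  <- 10000
x1 <- rnorm(n)
s  <- sample(c(-1, 1), n, replace = TRUE)  # random sign, independent of x1
x2 <- s * x1                                # also N(0, 1), by symmetry

# Normality test p-values for each margin (on a subsample, since
# shapiro.test() accepts at most 5000 observations).
shapiro.test(x1[1:500])$p.value
shapiro.test(x2[1:500])$p.value

# The sum is exactly zero whenever s == -1, i.e. about half the time.
# A jointly Gaussian pair cannot do this: its sum would be Gaussian
# (or constant), never a mixture with a point mass at zero.
frac_zero <- mean(x1 + x2 == 0)
frac_zero
```

Both columns are marginally standard normal, yet the pair is clearly not bivariate normal. Column-wise Gaussianization gives the same kind of marginal guarantee and nothing more.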
By default Gaussianize returns the X \sim N(\mu_x, \sigma_x^2) input, not the zero-mean, unit-variance U \sim N(0, 1) input. Use return.u = TRUE to obtain U.
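The relation between the two return options can be sketched in base R, assuming the usual location-scale link U = (X - \mu_x) / \sigma_x. The parameter values and variable names below are illustrative assumptions, and the snippet does not call Gaussianize.

```r
set.seed(7)
mu_x    <- 3   # assumed location, for illustration only
sigma_x <- 2   # assumed scale, for illustration only

x <- rnorm(1000, mean = mu_x, sd = sigma_x)  # plays the role of X
u <- (x - mu_x) / sigma_x                    # plays the role of U

# Going back from U to X only needs the same two parameters.
x_back <- mu_x + sigma_x * u
all.equal(x, x_back)
```

Under this assumption, X keeps the location and scale of the data, while U is the standardized version.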
Gaussianize(
data = NULL,
type = c("h", "hh", "s"),
method = c("IGMM", "MLE"),
return.tau.mat = FALSE,
inverse = FALSE,
tau.mat = NULL,
verbose = FALSE,
return.u = FALSE,
input.u = NULL
)
data |
a numeric matrix-like object; either the data that should be
Gaussianized, or the data that should be “DeGaussianized” ( |
type |
what type of non-normality: symmetric heavy-tails |
method |
what estimator should be used: |
return.tau.mat |
logical; if |
inverse |
logical; if |
tau.mat |
instead of estimating |
verbose |
logical; if |
return.u |
logical; if |
input.u |
optional; if you used |
numeric matrix-like object with same dimension/size as input data. If inverse = FALSE it is the Gaussianized matrix / vector; if TRUE it is the “DeGaussianized” matrix / vector.
The mean, scale, and skewness/heavy-tail parameters that were used in the Gaussianizing transformation are returned as attributes of the output matrix: 'Gaussianized:mu', 'Gaussianized:sigma', and for
type = "h": |
|
type = "hh": |
|
type = "s": |
|
They can also be returned as a separate matrix using return.tau.mat = TRUE. In this case Gaussianize returns a list with elements:
input |
Gaussianized input data |
tau.mat |
matrix
with |
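A short sketch of the two ways to read the estimated parameters, using only the argument and attribute names documented on this page. It assumes the LambertW package is installed and skips the calls otherwise; the t-distributed sample is an arbitrary choice for illustration.

```r
has_lambertw <- requireNamespace("LambertW", quietly = TRUE)

if (has_lambertw) {
  set.seed(3)
  y <- rt(500, df = 4)  # arbitrary heavy-tailed sample

  # Option 1: parameters attached as attributes of the returned object.
  x <- LambertW::Gaussianize(y, type = "h")
  print(attr(x, "Gaussianized:mu"))
  print(attr(x, "Gaussianized:sigma"))

  # Option 2: parameters returned alongside the data in a list.
  out <- LambertW::Gaussianize(y, type = "h", return.tau.mat = TRUE)
  print(names(out))
  print(out$tau.mat)
}
```

The list form is convenient when the same parameters are needed later, for example to pass tau.mat back in with inverse = TRUE.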
# Univariate example
library(LambertW)  # provides Gaussianize(), get_input(), test_normality()
set.seed(20)
y1 <- rcauchy(n = 100)
out <- Gaussianize(y1, return.tau.mat = TRUE)
x1 <- get_input(y1, c(out$tau.mat[, 1])) # same as out$input
test_normality(out$input) # Gaussianized a Cauchy!
kStartFrom <- 20
y.cum.avg <- (cumsum(y1)/seq_along(y1))[-seq_len(kStartFrom)]
x.cum.avg <- (cumsum(x1)/seq_along(x1))[-seq_len(kStartFrom)]
plot(c((kStartFrom + 1):length(y1)), y.cum.avg, type = "l", lwd = 2,
     main = "CLT in practice", xlab = "n",
     ylab = "Cumulative sample average",
     ylim = range(y.cum.avg, x.cum.avg))
lines(c((kStartFrom + 1):length(y1)), x.cum.avg, col = 2, lwd = 2)
abline(h = 0)
grid()
legend("bottomright", c("Cauchy", "Gaussianize"), col = c(1, 2),
       box.lty = 0, lwd = 2, lty = 1)
plot(x1, y1, xlab="Gaussian-like input", ylab = "Cauchy - output")
grid()
## Not run:
# multivariate example
y2 <- 0.5 * y1 + rnorm(length(y1))
YY <- cbind(y1, y2)
plot(YY)
XX <- Gaussianize(YY, type = "hh")
plot(XX)
out <- Gaussianize(YY, type = "h", return.tau.mat = TRUE,
                   verbose = TRUE, method = "IGMM")
plot(out$input)
out$tau.mat
YY.hat <- Gaussianize(data = out$input, tau.mat = out$tau.mat,
                      inverse = TRUE)
plot(YY.hat[, 1], YY[, 1])
## End(Not run)