View source: R/p-categorical.R
pcat | R Documentation |
The function tries to merge similar levels of a given factor. It is based on ideas given by Tutz (2013).
pcat(fac, df = NULL, lambda = NULL, method = c("ML", "GAIC"), start = 0.001,
Lp = 0, kappa = 1e-05, iter = 100, c.crit = 1e-04, k = 2)
gamlss.pcat(x, y, w, xeval = NULL, ...)
plotDF(y, factor = NULL, formula = NULL, data, along = seq(0, nlevels(factor)),
kappa = 1e-06, Lp = 0, ...)
plotLambda(y, factor = NULL, formula = NULL, data, along = seq(-2, 2, 0.1),
kappa = 1e-06, Lp = 0, ...)
fac , factor |
a factor whose levels are to be reduced |
df |
the effective degrees of freedom df |
lambda |
the smoothing parameter |
method |
which method is used for the estimation of the smoothing parameter, |
start |
starting value for |
Lp |
The type of penalty required, |
kappa |
a regularization parameter used for the weights in the penalties. |
iter |
the number of internal iterations allowed |
c.crit |
the convergence criterion |
k |
the penalty if |
x |
explanatory factor |
y |
the response or iterative response variable |
w |
iterative weights |
xeval |
indicator whether to predict |
formula |
A formula |
data |
A data frame |
along |
a sequence of values |
... |
for extra variables |
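The usage in the Examples pairs method = "GAIC" with k = log(200), where 200 is the number of observations. As a reminder of what such a penalty does, a generalised Akaike information criterion is the deviance plus k times the degrees of freedom, so k = 2 gives the AIC and k = log(n) gives the BIC. The sketch below checks this with base R only; it uses lm() as a stand-in model, and the helper gaic() is a hypothetical name introduced for illustration, not part of gamlss.

```r
# Stand-in model: an ordinary linear regression fitted with base R.
set.seed(2016)
n <- 200
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n)
fit <- lm(y ~ x)

# Deviance (-2 * log-likelihood) and the number of estimated parameters.
dev <- -2 * as.numeric(logLik(fit))
df  <- attr(logLik(fit), "df")

# Hypothetical helper: deviance plus penalty k per degree of freedom.
gaic <- function(k) dev + k * df

all.equal(gaic(2), AIC(fit))        # k = 2 reproduces the AIC
all.equal(gaic(log(n)), BIC(fit))   # k = log(n) reproduces the BIC
```

A larger k penalises each degree of freedom more heavily, which favours fits with fewer distinct levels.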
The function pcat() is used for fitting the factor. It shrinks the levels of the categorical factor towards each other, not towards the overall mean as the function random() does. This results in a reduction of the number of levels of the factor. Different norms can be used for the shrinkage by specifying the argument Lp.
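The idea of shrinking levels towards each other can be sketched in base R. This is a minimal illustration, not the code used by pcat(): it penalises the absolute pairwise differences between level coefficients, approximates the absolute value by a smooth function, and solves the result by iteratively reweighted ridge regression. The function name fuse_levels(), the difference matrix D, and the particular smooth approximation are assumptions made for this sketch; the argument names kappa, iter and c.crit are borrowed from the usage above but their exact roles in pcat() may differ.

```r
# Simulated data: six levels whose true means form three groups.
set.seed(1)
n_levels   <- 6
true_means <- c(0, 0, 0, 2, 2, 5)
level <- factor(rep(seq_len(n_levels), each = 50))
y     <- true_means[level] + rnorm(length(level), sd = 0.3)

X <- model.matrix(~ level - 1)          # one indicator column per level

# D has one row per pair of levels (i < j): +1 for level i, -1 for level j.
pairs <- t(combn(n_levels, 2))
D <- matrix(0, nrow(pairs), n_levels)
D[cbind(seq_len(nrow(pairs)), pairs[, 1])] <-  1
D[cbind(seq_len(nrow(pairs)), pairs[, 2])] <- -1

# Hypothetical helper (illustration only): penalised least squares with
# lambda * sum(|beta_i - beta_j|), where |d| is approximated by
# d^2 / sqrt(d^2 + kappa^2) and the weights are updated each iteration.
fuse_levels <- function(X, y, D, lambda, kappa = 1e-5,
                        iter = 100, c.crit = 1e-4) {
  XtX  <- crossprod(X)
  Xty  <- crossprod(X, y)
  beta <- solve(XtX, Xty)               # start from the per-level means
  for (i in seq_len(iter)) {
    w <- 1 / sqrt(as.vector(D %*% beta)^2 + kappa^2)
    beta_new <- solve(XtX + lambda * t(D) %*% (w * D), Xty)
    converged <- max(abs(beta_new - beta)) < c.crit
    beta <- beta_new
    if (converged) break
  }
  as.vector(beta)
}

round(fuse_levels(X, y, D, lambda = 0), 2)     # no penalty: plain level means
round(fuse_levels(X, y, D, lambda = 5), 2)     # moderate penalty
round(fuse_levels(X, y, D, lambda = 1e4), 2)   # heavy penalty: one common value
```

With no penalty the coefficients are the ordinary level means; as lambda grows, coefficients of similar levels are pulled together, and with a very large lambda all levels collapse to a single value.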
The function pcat returns a vector endowed with a number of attributes. The vector itself is used in the construction of the model matrix, while the attributes are needed for the backfitting algorithm additive.fit(). The backfitting is done in gamlss.pcat.
Note that pcat itself does no smoothing; it simply sets things up for gamlss.pcat() to do the smoothing within the backfitting.
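The division of labour described above, where one function only sets things up and another does the fitting, can be illustrated with a toy pair of functions. Everything in this sketch is hypothetical: toy_term() and toy_fit() are names invented here, the attributes they use are not those of pcat(), and the toy fitter computes plain weighted level means rather than a penalised fit.

```r
# Hypothetical set-up function: returns a placeholder vector that carries
# what the fitter will need later as attributes.
toy_term <- function(fac, lambda = 1) {
  stopifnot(is.factor(fac))
  x <- rep(0, length(fac))         # placeholder column for a model matrix
  attr(x, "fac")    <- fac         # the factor itself
  attr(x, "lambda") <- lambda      # a tuning value chosen by the user
  x
}

# Hypothetical fitting function: reads the attributes and does the work.
# Here it returns weighted level means; a real fitter would apply a penalty.
toy_fit <- function(x, y, w = rep(1, length(y))) {
  fac <- attr(x, "fac")
  level_means <- tapply(w * y, fac, sum) / tapply(w, fac, sum)
  list(coef   = level_means,
       fitted = as.numeric(level_means[as.integer(fac)]))
}

fac  <- factor(c("a", "b", "a", "b"))
term <- toy_term(fac, lambda = 2)  # nothing is fitted at this point
attributes(term)                   # the stored factor and tuning value
fit  <- toy_fit(term, y = c(1, 2, 3, 4))
fit$fitted
```

Keeping the set-up separate means the same term object can be passed repeatedly to the fitter with updated responses and weights, which is the situation inside a backfitting loop.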
Mikis Stasinopoulos, Paul Eilers and Marco Enea
Tutz G. (2013) Regularization and Sparsity in Discrete Structures, in the Proceedings of the 29th International Workshop on Statistical Modelling, Volume 1, pp. 29-42, Göttingen, Germany.
Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found in https://www.gamlss.com/.
Stasinopoulos D. M. and Rigby R. A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also https://www.gamlss.com/).
random
library(gamlss) # gamlss() is needed for the model fits below
# Simulate data 1
n <- 10 # number of levels
m <- 200 # number of observations
set.seed(2016)
level <- as.factor(floor(runif(m) * n) + 1)
a0 <- rnorm(n)
sigma <- 0.4
mu <- a0[level]
y <- mu + sigma * rnorm(m)
plot(y~level)
points(1:10,a0, col="red")
da1 <- data.frame(y, level)
#------------------
mn <- gamlss(y~1,data=da1 ) # null model
ms <- gamlss(y~level-1, data=da1) # saturated model
m1 <- gamlss(y~pcat(level), data=da1) # calculating lambda ML
AIC(mn, ms, m1)
## Not run:
m11 <- gamlss(y~pcat(level, method="GAIC", k=log(200)), data=da1) # GAIC
AIC(mn, ms, m1, m11)
# getting the fitted object ----------------------------------------------------
getSmo(m1)
coef(getSmo(m1))
fitted(getSmo(m1))[1:10]
plot(getSmo(m1)) #
# After the fit a new factor is created; this factor has the reduced levels
levels(getSmo(m1)$factor)
# -----------------------------------------------------------------------------
## End(Not run)