Description Usage Arguments Details Value Cautionary notes Author(s) References See Also Examples
Function selm
fits a l
inear m
odel
with s
kewe
lliptical error term.
The term ‘skewelliptical distribution’ is an abbreviated equivalent
of skewelliptically contoured (SEC) distribution.
The function works for univariate and multivariate response variables.
1 2 3 
formula 
an object of class 
family 
a character string which selects the parametric family
of SEC type assumed for the error term. It must be one of

data 
an optional data frame containing the variables in
the model. If not found in 
weights 
a numeric vector of weights associated to individual
observations. Weights are supposed to represent frequencies, hence must be
nonnegative integers (not all 0) and 
subset 
an optional vector specifying a subset of observations
to be used in the fitting process. It works like the same parameter
in 
na.action 
a function which indicates what should happen
when the data contain 
start 
a vector (in the univariate case) or a list (in the
multivariate case) of initial DP values for searching the
parameter estimates. See ‘Details’ about a choice of
start to be avoided. If 
fixed.param 
a list of assignments of parameter values which must
be kept fixed in the estimation process.
Currently, there only two types of admissible constraint: one is to
set 
method 
a character string which selects the estimation method to be
used for fitting. Currently two options exist: 
penalty 
a character string which denotes the penalty function to be
subtracted to the loglikelihood function, when 
offset 
this can be used to specify an a priori known
component to be included in the linear predictor during fitting. This
should be 
model, x, y 
logicals. If 
... 
optional control parameters, as follows.

By default, selm
fits the selected model by maximum
likelihood estimation (MLE), making use of some numerical
optimization method. Maximization is performed in one
parameterization, usually DP, and then the estimates are mapped to
other parameter sets, CP and pseudoCP;
see dp2cp
for more information on parameterizations.
These parameter transformations are carried out trasparently to the user.
The observed information matrix is used to obtain the estimated variance
matrix of the MLE's and from this the standard errors.
Background information on MLE in the context of SEC
distributions is provided by Azzalini and Capitanio (2014);
see specifically Chapter 3, Sections 4.3, 5.2, 6.2.5–6. For additional
information, see the original research work referenced therein as well as
the sources quoted below.
Although the density functionof SEC distributions are expressed using
DP parameter sets, the methods associated to the objects created
by this function communicate, by default, their outcomes in the CP
parameter set, or its variant form pseudoCP when CP
does not exist; the ‘Note’ at summary.selm
explains why.
A more detailed discussion is provided by Azzalini and Capitanio
(1999, Section 5.2) and ArellanoValle and Azzalini (2008, Section 4),
for the univariate and the multivariate SN case, respectively;
an abriged account is available in Sections 3.1.4–6 and 5.2.3 of
Azzalini and Capitanio (2014). For the ST case, see ArellanoValle
and Azzalini (2013).
There is a known open issue which affects computation of the information
matrix of the multivariate skewnormal distribution when the slant
parameter α approaches the null vector; see p.149 of
Azzalini and Capitanio (2014). Consequently, if a model with
multivariate response is fitted with family="SN"
and the estimate
alpha
of α is at the origin or neary so, the
information matrix and the standard errors are not computed and a
warning message is issued. In this unusual circumstance, a simple
workaround is to refit the model with family="ST"
, which will
work except in remote cases when (i) the estimated degrees of freedom
nu
diverge and (ii) still alpha
remains at the origin.
The optional argument fixed.param=list(alpha=0)
imposes the
constraint α=0 in the estimation process; in the multivariate
case, the expression is interpreted in the sense that all the components
of vector α are zero, which implies symmetry of the
error distribution, irrespectively of the parameterization
subsequently adopted for summaries and diagnostics.
When this restriction is selected, the estimation method cannot be
set to "MPLE"
. Under the constraint α=0,
if family="SN"
, the model is fitted similarly to lm
, except
that here MLE is used for estimation of the covariance matrix.
If family="ST"
or family="SC"
, a symmetric Student's t
or Cauchy distribution is adopted.
Under the constraint α=0, the location parameter xi coincides with the mode and the mean of the distribution, when the latter exists. In addition, when the covariance matrix of a ST distribution exists, it differs from Ω only by a multiplicative factor. Consequently, the summaries of a model of this sort automatically adopt the DP parametrization.
The other possible form of constraint allows to fix the degrees of
freedom when family="ST"
. The two constraints can be combined
writing, for instance, fixed.param=list(alpha=0, nu=6)
.
The constraint nu=1
is equivalent to select family="SC"
.
In practice, an expression of type fixed.param=list(..)
can be
abbreviated to fixed=list(..)
.
Argument start allows to set the initial values, with respect to the DP parameterization, of the numerical optimization. However, there is a specific choice of start to be avoided. When family="SN", do not set the shape parameter alpha exactly at 0, as this would blowup computation of the loglikelihood gradient and the Hessian matrix. This is not due to a software bug, but to a known peculiar behaviour of the loglikelihood function at that specific point. Therefore, in the univariate case for instance, do not set e.g. start=c(12, 21, 0), but set instead something like start=c(12, 21, 0.001). Recall that, if one needs to fit a model forcing 0 asymmetry, typically to compare two loglikelihood functions with/without asymmetry, then the option to use is fixed.param=list(alpha=0).
In some cases, especially for small sample size, the MLE occurs on
the frontier of the parameter space, leading to DP estimates with
abs(alpha)=Inf
or to a similar situation in the multivariate case
or in an alternative parameterization. Such outcome is regared by many as
unsatisfactory; surely it prevents using the observed information matrix to
compute standard errors. This problem motivates the use of maximum penalized
likelihood estimation (MPLE), where the regular loglikelihood
function log(L) is penalized by subtracting an amount
Q, say, increasingly large as α increases.
Hence the function which is maximized at the optimization stage is now
log(L)  Q. If method="MPLE"
and
penalty=NULL
, the default function Qpenalty
is used,
which implements the penalization:
Q(α)= c₁ log(1 + c₂ [α*]²)
where c₁ and c₂ are positive constants, which
depend on the degrees of freedom nu
in the ST
case,
[α*]² = α' cor(Ω) α
and cor(Ω) denotes the correlation matrix
associated to the scale matrix Omega
described in connection with
makeSECdistr
. In the univariate case
cor(Ω)=1,
so that [α*]²=α². Further information
on MPLE and this choice of the penalty function is given in
Section 3.1.8 and p.111 of Azzalini and Capitanio (2014); for a more
detailed account, see Azzalini and ArellanoValle (2013) and references
therein.
It is possible to change the penalty function, to be declared via the
argument penalty
. For instance, if the calling statement includes
penalty="anotherQ"
, the user must have defined
anotherQ < function(alpha_etc, nu = NULL, der = 0)
with the following arguments.
alpha_etc
: in the univariate case, a single value alpha
;
in the multivariate case, a twocomponent list whose first component is
the vector alpha
, the second one is matrix equal to
cov2cor(Omega)
.
nu
: degrees of freedom, only relevant if family="ST"
.
der
: a numeric value which indicates the required order of
derivation; if der=0
(default value), only the penalty Q
needs to be retuned by the function;
if der=1
, attr(Q, "der1")
must represent the
first order derivative of Q
with respect to alpha
; if
der=2
, also attr(Q, "der2")
must be assigned, containing
the second derivative (only required in the univariate case).
This function must return a single numeric value, possibly with required
attributes when is called with der>1
.
Since sn imports functions grad
and
hessian
from package numDeriv, one can rely
on them for numerical evaluation of the derivatives, if they are not
available in an explicit form.
This penalization scheme allows to introduce a prior distribution
π for α by setting Q=log(π),
leading to a maximum a posteriori estimate in the stated sense.
See Qpenalty
for more information and an illustration.
The actual computations are not performed within selm
which only
setsup ingredients for work of selm.fit
and other functions
further below this one. See selm.fit
for more information.
an S4 object of class selm
or mselm
, depending on whether
the response variable of the fitted model is univariate or multivariate;
these objects are described in the selm class
.
The first of these notes applies to the stage preceding the use of selm and related fitting procedures. Before fitting a model of this sort, consider whether you have enough data for this task. In this respect, the passage below taken from p.63 of Azzalini and Capitanio (2014) is relevant.
“Before entering technical aspects, it is advisable to underline a qualitative effect of working with a parametric family which effectively is regulated by moments up to the third order. The implication is that the traditional rule of thumb by which a sample size is small up to ‘about n = 30’, and then starts to become ‘large’, while sensible for a normal population or other twoparameter distribution, is not really appropriate here. To give an indication of a new threshold is especially difficult, because the value of α also has a role here. Under this caveat, numerical experience suggests that ‘about n = 50’ may be a more appropriate guideline in this context.”
The above passage referred to the univariate SN context. In the multivariate case, increase the sample size appropriately, particularly in the ST case. All of this is not to say that one cannot attempt fitting these models with small or moderate sample size. However, one must be aware of the implications and not be surprised of problems, if they appear.
The second cautionary note refers instead to the outcome of a call to
selm and related function, or the lack of it.
The estimates are obtained by numerical optimization methods and, as
usual in similar cases, there is no guarantee that the maximum of the
objective function is achieved. Consideration of model simplicity
and of numerical experience indicate that models with SN error
terms generally produce more reliable results compared to those with
the ST family. Take into account that models involving a
traditional Student's t distribution with unknown degres of freedom
can already be problematic; the presence of the (multivariate) slant parameter
α in the ST family cannot make things any simpler.
Consequently, care must be exercised, especially so if one works with
the (multivariate) ST family.
Consider refitting a model with different starting values and,
in the ST case, building the profile loglikelihood for a range
of ν values; function profile.selm
can be useful here.
Details on the numerical optimization which has produced object obj
can be estracted with slot(obj, "opt.method")
; inspection of this
component can be useful in problematic cases.
Be aware that occasionally optim
and nlminb
declare successful
completion of a regular minimization problem at a point where the Hessian
matrix is not positivedefinite. Two cases of this sort are presented in the
final portion of the examples below.
Adelchi Azzalini
ArellanoValle, R. B., and Azzalini, A. (2008). The centred parametrization for the multivariate skewnormal distribution. J. Multiv. Anal. 99, 1362–1382. Corrigendum: 100 (2009), 816.
ArellanoValle, R. B., and Azzalini, A. (2013, available online 12 June 2011). The centred parametrization and related quantities for the skewt distribution. J. Multiv. Anal. 113, 73–90.
Azzalini, A. and Capitanio, A. (1999). Statistical applications of the multivariate skew normal distribution. J.Roy.Statist.Soc. B 61, 579–602. Fulllength version available at http://arXiv.org/abs/0911.2093
Azzalini, A. and ArellanoValle, R. B. (2013, available online 30 June 2012). Maximum penalized likelihood estimation for skewnormal and skewt distributions. J. Stat. Planning & Inference 143, 419–433.
Azzalini, A. with the collaboration of Capitanio, A. (2014). The SkewNormal and Related Families. Cambridge University Press, IMS Monographs series.
Magnus and Neudecker
selmclass
for classes "selm"
and "mselm"
,
summary.selm
for summaries, plot.selm
for plots,
residuals.selm
for residuals and fitted values
the generic functions coef
, logLik
,
vcov
, profile
, confint
,
predict
the underlying function selm.fit
and those further down
the selection of a penalty function of the loglikelihood,
such as Qpenalty
the function extractSECdistr
to extract the SEC
error distribution from an object returned by selm
the broad underlying logic and a number of ingredients are like in
function lm
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72  data(ais)
m1 < selm(log(Fe) ~ BMI + LBM, family="SN", data=ais)
print(m1)
summary(m1)
s< summary(m1, "DP", cov=TRUE, cor=TRUE)
plot(m1)
plot(m1, param.type="DP")
logLik(m1)
coef(m1)
coef(m1, "DP")
var < vcov(m1)
#
m1a < selm(log(Fe) ~ BMI + LBM, family="SN", method="MPLE", data=ais)
m1b < selm(log(Fe) ~ BMI + LBM, family="ST", fixed.param=list(nu=8), data=ais)
#
data(barolo)
attach(barolo)
A75 < (reseller=="A" & volume==75)
logPrice < log(price[A75],10)
m < selm(logPrice ~ 1, family="ST")
summary(m)
plot(m, which=2, col=4, main="Barolo log10(price)")
# cfr Figure 4.7 of Azzalini & Capitanio (2014), p.107
detach(barolo)
#
# examples with multivariate response
#
m3 < selm(cbind(BMI, LBM) ~ WCC + RCC, family="SN", data=ais)
plot(m3, col=2, which=2)
summary(m3, "dp")
coef(m3)
coef(m3, vector=FALSE)
#
data(wines)
m28 < selm(cbind(chloride, glycerol, magnesium) ~ 1, family="ST", data=wines)
dp28 < coef(m28, "DP", vector=FALSE)
pcp28 < coef(m28, "pseudoCP", vector=FALSE)
# the next statement takes a little more time than others
plot(m28)
# example of computation and plot of a (relative twice) profile loglikelihood;
# since it takes some time, set a coarse grid of nu values
nu.vector < seq(3, 8, by=0.5)
logL < numeric(length(nu.vector))
for(k in 1:length(nu.vector)) {
m28.f < selm(cbind(chloride, glycerol, magnesium) ~ 1, family="ST",
fixed=list(nu=nu.vector[k]), data=wines)
logL[k] < logLik(m28.f)
cat(format(c(nu.vector[k], logL[k])), "\n")
}
plot(nu.vector, 2*(logLmax(logL)), type="b")
ok < which.max(logL)
abline(v=nu.vector[ok], lty=2)
# compare maximum of this curve with MLE of nu in summary(m28, 'dp')
#
m4 < selm(cbind(alcohol,sugar)~1, family="ST", data=wines)
m5 < selm(cbind(alcohol,sugar)~1, family="ST", data=wines, fixed=list(alpha=0))
print(1  pchisq(2*as.numeric(logLik(m4)logLik(m5)), 2)) # test for symmetry
#
# illustrate final passage of 'Warning' section above:
# the execution of the next selm command is known to produce warning messages
# although the optimizer declares successful convergence
m31 < selm(cbind(BMI, LBM)~ Ht + Wt, family="ST", data=ais)
# Warning message...
slot(m31, "opt.method")$convergence # a 0 value indicates success
# the next case is similar
m32 < selm(cbind(BMI, LBM)~ Ht + Wt, family="ST", data=ais, opt.method="BFGS")
# Warning message...
slot(m32, "opt.method")$convergence

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.