Fitting Linear Models with Hyperbolic Errors
Description
Fits linear models with hyperbolic errors. Can be used to carry out linear regression for data exhibiting heavy tails and skewness. Displays the histogram, loghistogram (both with fitted error distribution), QQ plot and residuals vs. fitted values plot for the fitted linear model.
Usage
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24  hyperblm(formula, data, subset, weights, na.action,
xx = FALSE, y = FALSE, contrasts = NULL,
offset, method = "NelderMead",
startMethod = "NelderMead", startStarts = "BN",
paramStart = NULL,
maxiter = 100, tolerance = 0.0001,
controlBFGS = list(maxit = 1000),
controlNM = list(maxit = 10000),
maxitNLM = 10000,
controlCO = list(), silent = TRUE, ...)
## S3 method for class 'hyperblm'
print(x, digits = max(3, getOption("digits")3), ...)
## S3 method for class 'hyperblm'
coef(object, ...)
## S3 method for class 'hyperblm'
plot(x, breaks = "FD",
plotTitles = c("Residuals vs Fitted Values",
"Histogram of residuals",
"LogHistogram of residuals",
"QQ Plot"),
...)

Arguments
formula 
an object of class 
data 
an optional data frame, list or environment (or object
coercible by 
subset 
an optional vector specifying a subset of observations to be used in the fitting process. 
weights 
an optional vector of weights to be used in the fitting
process. Should be 
na.action 
A function which indicates what should happen
when the data contain 
xx, y 
Logicals. If 
contrasts 
An optional list. See the 
offset 
An optional vector. See Details. 
method 
Character. Possible values are 
startMethod 
Character. Possible values are 
startStarts 
Character. Possible values are 
paramStart 
An optional vector. A vector of parameter start values for the optimization routine. See Details. 
maxiter 
Numeric. The maximum number of twostage optimization alternating iterations. See Details. 
tolerance 
Numeric. The twostage optimization convergence ratio. See Details. 
controlBFGS, controlNM 
Lists. Lists of control parameters for

maxitNLM 
Numeric. The maximum number of iterations for the NLM optimizer. 
controlCO 
List. A list of control parameters for

silent 
Logical. If 
x 
An object of class 
object 
An object of class 
breaks 
May be a vector, a single number or a character
string. See 
plotTitles 
Titles to appear above the plots. 
digits 
Numeric. Desired number of digits when the object is printed. 
... 
Passes additional arguments to function

Details
Models for hyperblm
are specified symbolically. A typical
model has the form response ~ terms
where response
is
the (numeric) response vector and terms
is a series of terms
which specifies a linear predictor for response
. A terms
specification of the form first + second
indicates all the
terms in first
together with all the terms in second
with duplicates removed. A specification of the form
first:second
indicates the set of terms obtained by taking the
interactions of all terms in first
with all terms in
second
. The specification first*second
indicates the
cross of first
and second
. This is the same as
first + second + first:second
.
If the formula includes an offset
, this is evaluated and
subtracted from the response.
If response
is a matrix a linear model is fitted separately by
leastsquares to each column of the matrix.
See model.matrix
for some further details. The terms in
the formula will be reordered so that main effects come first,
followed by the interactions, all secondorder, all thirdorder and so
on.
A formula has an implied intercept term. To remove this use either
y ~ x  1
or y ~ 0 + x
. See formula
for
more details of allowed formulae.
NonNULL
weights
can be used to indicate that different
observations have different variances (with the values in
weights
being inversely proportional to the variances); or
equivalently, when the elements of weights
are positive
integers w_i, that each response y_i is the mean of
w_i unitweight observations (including the case that there are
w_i observations equal to y_i and the data have been
summarized).
hyperblm
calls the lower level function
hyperblmFit
for the actual numerical computations.
All of weights
, subset
and offset
are evaluated
in the same way as variables in formula
, that is first in
data
and then in the environment of formula
.
hyperblmFit
uses a twostage alternating optimization
routine. The quality of parameter start values (especially the error
distribution parameters) is crucial to the routine's convergence. The
user can specify the start values via the paramStart
argument,
otherwise the function finds reliable start values by calling the
hyperbFitStand
function.
startMethod
in the argument list is the optimization method for
function hyperbFitStandStart
which finds the start
values for function hyperbFitStand
. It is set to
"NelderMead"
by default due to the robustness of this
optimizer. The "BFGS"
method is also implemented as it is
relatively fast to converge. Since "BFGS"
method is a
quasiNewton method it will not as robust and for some data will not
achieve convergence.
startStarts
is the method used to find the start values for function
hyperbFitStandStart
which includes:
"BN"
A method from BarndorffNielsen (1977) based on estimates of psi and gamma the absolute slopes of the left and right asymptotes to the log density function"FN"
Based on a fitted normal distribution as it is a limit of the hyperbolic distribution"SL"
Based on a fitted skewLaplace distribution for which the log density has the form of two straight line with absolute slopes 1/alpha, 1/beta"MoM"
A method of moment approach"US"
User specified
method
is the method used in stage one of the twostage
alternating optimization routine. As the startMethod
, it is set
to "NelderMead"
by default. Besides "BFGS"
,"nlm"
is also implemented as a alternative. Since BFGS
method is a
quasiNewton method it will not as robust and for some data will not
achieve convergence.
If the maximum of the ratio the change of the individual coefficients
is smaller than tolerance
then the routine assumes convergence,
otherwise if the alternating iteration number exceeds maxiter
with the maximum of the ratio the change of the individual
coefficients larger than tolerance
, the routine is considered
not to have converged.
Value
hyperblm
returns an object of class "hyperblm"
which is a list
containing:
coefficients 
A named vector of regression coefficients. 
distributionParams 
A named vector of fitted hyperbolic error distribution parameters. 
fitted.values 
The fitted values from the model. 
residuals 
The remainder after subtracting fitted values from response. 
mle 
The maximum likelihood value of the model. 
method 
The optimization method for stage one. 
paramStart 
The start values of parameters that the user specified (only where relevant). 
residsParamStart 
The start values of parameters obtained by

call 
The matched call. 
terms 
The 
contrasts 
The contrasts used (only where relevant). 
xlevels 
The levels of the factors used in the fitting (only where relevant). 
offset 
The offset used (only where relevant) 
xNames 
The names of each explanatory variables. If explanatory
variables don't have names then they will be named 
yVec 
The response vector. 
xMatrix 
The explanatory variables matrix. 
iterations 
Number of twostage alternating iterations to convergence. 
convergence 
The convergence code for two stage optimization: 0 is the system converged, 1 is first stage does not converge, 2 is second stage does not converge, 3 is the both stages do not converge. 
breaks 
The cell boundaries found by a call the

Author(s)
David Scott d.scott@auckland.ac.nz, Xinxing Li xli053@aucklanduni.ac.nz
References
BarndorffNielsen, O. (1977) Exponentially decreasing distributions for the logarithm of particle size, Proc. Roy. Soc. Lond., A353, 401–419.
Prause, K. (1999). The generalized hyperbolic models: Estimation, financial derivatives and risk measurement. PhD Thesis, Mathematics Faculty, University of Freiburg.
Trendall, Richard (2005). hypReg: A Function for Fitting a Linear Regression Model in R with Hyperbolic Error. Masters Thesis, Statistics Faculty, University of Auckland.
Paolella, Marc S. (2007). Intermediate Probability: A Computational Approach. pp. 415 Chichester: Wiley.
Scott, David J. and W<fc>rtz, Diethelm and Chalabi, Yohan, (2011). Fitting the Hyperbolic Distribution with R: A Case Study of Optimization Techniques. In preparation.
Stryhn, H. and Christensen, J. (2003). Confidence intervals by the profile likelihood method, with applications in veterinary epidemiology. ISVEE X.
See Also
print.hyperblm
prints the regression result in a table.
coef.hyperblm
obtains the regression coefficients and
error distribution parameters of the fitted model.
summary.hyperblm
obtains a summary output of class
hyperblm
object.
print.summary.hyperblm
prints the summary output in a
table.
plot.hyperblm
obtains a residual vs fitted value plot, a
histgram of residuals with error distribution density curve on top, a
histgram of log residuals with error distribution error density curve
on top and a QQ plot.
hyperblmFit
, optim
, nlm
,
constrOptim
, hist
,
hyperbFitStand
, hyperbFitStandStart
.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13  ## stackloss data example
airflow < stackloss[, 1]
temperature < stackloss[, 2]
acid < stackloss[, 3]
stack < stackloss[, 4]
hyperblm.fit < hyperblm(stack ~ airflow + temperature + acid,
tolerance = 1e11)
coef.hyperblm(hyperblm.fit)
plot.hyperblm(hyperblm.fit, breaks = 20)
summary.hyperblm(hyperblm.fit, hessian = FALSE)
