Description Usage Arguments Details Value See Also Examples
Performs a Gaussian regression with non-constant variances and outputs an 'lmvar' object.
1 2 3 |
y |
Vector of observations |
X_mu |
Model matrix for the expected values μ |
X_sigma |
Model matrix for the logarithms of the standard deviations σ |
intercept_mu |
Boolean, if TRUE a column with the name '(Intercept)' is added to the matrix |
intercept_sigma |
Boolean, if TRUE a column with the name '(Intercept_s)' is added to the matrix |
sigma_min |
Numeric, the minimum value for the standard deviations sigma. Can be a single number which applies to all observations or a vector containing a separate minimum for each observation. |
slvr_options |
A list with options to be passed on to the function
|
control |
A list with further options. The following options can be set.
|
... |
Additional arguments, not used in the current implementation |
The vector y
must be a numeric vector. It can not contain special values like NULL
, NaN
, etc.
If the matrix X_mu
is not specified, the model for the expected values μ will consist of an intercept term only.
The argument intercept_mu
is ignored in this case. Likewise, the model for \log σ will consist of an
intercept term only if X_sigma
is not specified.
In the latter case, the model reduces to a standard linear model.
Both model matrices must be numeric matrices. They can not contain special values like NULL
, NaN
, etc.
The model matrices can be of class matrix
or of a class from the package Matrix
.
They can also be of class numeric
in case they are matrices with one column only.
All columns in the matrix X_mu
must either have a name, or no column has a name at all. It is not allowed that some
colums have a
name while others don't. The same is true for X_sigma
.
If supplied, the column names
must be unique within X_mu
. The same is true for X_sigma
. A column name can be used in both X_mu
and
X_sigma
though.
In case the matrix X_mu
has no columns with a column name, lmvar
gives the names v1
,
v2
etc. to the columns. Likewise, if the matrix X_sigma
has no columns with a column name, lmvar
gives the names v1_s
, v2_s
etc. to the columns.
Matrix X_mu
can not have a column with the name '(Intercept)'.
Matrix X_sigma
can not have a column with the name '(Intercept_s)'. Both names are reserved names.
The argument sigma_min
allows one to run a regression under the constraint of a minimum standard deviation for each
observation. The argument can be a single number, which applies to all observations, or a vector which contains
a separate
minimum for each observation. In the latter case, all vector elements must be zero
(in which case no constraint is applied) or all vector elements must be larger than zero.
The option of a minimum sigma is intended for situations in which an unconstrained regression attempts to make sigma equal to zero for one or more observations. This will cause extreme values for the β_σ and numerical instabilities.Such a situation can be remedied by bounding sigma from below.
In case sigma_min
has a value (or a vector of values) larger than zero and the solve-method is "NR", the
solve method is set to "BFGS". If the argument constraints
is passed on to maxlik
(as a list member of
slvr_options
), it is ignored.
Error estimates and confidence intervals (e.g. such as given by summary.lmvar
and
predict.lmvar
) can be unreliable
if minimum sigmas are specified.
The function maxLik
from the maxLik
package, is used to maximize the (profile) log-likelihood.
maxLik
returns a
termination code which reports whether a maximum was found, see maxLik
.
For the method "NR", a potential problem is reported by a maxLik
return code different from 1
, 2
or 8
.
For other methods, a code different from 0
flags a potential problem.
In case the return code flags a potential problem, the message from maxLik
is output as a warning.
All list elements in slvr_options
are passed on as arguments to maxLik
. The name of the list element is
the argument name, the value of the list element is the argument value.
It is not allowed to pass on the arguments fn
, grad
or hess
. In case the list does not contain an element
method
, it is set to "NR". In case the list does not contain an element start
, an initial estimate for
β_σ is set by lmvar
.
In case one wants to supply an initial estimate for the coefficients, one has to supply an initial estimate
for β_σ. If beta_sigma_initial
is the initial estimate, one passes on the argument
slvr_options = list(start = beta_sigma_initial)
.
The inital estimate beta_sigma_initial
must be a numeric vector. Its length must be as follows.
In case X_sigma
is not specified or has the value NULL
, the initial estimate must be
a single value.
In case X_sigma
is specified and intercept_sigma = TRUE
: the length must be equal to the number
of columns of X_sigma
plus one. The first element of the vector is the initial estimate of the
intercept term for \log σ, the second element is the inital estimate corresponding to the first
column in X_sigma
, the third element is the inital estimate corresponding to the second
column in X_sigma
, etc.
In case X_sigma
is specified and intercept_sigma = FALSE
: the length must be equal to the
number of columns of X_sigma
. The first element of the vector is the initial estimate corresponding to the
first column in X_sigma
, the second element is the inital estimate corresponding to the second
column in X_sigma
, etc.
There is no need to supply an inital estimate for β_μ, see the vignette 'Math' for details.
When maxLik
exits with return code 3 (and corresponding warning 'Last step could not find a value
above the current. Boundary of parameter space?'), it
somehow did not succeed to fit an 'lmvar' model properly. The same is true if the the Hessian if the log-likelihood is not
negative-definite.
In this situation, a proper fit can sometimes be achieved if one drops a few extra columns (sometimes just one column) from
X_sigma
. See the vignette 'Math' for details. The options
control = list(remove_df_sigma_pre = TRUE, remove_df_sigma_post = TRUE)
) do just that.
They attempt to achieve a proper fit by dropping
columns (i.e., reducing the degrees of freedom of the model for σ) if necessary.
The option remove_df_sigma_pre
inspects the model matrices and the response vector before
carrying out the fit, and drops columns from X_sigma
if necessary. Warning: this is an
experimental option which could cause unexpected issues!
The option remove_df_sigma_post = TRUE
attempts to achieve a proper fit in the following two cases.
maxLik
uses the solve-method
"NR" (the default method) or "BFGSR" and exits with return code 3. Note that this not the case when sigma_min
(or one
of its elements) has been set to a value larger than zero because then the method "BFGS" is used.
The option check_hessian
is TRUE
and the Hessian of the log-likelihood is not negative-definite.
I.e., this option drops columns from X_sigma
based on the results of a failed fit.
When check_hessian = TRUE
, it is checked whether the fitted log-likelihood is at a maximum. A warning will be issued if
that is not the case.
The control options mu_full_rank
and sigma_full_rank
are for efficiency purposes. If set to TRUE, the
corresponding model matrices will not be made full-rank because it is assumed they are full-rank already. However,
setting one of these to TRUE while the corresponding model matrix is not full-rank will cause unpredictable and
possibly unnoticed errors. These options should therefore only be changed from their default value with the
utmost care.
See the vignettes
that come with the lmvar
package for more info. Run vignette(package="lmvar")
to list the available vignettes.
An object of class 'lmvar', which is a list. Users are discouraged to access list-members directly. Instead, list-members are to be accessed with the various accessor and utility functions in the package. Exceptions are the following list members for which no accessor functions exist:
y
the vector of observations
X_mu
the model matrix for μ. In general, it differs from the user-supplied X_mu
because
lmvar
adds an intercept-column (unless intercept_mu
is FALSE
) and makes the matrix full-rank.
X_sigma
the model matrix for \log σ. In general, it differs from the user-supplied
X_sigma
because lmvar
adds an intercept-column (unless intercept_sigma
is FALSE
) and makes
the matrix full-rank.
intercept_mu
boolean which tells whether or not an intercept column (Intercept)
has been added to the
model matrix X_μ
intercept_sigma
boolean which tells whether or not an intercept column (Intercept_s)
has been added to the
model matrix X_σ
sigma_min
the value of the argument sigma_min
in the call of lmvar
slvr_options
the value of the argument slvr_options
in the call of lmvar
control
the value of the argument control
in the call of lmvar
slvr_log
the output of maxLik
(the solver routine used to maximize the likelihood). Included only
if the argument slvr_log
has the value TRUE
.
See maxLik
for details about this output.
lmvar_no_fit
to create an object which is like an 'lmvar' object without carrying out a
model fit.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 | # As example we use the dataset 'attenu' from the library 'datasets'. The dataset contains
# the response variable 'accel' and two explanatory variables 'mag' and 'dist'.
library(datasets)
# For more info on the data, study the dataset
help("attenu")
# Create the model matrix for the expected values
X = cbind(attenu$mag, attenu$dist)
colnames(X) = c("mag", "dist")
# Create the model matrix for the standard deviations. The standard deviation
# is large for small distances and small for large distances. The use of 'dist'
# as explanatory variable makes the beta for the intercept term blow up.
# Therefore we use '1 / dist' as explanatory variable
X_s = cbind(attenu$mag, 1 / attenu$dist)
colnames(X_s) = c("mag", "dist_inv")
# Carry out the fit
fit = lmvar(attenu$accel, X, X_s)
# Inspect the results
summary(fit)
# Carry out the fit with an inital estimate for beta and ask for
# a report of the solver-routine
beta_sigma_start = c(-4, 0, 0)
fit = lmvar(attenu$accel, X, X_s,
slvr_options = list(start = beta_sigma_start),
control = list(slvr_log = TRUE))
fit$slvr_log
|
attenu package:datasets R Documentation
_T_h_e _J_o_y_n_e_r-_B_o_o_r_e _A_t_t_e_n_u_a_t_i_o_n _D_a_t_a
_D_e_s_c_r_i_p_t_i_o_n:
This data gives peak accelerations measured at various observation
stations for 23 earthquakes in California. The data have been
used by various workers to estimate the attenuating affect of
distance on ground acceleration.
_U_s_a_g_e:
attenu
_F_o_r_m_a_t:
A data frame with 182 observations on 5 variables.
[,1] event numeric Event Number
[,2] mag numeric Moment Magnitude
[,3] station factor Station Number
[,4] dist numeric Station-hypocenter distance (km)
[,5] accel numeric Peak acceleration (g)
_S_o_u_r_c_e:
Joyner, W.B., D.M. Boore and R.D. Porcella (1981). Peak
horizontal acceleration and velocity from strong-motion records
including records from the 1979 Imperial Valley, California
earthquake. USGS Open File report 81-365. Menlo Park, Ca.
_R_e_f_e_r_e_n_c_e_s:
Boore, D. M. and Joyner, W.B.(1982) The empirical prediction of
ground motion, _Bull. Seism. Soc. Am._, *72*, S269-S268.
Bolt, B. A. and Abrahamson, N. A. (1982) New attenuation relations
for peak and expected accelerations of strong ground motion,
_Bull. Seism. Soc. Am._, *72*, 2307-2321.
Bolt B. A. and Abrahamson, N. A. (1983) Reply to W. B. Joyner & D.
M. Boore's "Comments on: New attenuation relations for peak and
expected accelerations for peak and expected accelerations of
strong ground motion", _Bull. Seism. Soc. Am._, *73*, 1481-1483.
Brillinger, D. R. and Preisler, H. K. (1984) An exploratory
analysis of the Joyner-Boore attenuation data, _Bull. Seism. Soc.
Am._, *74*, 1441-1449.
Brillinger, D. R. and Preisler, H. K. (1984) _Further analysis of
the Joyner-Boore attenuation data_. Manuscript.
_E_x_a_m_p_l_e_s:
require(graphics)
## check the data class of the variables
sapply(attenu, data.class)
summary(attenu)
pairs(attenu, main = "attenu data")
coplot(accel ~ dist | as.factor(event), data = attenu, show.given = FALSE)
coplot(log(accel) ~ log(dist) | as.factor(event),
data = attenu, panel = panel.smooth, show.given = FALSE)
Call:
lmvar(y = attenu$accel, X_mu = X, X_sigma = X_s)
Number of observations: 182
Degrees of freedom : 6
Z-scores:
Min 1Q Median 3Q Max
-1.4679 -0.6615 -0.1132 0.5736 2.9969
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.14518878 0.06367414 -2.2802 0.0226 *
mag 0.05436925 0.01137133 4.7813 1.742e-06 ***
dist -0.00129047 0.00014701 -8.7779 < 2.2e-16 ***
(Intercept_s) -4.33049617 0.44747494 -9.6776 < 2.2e-16 ***
mag_s 0.28918112 0.07286834 3.9685 7.231e-05 ***
dist_inv_s 3.14861255 0.23985317 13.1273 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Standard deviations:
Min 1Q Median 3Q Max
0.0634 0.0775 0.0920 0.1095 46.8245
Comparison to model with constant variance (i.e. classical linear model)
Log likelihood-ratio: 31.38197
Additional degrees of freedom: 2
p-value for difference in deviance: 2.35e-14
Maximum Likelihood estimation
Newton-Raphson maximisation, 10 iterations
Return code 1: gradient close to zero
Log-Likelihood: 154.7313 (3 free parameter(s))
Estimate(s): -4.330496 0.2891811 3.148613
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.