Weighted Mallows Cp

Description

The Weighted Mallows Cp is evaluated for each submodel.

Usage

wle.cp(formula, data=list(), model=TRUE, x=FALSE, 
       y=FALSE, boot=30, group, var.full=0, num.sol=1, 
       raf="HD", smooth=0.031, tol=10^(-6), 
       equal=10^(-3), max.iter=500, min.weight=0.5, 
       method="full", alpha=2, contrasts=NULL, verbose=FALSE)

Arguments

formula

a symbolic description of the model to be fit. The details of model specification are given below.

data

an optional data frame containing the variables in the model. By default the variables are taken from the environment from which wle.cp is called.

model, x, y

logicals. If TRUE, the corresponding components of the fit (the model frame, the model matrix, the response) are returned.

boot

the number of starting points, based on bootstrap subsamples, to use in the search for the roots.

group

the dimension of the bootstrap subsamples. The default value is max(round(size/4),var), where size is the number of observations and var is the number of variables.
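
As an illustration of the default described above, the following sketch evaluates max(round(size/4),var) for a given sample size and number of variables. The helper name default_group is hypothetical and is not part of the wle package; it only mirrors the documented formula.

```r
# Hypothetical helper (not part of wle): mirrors the documented default
# for `group`, max(round(size/4), var).
default_group <- function(size, var) {
  max(round(size / 4), var)
}

# 65 observations and 4 explanatory variables
default_group(size = 65, var = 4)
```

With few observations relative to the number of variables, the second argument of max() takes over, so the subsample never has fewer observations than variables.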

var.full

the value of the variance to be used in the denominator of the WCP; if 0, the variance estimated from the full model is used.
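
To make the role of the denominator concrete, here is a minimal sketch of the classical (unweighted) Mallows Cp in base R. It is not the weighted statistic computed by wle.cp. The function name mallows_cp is hypothetical, and treating alpha as the multiplier of the number of coefficients is an assumption, chosen because the default alpha=2 then matches the classical 2p penalty.

```r
# Classical (unweighted) Mallows Cp for a submodel; NOT the weighted
# version computed by wle.cp. `mallows_cp` is a hypothetical helper.
# Assumption: `alpha` multiplies the number of coefficients p.
mallows_cp <- function(sub_fit, full_fit, var.full = 0, alpha = 2) {
  n <- length(residuals(full_fit))
  p <- length(coef(sub_fit))
  # If var.full is 0, estimate the error variance from the full model
  sigma2 <- if (var.full == 0) {
    sum(residuals(full_fit)^2) / df.residual(full_fit)
  } else {
    var.full
  }
  sum(residuals(sub_fit)^2) / sigma2 - n + alpha * p
}

set.seed(1)
x <- runif(50)
y <- 1 + 2 * x + rnorm(50, sd = 0.3)
full <- lm(y ~ x + I(x^2))
sub <- lm(y ~ x)

mallows_cp(sub, full)   # Cp of the submodel
mallows_cp(full, full)  # for the full model, Cp equals its number of coefficients
```

When the variance is estimated from the full model, the full model's Cp reduces to its number of coefficients, which gives a quick sanity check on the computation.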

num.sol

maximum number of roots to be searched.

raf

type of residual adjustment function (RAF) to be used:

raf="HD": Hellinger Distance RAF,

raf="NED": Negative Exponential Disparity RAF,

raf="SCHI2": Symmetric Chi-Squared Disparity RAF.

smooth

the value of the smoothing parameter.

tol

the absolute accuracy to be used to achieve convergence of the algorithm.

equal

the absolute difference below which two roots are considered the same. This parameter must be greater than tol.

max.iter

maximum number of iterations.

min.weight

see details.

method

see details.

alpha

penalty value.

contrasts

an optional list. See the contrasts.arg of model.matrix.default.

verbose

if TRUE, warnings are printed.

Details

Models for wle.cp are specified symbolically. A typical model has the form response ~ terms, where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response. A terms specification of the form first+second indicates all the terms in first together with all the terms in second, with duplicates removed. A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second. The specification first*second indicates the cross of first and second. This is the same as first+second+first:second.
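
The formula rules above can be checked with base R alone. The sketch below uses terms() on formulas with placeholder variable names (y, a, b are illustrative only) and compares the expanded term labels; it does not call wle.cp.

```r
# Placeholder variable names; terms() expands a formula symbolically,
# so no data are needed.
labels_star <- attr(terms(y ~ a * b), "term.labels")
labels_long <- attr(terms(y ~ a + b + a:b), "term.labels")

# first*second is the same as first+second+first:second
identical(labels_star, labels_long)

# Duplicated terms are removed
labels_dup <- attr(terms(y ~ a + b + a), "term.labels")
labels_dup
```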

min.weight: the weighted likelihood equation can have more than one solution. Multiple roots appear in particular situations, depending on the level and type of contamination. The presence of multiple roots in the full model makes it unclear which set of weights should be used. Currently, the root with the minimum estimated error scale is selected. Since this is not always the root one would choose, the min.weight parameter restricts the choice to roots that do not downweight everything. This is not yet the optimal solution, and this part may change in a future release.

method: when set to "reduced", this parameter makes the function use weights based on the reduced model. This is strongly discouraged, since the robustness and asymptotic properties of this kind of weighted Cp are not as good as those of the one based on method="full".

Value

wle.cp returns an object of class "wle.cp".

The function summary is used to obtain and print a summary of the results. The generic accessor functions coefficients and residuals extract the coefficients and residuals returned by wle.cp. The object returned by wle.cp contains the following components:

wcp

Weighted Mallows Cp for each submodel.

coefficients

the parameter estimates, one row vector for each root found and each submodel.

scale

an estimate of the error scale, one value for each root found and each submodel.

residuals

the unweighted residuals from the estimated model, one column vector for each root found and each submodel.

tot.weights

the sum of the weights divided by the number of observations, one value for each root found and each submodel.

weights

the weights associated with each observation, one column vector for each root found and each submodel.

freq

the number of starting points converging to the roots.

call

the match.call().

contrasts
xlevels
terms

the model frame.

model

if model=TRUE, a matrix whose first column is the dependent variable and whose remaining columns are the explanatory variables for the full model.

x

if x=TRUE, a matrix with the explanatory variables for the full model.

y

if y=TRUE, a vector with the dependent variable.

info

not yet working reliably; if 0, no error occurred.

Author(s)

Claudio Agostinelli

References

Agostinelli, C., (1999). Robust model selection in regression via weighted likelihood methodology, Working Paper n. 1999.4, Department of Statistics, University of Padova.

Agostinelli, C., (2002). Robust model selection in regression via weighted likelihood methodology, Statistics & Probability Letters, 56, 289-300.

Agostinelli, C., (1998). Inferenza statistica robusta basata sulla funzione di verosimiglianza pesata: alcuni sviluppi, Ph.D Thesis, Department of Statistics, University of Padova.

Agostinelli, C., Markatou, M., (1998). A one-step robust estimator for regression based on the weighted likelihood reweighting scheme, Statistics & Probability Letters, Vol. 37, n. 4, 341-350.

Agostinelli, C., (1998). Verosimiglianza pesata nel modello di regressione lineare, XXXIX Riunione scientifica della Società Italiana di Statistica, Sorrento 1998.

See Also

wle.smooth, an algorithm to choose the smoothing parameter for normal distribution and normal kernel; wle.lm, a function for estimating linear models with normal distribution error and normal kernel.

Examples

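library(wle)

# Simulate 60 regular observations plus 5 extra points with X in [73, 78]
x.data <- c(runif(60,20,80),runif(5,73,78))
e.data <- rnorm(65,0,0.6)
y.data <- 8*log(x.data+1)+e.data
# Shift the last 5 responses down by 4 to contaminate the sample
y.data[61:65] <- y.data[61:65]-4
# Indicator variable marking the contaminated observations
z.data <- c(rep(0,60),rep(1,5))

plot(x.data,y.data,xlab="X",ylab="Y")

# Candidate explanatory variables for the full model
xx.data <- cbind(x.data,x.data^2,x.data^3,log(x.data+1))
colnames(xx.data) <- c("X","X^2","X^3","log(X+1)")

# Weighted Cp for every submodel, searching for up to 2 roots
result <- wle.cp(y.data~xx.data,boot=10,group=10,num.sol=2)

summary(result)

plot(result,num.max=15)

# Same analysis with the contamination indicator added to the model
result <- wle.cp(y.data~xx.data+z.data,boot=10,group=10,num.sol=2)

summary(result)

plot(result,num.max=15)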