drop.gvf.points | R Documentation |
This function drops observations (alleged outliers) from a fitted GVF model and simultaneously re-fits the model.
drop.gvf.points(x, method = c("pick", "cut"), which.plot = 1:2,
                res.type = c("standard", "student"), res.cut = 3,
                id.n = 3, labels.id = NULL,
                cex.id = 0.75, label.pos = c(4, 2),
                cex.caption = 1, col = NULL, drop.col = "red",
                ...)
x |
An object containing a single fitted GVF model (i.e. of class gvf.fit or gvf.fit.gr). |
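method |
How the observations to be dropped are identified: "pick" (interactively, by clicking on points of a plot) or "cut" (by a threshold on the absolute value of the residuals). See ‘Details’. |
which.plot |
Which plot(s) to use: 1 for ‘Observed vs Fitted’, 2 for ‘Residuals vs Fitted’. See ‘Details’. |
res.type |
Kind of residuals to be used: "standard" for standardized residuals, "student" for studentized residuals. See ‘Details’. |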
res.cut |
A positive value: observations to be dropped will be those with residuals whose absolute
value exceeds res.cut. |
id.n |
Number of points to be initially labelled in each plot, starting with the most extreme.
Only meaningful if method = "pick". |
labels.id |
Vector of labels, from which the labels for extreme points will be chosen. |
cex.id |
Magnification of point labels. |
label.pos |
Positioning of labels, for the left half and right half of the graph(s) respectively. |
cex.caption |
Controls the size of the plot caption. |
col |
Color to be used for the points in the plot(s). |
drop.col |
Color to be used to visualize and annotate the points to be dropped in the plot(s). |
... |
Other parameters to be passed through to plotting functions. |
This function drops observations (alleged outliers) from a single fitted GVF model and simultaneously re-fits the model.
As a side effect, the function prints on screen the induced change in selected quality measures (see, e.g., getR2).
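The kind of before/after comparison mentioned above can be illustrated with a minimal base-R sketch. It uses an ordinary lm fit on made-up data and the plain R-squared from summary(); it is not the ReGenesees code, and the object names (d, before, after, change) are purely illustrative.

```r
# Toy data with one planted outlier (illustrative only, not GVF data).
set.seed(42)
d <- data.frame(x = 1:25)
d$y <- 1 + 2 * d$x + rnorm(25)
d$y[5] <- d$y[5] + 30

# Fit before and after dropping the planted outlier.
before <- lm(y ~ x, data = d)
after  <- lm(y ~ x, data = d[-5, ])

# Induced change in one quality measure (here: R-squared).
change <- c(R2.before = summary(before)$r.squared,
            R2.after  = summary(after)$r.squared)
print(change)
print(change[["R2.after"]] - change[["R2.before"]])
```

With a real fitted GVF model, the corresponding measures would come from the package's own accessors such as getR2.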
If method = "pick", observations to be dropped are identified interactively by clicking on points of a plot (see ‘Note’).
Argument which.plot determines the nature of the plot: value 1 is for ‘Observed vs Fitted’,
value 2 is for ‘Residuals vs Fitted’. In the latter case, argument res.type specifies which
kind of residuals are plotted. Argument id.n specifies how many points are labelled
initially, starting with the most extreme in terms of the selected residuals; this applies to both kinds of plots.
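As a rough illustration of what "the most extreme in terms of the selected residuals" means, the following base-R sketch ranks the points of an ordinary lm fit by absolute residual and returns the first id.n of them. The helper most_extreme is hypothetical and is not part of ReGenesees; the built-in cars data stand in for real estimates and errors.

```r
# Hypothetical helper: indices of the id.n points with the largest
# absolute residuals, most extreme first (works on an lm fit).
most_extreme <- function(fit, id.n = 3, res.type = c("standard", "student")) {
  res.type <- match.arg(res.type)
  # rstandard(): standardized residuals; rstudent(): studentized residuals
  res <- if (res.type == "standard") rstandard(fit) else rstudent(fit)
  head(order(abs(res), decreasing = TRUE), id.n)
}

fit <- lm(dist ~ speed, data = cars)
idx <- most_extreme(fit, id.n = 3)
print(idx)
print(rstandard(fit)[idx])
```

Raising id.n simply extends this ranked list, which is why a larger value gives more labelled points to guide interactive choices.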
If method = "cut", observations to be dropped are those with residuals whose absolute value exceeds the
value of argument res.cut. Again, argument res.type specifies which kind of residuals are used
(and plotted). The points that have been cut are highlighted on a plot, whose nature is again specified by
argument which.plot. If which.plot = 1:2, dropped points are visualized on both the
‘Observed vs Fitted’ and the ‘Residuals vs Fitted’ graphs simultaneously.
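The thresholding rule described above can be sketched in base R on an ordinary lm fit. This is only an illustration of the idea under simple assumptions (complete data, rows aligned with residuals); the helper cut_and_refit and the toy data are hypothetical and do not reproduce the ReGenesees implementation.

```r
# Hypothetical helper: drop rows whose absolute residual exceeds res.cut,
# then re-fit the same formula on the remaining rows.
# Assumes 'data' has no missing values, so rows align with residuals.
cut_and_refit <- function(fit, data, res.type = c("standard", "student"),
                          res.cut = 3) {
  res.type <- match.arg(res.type)
  res <- if (res.type == "standard") rstandard(fit) else rstudent(fit)
  drop <- unname(which(abs(res) > res.cut))
  # Guard against the empty case: data[-integer(0), ] would drop every row.
  kept <- if (length(drop) > 0) data[-drop, , drop = FALSE] else data
  list(dropped = drop, fit = lm(formula(fit), data = kept))
}

# Toy data with one planted gross outlier at row 10.
set.seed(1)
toy <- data.frame(x = 1:30)
toy$y <- 2 + 0.5 * toy$x + rnorm(30, sd = 0.2)
toy$y[10] <- toy$y[10] + 5

fit0 <- lm(y ~ x, data = toy)
out <- cut_and_refit(fit0, toy, res.type = "student", res.cut = 3)
print(out$dropped)
print(nobs(out$fit))
```

Because the re-fitted model is again an lm object, the helper can be applied to its own output, mirroring the way an already "cut" GVF model can be cut again.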
Argument drop.col controls the color used to visualize and annotate, in the plot(s), the points to be
dropped. All the other arguments have the same meaning as in function plot.lm.
An object of the same class as x (i.e. either gvf.fit or gvf.fit.gr), containing the original GVF model
re-fitted after dropping (alleged) outliers.
For method = "pick", function drop.gvf.points is only supported on those screen devices for which
function identify is supported. The identification process can be terminated either by right-clicking the mouse
and selecting 'Stop' from the menu, or from the 'Stop' menu on the graphics window.
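Scripts that may run in batch mode can guard the interactive method along the following lines. This is a sketch under an assumption: it treats an interactive session with an interactive graphics device as a reasonable proxy for identify being usable, which is not guaranteed on every device. The helper can_pick is hypothetical.

```r
# Hypothetical helper: TRUE only in an interactive R session whose current
# (or default) graphics device is an interactive screen device.
can_pick <- function() {
  interactive() && grDevices::dev.interactive(orNone = TRUE)
}

if (can_pick()) {
  # Here one could call, e.g., drop.gvf.points(m) with 'm' a fitted GVF model.
  message("Interactive picking should be possible on this device.")
} else {
  message("No interactive device: consider method = \"cut\" instead.")
}
```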
Diego Zardetto
GVF.db to manage the ReGenesees archive of registered GVF models, gvf.input and svystat
to prepare the input for GVF model fitting, fit.gvf to fit GVF models, plot.gvf.fit
to get diagnostic plots for fitted GVF models, and predictCV to predict CV values via fitted GVF models.
# Load example data:
data(AF.gvf)
# Inspect available estimates and errors of counts:
str(ee.AF)
# List available registered GVF models:
GVF.db
# Fit example data to registered GVF model number one:
m <- fit.gvf(ee.AF, model=1)
m
summary(m)
##############################################################
# Method 'pick': identify outlier observations to be dropped #
# interactively by clicking on points of a plot. #
##############################################################
# Using the 'Observed vs Fitted' plot (the default):
## Not run:
m1 <- drop.gvf.points(m)
m1
summary(m1)
## End(Not run)
# Using the 'Residuals vs Fitted' plot with standardized
# residuals (the default) and increasing id.n to get more
# labelled points to guide your choices:
## Not run:
m1 <- drop.gvf.points(m, which.plot = 2, id.n = 10)
m1
summary(m1)
## End(Not run)
# The same as above, but with studentized residuals and
# playing with colors:
## Not run:
m1 <- drop.gvf.points(m, which.plot = 2, id.n = 10, res.type = "student",
                      col = "blue", drop.col = "green", pch = 20)
m1
summary(m1)
## End(Not run)
#############################################################
# Method 'cut': identify outlier observations to be dropped #
# by specifying a threshold for the absolute values of the #
# residuals. #
#############################################################
# Using default threshold on standardized residuals and visualizing
# dropped observations on both 'Observed vs Fitted' and 'Residuals
# vs Fitted' plots:
m1 <- drop.gvf.points(m, method = "cut")
m1
summary(m1)
# Using a custom threshold on studentized residuals and visualizing
# dropped observations on the 'Observed vs Fitted' plot:
m1 <- drop.gvf.points(m, method = "cut", res.type = "student",
                      res.cut = 2.5, which.plot = 1)
m1
summary(m1)
# The same as above, but visualizing dropped observations on the
# 'Residuals vs Fitted' plot:
m1 <- drop.gvf.points(m, method = "cut", res.type = "student",
                      res.cut = 2.5, which.plot = 2)
m1
summary(m1)
# You can obviously "cut"/"pick" alleged outliers again from an already
# "cut"/"picked" fitted GVF model:
m2 <- drop.gvf.points(m1, method = "cut", res.type = "student",
                      res.cut = 2.5, col = "blue", pch = 20)
m2
summary(m2)
#################################################################
# Identifying outlier observations to be dropped from "grouped" #
# GVF fitted models (i.e. x has class 'gvf.fit.gr'). #
#################################################################
# Recall we have at our disposal the following survey design object
# defined on household data:
exdes
# Now use function svystat to prepare "grouped" estimates and errors
# of counts to be fitted separately (here groups are regions):
ee <- svystat(exdes, y=~ind, by=~age5c:marstat:sex, combo=3, group=~regcod)
ee
plot(ee)
# Fit registered GVF model number one separately inside groups:
m <- fit.gvf(ee, model=1)
m
summary(m)
# Now drop alleged outliers separately inside groups:
#####################################################
# Method 'pick': work interactively group by group. #
#####################################################
## Not run:
m1 <- drop.gvf.points(m, which.plot = 2, res.type = "student", col = "blue",
                      pch = 20)
m1
summary(m1)
## End(Not run)
#########################################################
# Method 'cut': apply the same threshold to all groups. #
#########################################################
m1 <- drop.gvf.points(m, method = "cut", res.type = "student", res.cut = 2)
m1
summary(m1)