Description Usage Arguments Details Construction of custom pseudodata representations Note Author(s) References See Also Examples
Get, test and set the functions that calculate the additive modifications to the responses and totals in binomialresponse GLMs, for the application of biasreduction either via modified scores or via maximum penalized likelihood (where penalization is by Jeffreys invariant prior).
1  modifications(family, pl = FALSE)

family 
a family object of the form 
pl 
logical determining whether the function returned corresponds
to modifications for the penalized maximum likelihood approach or for
the modifiedscores approach to biasreduction. Default value is

The function returned from modifications
accepts the argument p
which are the binomial probabilities and returns a list with
components ar
and at
, which are the linkdependent parts
of the additive modifications to the actual responses and totals,
respectively.
Since the resultant function is used in brglm.fit
, for
efficiency reasons no check is made for p >= 0  p <= 1
, for
length(at) == length(p)
and for length(ap) == length(p)
.
If y* are the pseudoresponses (pseudocounts) and m* are the pseudototals then we call the pair (y*, m*) a pseudodata representation. Both the modifiedscores approach and the maximum penalized likelihood have a common property:
there exists (y*, m*) such that if we replace the actual data (y, m) with (y*, m*) in the expression for the ordinary scores (first derivatives of the likelihood) of a binomialresponse GLM, then we endup either with the modifiedscores or with the derivatives of the penalized likelihood (see Kosmidis, 2007, Chapter 5).
Let μ be the mean of the binomial response y (i.e. μ = m p, where p is the binomial probability corresponding to the count y). Also, let d and d' denote the first and the second derivatives, respectively, of μ with respect to the linear predictor η of the model. All the above are viewed as functions of p. The pseudodata representations have the generic form
pseudoresponse :  y* = y + h a_r(p) 
pseudototals :  m* = m + h a_t(p), 
where h is the leverage corresponding to y. The general expressions for a_r(p) ("r" for "response") and a_t(p) ("t" for "totals") are:
modifiedscores approach
a_r(p) = d'(p)/(2w(p)) 
a_t(p) = 0, 
maximum penalized likelihood approach
a_r(p) = d'(p)/w(p) + p  0.5 
a_t(p) = 0. 
For supplying (y*, m*) in glm.fit
(as is
done by brglm.fit
), an essential requirement for the
pseudodata representation is that it should mimic the behaviour of the
original responses and totals, i.e. 0 ≤ y*
≤ m*. Since h \in [0, 1], the requirement translates to
0 ≤ a_r(p) ≤ a_t(p) for every p \in (0, 1). However,
the above definitions of a_r(p) and a_t(p) do not
necessarily respect this requirement.
On the other hand, the pair (a_r(p), a_t(p)) is not unique in the sense that for a given link function and once the linkspecific structure of the pair has been extrapolated, there is a class of equivalent pairs that can be resulted following only the following two rules:
add and subtract the same quantity from either a_r(p) or a_t(p).
if a quantity is to be moved from a_r(p) to a_t(p) it first has to be divided by p.
For example, in the case of penalized maximum likelihood, the pairs (d'(p)/w(p) + p  0.5 , 0) and (d'(p)/w(p) + p , 0.5/p) are equivalent, in the sense that if the corresponding pseudodata representations are substituted in the ordinary scores both return the same expression.
So, in order to construct a pseudodata representation that corresponds to a userspecified link function and has the property 0 ≤ a_r(p) ≤ a_t(p) for every p \in (0, 1), one merely has to pursue a simple algebraic calculation on the initial pair (a_r(p), a_t(p)) using only the two aforementioned rules until an appropriate pair is resulted. There is always a pair!
Once the pair has been found the following steps should be followed.
For a userspecified link function the user has to write a
modification function with name "br.custom.family" or
"pml.custom.family" for pl=FALSE
or pl=TRUE
,
respectively. The function should take as argument the
probabilities p
and return a list of two vectors with
same length as p
and with names
c("ar", "at")
. The result corresponds to the pair
(a_r(p), a_t(p)).
Check if the custommade modifications function is
appropriate. This can be done via the function
checkModifications
which has arguments
fun
(the function to be tested) and Length
with
default value Length=100
. Length
is to be used
when the userspecified link function takes as argument a
vector of values (e.g. the logexp
link in
?family
). Then the value of Length
should be the
length of that vector.
Put the function in the search patch so that
modifications
can find it.
brglm
can now be used with the custom family as
glm
would be used.
The user could also deviate from modifiedscores and maximum penalized
likelihood and experiment with implemented (or not) links, e.g. probit
,
constructing his own pseudodata representations of the aforementioned
general form. This could be done by changing the link name, e.g. by
probitt < make.link(probit) ;
probitt$name < "probitt"
and then setting a custom br.custom.family
that does
not necessarily depend on the probit
link. Then, brglm
could be used with pl=FALSE
.
A further generalization would be to completely remove the hat value
h in the generic expression of the pseudodata representation
and have general additive modifications that depend on p. To do
this divide both ar
and at
by
pmax(get("hats",parent.frame()),.Machine\$double.eps)
within the
custom modification function (see also Examples).
Ioannis Kosmidis, [email protected]
Kosmidis, I. (2007). Bias reduction in exponential family nonlinear models. PhD Thesis, Department of Statistics, University of Warwick.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84  ## Begin Example 1
## logistic exposure model, following the Example in ?family. See,
## Shaffer, T. 2004. Auk 121(2): 526540.
# Definition of the link function
logexp < function(days = 1) {
linkfun < function(mu) qlogis(mu^(1/days))
linkinv < function(eta) plogis(eta)^days
mu.eta < function(eta) days * plogis(eta)^(days1) *
binomial()$mu.eta(eta)
valideta < function(eta) TRUE
link < paste("logexp(", days, ")", sep="")
structure(list(linkfun = linkfun, linkinv = linkinv,
mu.eta = mu.eta, valideta = valideta, name = link),
class = "linkglm")
}
# Here d(p) = days * p * ( 1  p^(1/days) )
# d'(p) = (days  (days+1) * p^(1/days)) * d(p)
# w(p) = days^2 * p * (1p^(1/days))^2 / (1p)
# Initial modifications, as given from the general expressions above:
br.custom.family < function(p) {
etas < binomial(logexp(.days))$linkfun(p)
# the link function argument `.days' will be detected by lexical
# scoping. So, make sure that the linkfunction inputted arguments
# have unusual names, like `.days' and that
# the link function enters `brglm' as
# `family=binomial(logexp(.days))'.
list(ar = 0.5*(1p)0.5*(1p)*exp(etas)/.days,
at = 0*p/p) # so that to fix the length of at
}
.days <3
# `.days' could be a vector as well but then it should have the same
# length as the number of observations (`length(.days)' should be
# equal to `length(p)'). In this case, `checkModifications' should
# have argument `Length=length(.days)'.
#
# Check:
## Not run: checkModifications(br.custom.family)
# OOOPS error message... the condition is not satisfied
#
# After some trivial algebra using the two allowed operations, we
# get new modifications:
br.custom.family < function(p) {
etas < binomial(logexp(.days))$linkfun(p)
list(ar=0.5*p/p, # so that to fix the length of ar
at=0.5+exp(etas)*(1p)/(2*p*.days))
}
# Check:
checkModifications(br.custom.family)
# It is OK.
# Now,
modifications(binomial(logexp(.days)))
# works.
# Notice that for `.days < 1', `logexp(.days)' is the `logit' link
# model and `a_r=0.5', `a_t=1'.
# In action:
library(MASS)
example(birthwt)
m.glm < glm(formula = low ~ ., family = binomial, data = bwt)
.days < bwt$age
m.glm.logexp < update(m.glm,family=binomial(logexp(.days)))
m.brglm.logexp < brglm(formula = low ~ ., family =
binomial(logexp(.days)), data = bwt)
# The fit for the `logexp' link via maximum likelihood
m.glm.logexp
# and the fit for the `logexp' link via modified scores
m.brglm.logexp
## End Example
## Begin Example 2
## Another possible use of brglm.fit:
## Deviating from bias reducing modifiedscores:
## Add 1/2 to the response of a probit model.
y < c(1,2,3,4)
totals < c(5,5,5,5)
x1 < c(1,0,1,0)
x2 < c(1,1,0,0)
my.probit < make.link("probit")
my.probit$name < "my.probit"
br.custom.family < function(p) {
h < pmax(get("hats",parent.frame()),.Machine$double.eps)
list(ar=0.5/h,at=1/h)
}
m1 < brglm(y/totals~x1+x2,weights=totals,family=binomial(my.probit))
m2 < glm((y+0.5)/(totals+1)~x1+x2,weights=totals+1,family=binomial(probit))
# m1 and m2 should be the same.

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.