simfast | R Documentation |
Fitting isotonic generalized single-index regression models via maximum likelihood, with support for estimating response values with predict and plotting values with plot. Also includes support for formula objects, data frames, and built-in regression families (see Arguments).
simfast(
  formula,
  data,
  intercept = FALSE,
  weights = NULL,
  offset = NULL,
  family = "gaussian",
  returnmodel = TRUE,
  returndata = TRUE,
  method = "stochastic",
  multiout = FALSE,
  B = 10000,
  k = 100,
  kappa0 = 100,
  tol = 1e-10,
  max.iter = 20
)
formula |
an object of class |
data |
optional data frame (or object coercible to a data frame by
|
intercept |
logical value, if |
weights |
optional vector of positive integer weights, with length
|
offset |
numeric vector of model offsets, with length
|
family |
a choice of the error distribution and link function to
be used in the model. This can be a character string naming a family
function, a family function or the result of a call to a family function.
Currently supporting any of |
returnmodel |
logical value that when |
returndata |
logical value that when |
method |
when |
multiout |
logical value, if |
B |
positive integer, sets number of index vectors to try when maximizing the likelihood |
k |
positive integer, algorithmic parameter, more info coming, should be less
than |
kappa0 |
positive integer, initial value of kappa, more info coming |
tol |
numeric, sets tolerance for convergence for |
max.iter |
positive integer limiting number of iterations for
|
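The three accepted forms of family (a character string, a family function, or the result of a call to a family function) follow the convention used by glm(). As a minimal sketch of how such an argument is usually normalized in base R, the helper below reproduces the glm() idiom. resolve_family is a hypothetical name used only for illustration; it is not part of simfast, and whether simfast resolves family in exactly this way is an assumption.

```r
# Hypothetical helper (not part of simfast): normalize a `family` argument
# using the same idiom as stats::glm().
resolve_family <- function(family) {
  # a character string such as "poisson" names a family function
  if (is.character(family)) {
    family <- get(family, mode = "function", envir = parent.frame())
  }
  # a family function such as poisson is called with its default link
  if (is.function(family)) {
    family <- family()
  }
  # anything else must already be a family object
  if (!inherits(family, "family")) {
    stop("'family' not recognized")
  }
  family
}

fam_chr  <- resolve_family("poisson")
fam_fun  <- resolve_family(poisson)
fam_call <- resolve_family(poisson(link = "log"))

# all three forms describe the same distribution and link
c(fam_chr$family, fam_chr$link)
identical(fam_fun$link, fam_call$link)
```

Passing the result of a call, as in poisson(link = 'log'), is the only form that lets the link be chosen explicitly; the other two forms fall back to the default link of the family.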
For i = 1, ..., n, let X_i be the d-dimensional covariate vector and Y_i the corresponding one-dimensional response. The isotonic single index model is written as
g(mu) = f(a^T x),
where mu = E(Y | X = x) is the conditional mean of the response, x = (x_1, ..., x_d)^T, g is a known link function, a is a d x 1 index vector, and f is a nondecreasing function. The algorithm finds the maximum likelihood estimate of both f and a under the constraint that f is nondecreasing. Implementation details can be found in ADD REFs, where theoretical justification of our estimator (i.e. uniform consistency) is also given. For the identifiability of isotonic single index models, we refer to REFs.
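To make the estimation problem concrete, the following sketch treats the Gaussian case with identity link. There, for a fixed index vector a, the maximum likelihood estimate of f is the isotonic least-squares fit of the responses ordered by a^T x, so a candidate a can be scored by its residual sum of squares. The code simulates data, profiles out f with stats::isoreg(), and then tries B random unit vectors, keeping the best one. This is an illustration in base R under those assumptions, not the algorithm implemented in simfast; profile_rss, the simulated data, and the value B = 2000 are hypothetical choices made for the example.

```r
set.seed(42)
n <- 200
d <- 3
x <- matrix(rnorm(n * d), n, d)
a_true <- c(1, 2, 2) / 3           # unit-length index vector
f_true <- function(t) t^3          # nondecreasing ridge function
y <- f_true(drop(x %*% a_true)) + rnorm(n, sd = 0.1)

# For a fixed index vector `a`, profile out f by isotonic least squares and
# return the residual sum of squares (smaller = higher Gaussian likelihood).
profile_rss <- function(a, x, y) {
  o <- order(drop(x %*% a))        # sort observations by index value
  fit <- stats::isoreg(y[o])       # isotonic fit along the sorted index
  sum((y[o] - fit$yf)^2)
}

rss_true  <- profile_rss(a_true, x, y)
a_wrong   <- c(2, -2, 1) / 3       # unit vector orthogonal to a_true
rss_wrong <- profile_rss(a_wrong, x, y)

# Crude random search: draw B directions uniformly on the unit sphere
B <- 2000
cand <- matrix(rnorm(B * d), B, d)
cand <- cand / sqrt(rowSums(cand^2))
rss <- vapply(seq_len(B), function(b) profile_rss(cand[b, ], x, y), numeric(1))
a_hat <- cand[which.min(rss), ]

c(rss_true = rss_true, rss_wrong = rss_wrong, rss_best = min(rss))
sum(a_hat * a_true)                # cosine similarity with the true index vector
```

Restricting the candidates to unit length reflects the usual scale normalization of the index vector: since f is unrestricted apart from monotonicity, rescaling a by a positive constant can be absorbed into f.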
an object of class simfast, with the following structure:
x |
if returndata = TRUE, this is the model matrix used to fit the model, otherwise it is NULL. |
y |
if returndata = TRUE, this is the response vector used to fit the model, otherwise it is NULL. |
alphahat |
alpha value estimated by the model fit |
yhat |
vector of estimated response values |
indexvals |
vector of estimated single index values, the matrix product of x and alphahat |
weights |
vector of the integer weights used in the model fit |
family |
the family function provided to simfast_m |
loglik |
a numeric value of the log-likelihood at the estimate. |
offset |
a numeric vector specifying the offset provided in the model formula. |
tol |
numeric convergence tolerance achieved during fitting with method = 'stochastic'. For method = 'exact', this is 0. |
iter |
number of iterations used to achieve convergence. For method = 'exact', this is 1. |
method |
method used for fitting the model |
model |
the model.frame generated by the formula object, which is used to generate the model.matrix and model.response to pass to simfast_m |
intercept |
the intercept rule selected in the argument |
multialphahat |
if multiout = TRUE, returns all estimated alphahat vectors, as a matrix if there is more than one and as a vector if there is only one. |
multiyhat |
if multiout = TRUE, returns all estimated yhat vectors, as a matrix if there is more than one and as a vector if there is only one. |
Hanna Jankowski: hkj@yorku.ca Konstantinos Ntentes: kntentes@yorku.ca (maintainer)
simfast_m for providing model matrices instead of a formula, as well as more examples.
## Load esophageal cancer dataset
esoph <- datasets::esoph
str(esoph) # note that three variables are ordered factors
esoph$ntotal <- esoph$ncases + esoph$ncontrols #use as offset
## subset the data frame for training
set.seed(1) # keep from getting data OOB warning in predict()
nobs <- NROW(esoph)
ind <- sample(1:nobs, size = round(nobs * 0.8))
esophtrain <- esoph[ind, ]
esophtest <- esoph[-ind, ]
## fit a model with formulas, including ordered/regular factors
## and support for offsets. similar syntax to glm()
sfobj <- simfast(ncases ~ offset(log(ntotal)) + tobgp + alcgp + agegp,
                 data = esophtrain, family = poisson(link = 'log'))
glmobj <- glm(ncases ~ offset(log(ntotal)) + tobgp + alcgp + agegp,
              data = esophtrain, family = poisson(link = 'log'))
## Plot the relationship of estimated responses vs. index values
# Not isotonic because of offset
plot(sfobj)
# Y-hats adjusted to same scale
plot(sfobj, offset = FALSE)
## Predictions from simfast and glm rounded to nearest integer
sfpred <- round(predict(sfobj, newdata = esophtest))
# Note that simfast only predicts 'response' values
sfpred
glmpred <- round(predict(glmobj, newdata = esophtest, type = 'response'))
glmpred
## Compare squared residuals
sum((sfpred - esophtest$ncases)^2) #simfast prediction
sum((glmpred - esophtest$ncases)^2) #glm prediction