This function is a wrapper around the optimization and selection routines in the package and can be used for automated calibration of GLMs on semi-large datasets.
generalizeToSpecific is more appropriate for manual R sessions; autoGLM is more appropriate when calibration takes a long time.
For example, it can run generalizeToSpecific over a vector of dependent variable classes, and log and write outputs to disk.
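The workflow described above, fitting one binary model per class of a categorical response and writing each result to disk, can be sketched in base R. This is only an illustration under stated assumptions: `stats::glm` stands in for generalizeToSpecific, the simulated data and the helper name `fit_one_class` are hypothetical, and none of it is the package's internal code.

```r
set.seed(1)

# Simulated data: categorical response in column 1, covariates after it
n <- 300
dat <- data.frame(
  Y  = sample(c(1, 2, 3), n, replace = TRUE),
  x1 = rnorm(n),
  x2 = rnorm(n)
)

# Hypothetical helper: recode one class to 1, all others to 0, then fit a logit
fit_one_class <- function(data, class) {
  binary <- data
  binary[[1]] <- as.integer(data[[1]] == class)
  glm(Y ~ ., data = binary, family = binomial(link = "logit"))
}

classes <- sort(unique(dat$Y))
outdir  <- tempdir()

# One model per class; each fit is also written to its own .RDS file
fits <- lapply(classes, function(cl) {
  fit <- fit_one_class(dat, cl)
  saveRDS(fit, file.path(outdir, paste0("class_", cl, ".RDS")), compress = FALSE)
  fit
})
names(fits) <- classes

# Restore a single result later without refitting
restored <- readRDS(file.path(outdir, "class_2.RDS"))
```

Writing each class to its own file keeps a long calibration recoverable if the session is interrupted partway through.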
autoGLM(data, reclasstable = "default", class = 1,
outputpath = paste(getwd(), "//", sep = ""), modelname = "autoGLM",
tracelevel = 1, actions = c("print", "return"), NAval = -9999,
model = "logit", preselect = "lm", method = "opt.ic", crit.t = 1.64,
crit.p = 0.05, test = "LR", KLIC = "AICc", accuracytolerance = 0.01,
confidence.alternative = 0.9, use.share = 0.25, maxsampleruns = 50,
memorymanagement = TRUE, returnall = FALSE, compress = FALSE,
JIT = TRUE)
data |
A dataframe with a categorical response variable in the first column, and covariates in subsequent columns. Typically the product of cbind(Y,X). |
reclasstable |
A table that maps the first column of data into a binary response variable. By default it is omitted (the binary response variable will be identical to data[,1]). See also |
class |
The class that should be 1 in the binary response variable, all other classes in the categorical variable will be set to 0. Defaults to 1. See also |
outputpath |
The location on the hard drive where output will be written. Defaults to getwd(). |
modelname |
The name of the model, will be used when writing a weightsfile. Defaults to "autoGLM". See also |
tracelevel |
The amount of information to be printed. Passed on to underlying routines. Defaults to 1 for printing, set to 0 for no printing. |
actions |
Actions to be taken by autoGLM. Defaults to c("print", "return"); may include any combination of c("write", "print", "log", "return"), where "write" is for writing a geoDMS weightsfile. See also |
NAval |
Optional categorical variable that should be dropped by the reclassification scheme. See also |
model |
Main model type that should be calibrated, either "lm", "probit", or "logit". See also |
preselect |
Optional variable preselection using a first order approximation (linear model) of the logit or probit model, by specifying "lm" (default setting). See also |
method |
The optimization strategy. Either "opt.ic" to optimize using information criteria, "opt.t" for step-wise elimination of insignificant values (statistically speaking not a sound procedure, but it will provide a parsimonious model that can be useful as a benchmark), or "opt.h" to optimize by classical hypothesis tests. Defaults to "opt.ic". See also |
crit.t |
The t-value indicating significance when using method "opt.t", defaults to 1.64. |
crit.p |
The p-value used by method "opt.h" in the hypothesis tests. Defaults to 0.05. |
test |
The hypothesis test used by "opt.h". Defaults to "LR" for the likelihood ratio test. Other options are "F", for an F test for joint significance of insignificant parameters, or "Chisq" for a Wald test against the chi-squared distribution. |
KLIC |
The information criterion used by "opt.ic", either "AIC" or "AICc", defaults to the latter. |
accuracytolerance |
When out-of-sample and within-sample accuracy differ by more than accuracytolerance, a warning will be issued, which is also logged when specifying "log" in actions. Defaults to 0.01. |
confidence.alternative |
See also |
use.share |
Share of the data used, See also |
maxsampleruns |
See also |
memorymanagement |
TRUE/FALSE indicating whether garbage collection should be forced regularly when memory usage is high. Defaults to TRUE, recommended setting for large datasets. See also |
returnall |
TRUE, FALSE, or "writedisk" indicating whether all the output objects for each class should be returned in an array as produced by lapply, or whether only the final output should be returned as an object. Specifying "writedisk" will write the objects containing the results of each class as separate .RDS files, which you can restore using readRDS(). |
compress |
Passed on to iapply. Defaults to no compression of RDS output, which is the recommended setting if computation time is valued over disk space. Keep in mind that when using large datasets, autoGLM objects can be several gigabytes in size. |
JIT |
Logical indicating whether just-in-time compilation of internal functions should be used. Mainly for historical reasons. |
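The selection criteria named in the KLIC, test, and crit.p arguments can be illustrated with base R. The sketch below compares two nested logit models by AICc and by a likelihood ratio test. It assumes simulated data and a hypothetical helper `aicc`, and uses the standard textbook formulas; it is not taken from the package's own implementation.

```r
set.seed(42)

# Simulated binary response that depends on x1 only; x2 is pure noise
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(0.5 + 1.2 * x1))

full    <- glm(y ~ x1 + x2, family = binomial(link = "logit"))
reduced <- glm(y ~ x1,      family = binomial(link = "logit"))

# AICc adds a small-sample penalty to AIC: AIC + 2k(k + 1) / (n - k - 1)
aicc <- function(fit) {
  k <- length(coef(fit))
  n <- nobs(fit)
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
}

ic <- c(full = aicc(full), reduced = aicc(reduced))

# Likelihood ratio test of the reduced model against the full model
lr_stat <- as.numeric(2 * (logLik(full) - logLik(reduced)))
lr_df   <- length(coef(full)) - length(coef(reduced))
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Keep the smaller model unless the test rejects it at the 0.05 level
keep_reduced <- lr_p > 0.05
```

With an information criterion the model with the lower value is preferred; with the hypothesis test the restriction is kept when the p-value exceeds the chosen threshold.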
data(ITdata)
data(corinetable)
results <- autoGLM(data = randomlogit, reclasstable = corinetable, class = 0, method = "opt.ic")
# All options, with their defaults:
# autoGLM(data, reclasstable = "default", class = 1, outputpath = paste(getwd(), "//", sep = ""),
#         modelname = "autoGLM", tracelevel = 1,
#         actions = c("print", "return"), NAval = -9999,
#         model = "logit", preselect = "lm", method = "opt.ic", crit.t = 1.64, crit.p = 0.05,
#         test = "LR", KLIC = "AICc", accuracytolerance = 0.01, confidence.alternative = 0.90,
#         use.share = 0.25, maxsampleruns = 50, memorymanagement = TRUE, returnall = FALSE,
#         compress = FALSE, JIT = TRUE)
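The accuracytolerance check described under Arguments can also be sketched in base R. The example below calibrates a logit on a random 25% share of simulated data and compares within-sample and out-of-sample accuracy. The 0.5 classification cutoff, the helper name `accuracy`, and the sampling scheme are assumptions made for illustration; the package may define and compute accuracy differently.

```r
set.seed(7)

n   <- 1000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.3 + 1.5 * dat$x1))

# Calibrate on a 25% random share of the rows, hold out the rest
use_share <- 0.25
in_sample <- sample(n, size = round(use_share * n))
train <- dat[in_sample, ]
test  <- dat[-in_sample, ]

fit <- glm(y ~ x1 + x2, data = train, family = binomial(link = "logit"))

# Share of observations classified correctly at a 0.5 probability cutoff
accuracy <- function(fit, newdata) {
  pred <- predict(fit, newdata = newdata, type = "response") > 0.5
  mean(pred == (newdata$y == 1))
}

acc_in  <- accuracy(fit, train)
acc_out <- accuracy(fit, test)

# Issue a warning when the two accuracies drift apart
accuracy_tolerance <- 0.01
if (abs(acc_in - acc_out) > accuracy_tolerance) {
  warning("within-sample and out-of-sample accuracy differ by more than the tolerance")
}
```

A large gap between the two figures suggests the calibration sample is too small or the model is overfitting it.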