QAS.func
is used to replace numerical optimization with a quasi-analytical approach for logit models on big data. It returns the coefficients, predicted values and quality criteria for the provided variables.
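To illustrate why categorical predictors make a closed-form solution possible, the sketch below relies on a standard property of logit models, not on the package's own algorithm: with a single categorical predictor and no intercept, the maximum-likelihood coefficients equal the log-odds of the observed share of ones in each category. The data and the names g, y, closed_form and fit are made up for illustration, and QAS.func itself is not called.

```r
# Hypothetical data: one categorical predictor with three levels
g <- factor(rep(c("a", "b", "c"), each = 10))
y <- c(rep(c(1, 0), c(3, 7)),   # level a: 3 ones out of 10
       rep(c(1, 0), c(5, 5)),   # level b: 5 ones out of 10
       rep(c(1, 0), c(8, 2)))   # level c: 8 ones out of 10

# Closed form: log-odds of the observed share of ones per category
share_of_ones <- tapply(y, g, mean)
closed_form <- qlogis(share_of_ones)

# Iterative reference fit (one coefficient per category, no intercept)
fit <- glm(y ~ 0 + g, family = binomial())

# Both columns should agree up to numerical tolerance
print(cbind(closed_form = as.numeric(closed_form),
            iterative   = unname(coef(fit))))
```

The comparison with glm is only there as a check; the closed-form line needs no optimizer at all, which is the appeal of such approaches on large data.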
frml
an object of class
data
a data frame containing the variables in the model (or object coercible by
weights
an optional vector of prior weights to be used in the fitting process. Should be NULL or a numeric vector. If NULL, each case is weighted with 1.
seed
an optional seed that fixes the state of the random process so that results can be reproduced. Should be NULL or a numeric vector. If NULL, a seed is generated at random.
tau
an optional parameter proposed by King and Zeng (2001) that carries prior information about the fraction of ones in the population of the dependent variable. It has to lie between 0 and 1.
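As background for tau, King and Zeng (2001) describe a prior correction that adjusts the intercept of a logit model when the share of ones in the sample differs from the share in the population. The sketch below shows that correction only. Whether QAS.func applies tau in exactly this way is an assumption, and the function name prior_corrected_intercept and all numbers are hypothetical.

```r
# Prior correction of a logit intercept (King and Zeng, 2001):
#   b0_corrected = b0 - log( ((1 - tau) / tau) * (ybar / (1 - ybar)) )
# b0   : intercept estimated on the sample
# ybar : share of ones in the sample
# tau  : share of ones in the population (prior information)
prior_corrected_intercept <- function(b0, ybar, tau) {
  stopifnot(tau > 0, tau < 1, ybar > 0, ybar < 1)
  b0 - log(((1 - tau) / tau) * (ybar / (1 - ybar)))
}

# Hypothetical case: balanced sample, but only 5% ones in the population
b0_sample <- -0.2
corrected <- prior_corrected_intercept(b0_sample, ybar = 0.5, tau = 0.05)
print(corrected)
```

When tau equals the sample share of ones, the correction term is zero and the intercept is left unchanged.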
A typical predictor has the form dependent_Variable '~' independent_Variables.
The dependent_Variable has two categories.
If there is more than one independent_Variable, they can be combined with a '+'.
The data frame must not contain any missing values.
Metric variables have to be of type numeric. All other variables have to be of type integer.
The first variable in the dataset has to be the dependent variable.
Large values have to be rescaled, e.g. by standardization.
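The requirements above can be checked and enforced before calling QAS.func. The following is a minimal sketch in base R, assuming a hypothetical raw data frame raw with a dependent variable y, a metric variable x and a categorical variable z. None of these names come from the package, and dropping incomplete rows is just one possible way of handling missing values.

```r
# Hypothetical raw data that violates several of the requirements
raw <- data.frame(
  x = c(1500, 8800, NA, 6000, 2400, 3000),   # metric, large scale, one NA
  z = c("a", "b", "b", "c", "a", "c"),       # categorical, stored as text
  y = c(1, 0, 0, 0, 1, 1),                   # dependent variable, not first
  stringsAsFactors = FALSE
)

# No missing values: drop incomplete rows
prepared <- na.omit(raw)

# Dependent variable has to be the first column
prepared <- prepared[, c("y", setdiff(names(prepared), "y"))]

# Non-metric variables as integer, metric variables as numeric
prepared$y <- as.integer(prepared$y)
prepared$z <- as.integer(factor(prepared$z))

# Reduce the scale of large numbers by standardization
prepared$x <- as.numeric(scale(prepared$x))

# Sanity checks mirroring the requirements
stopifnot(!anyNA(prepared),
          names(prepared)[1] == "y",
          length(unique(prepared$y)) == 2)
str(prepared)
```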
An object of class QAS.func is a list containing at least the following components:
coefficients
a vector of coefficients
weights
the working weights
call
the call of the final function within QAS.func
terms
the term object used
model
the model frame
means.for.cat
the cut points of the metric variables for a categorization of the original dataset
categorized.variables
the variables that have been categorized within QAS
seed
the seed used for the calculations
This method is based on the research work of Stan Lipovetsky and Birgit Stoltenberg.
King, G. & Zeng, L. (2001), Logistic Regression in Rare Events Data, Political Analysis, No. 9 / 2001
Lipovetsky, S. (2014), Analytical closed-form solution for binary logit regression by categorical predictors, Journal of Applied Statistics, No. 42 / 2015
Lipovetsky, S. & Conklin, M. (2014), Best-Worst Scaling in analytical closed-form solution, The Journal of Choice Modelling, No. 10 / 2014
Stoltenberg, B. (2016), Using logit on big data - from iterative methods to analytical solutions, GfK Verein Working Paper Series, No. 3 / 2016
# generate data
y <- as.integer(c(1, 0, 0, 0, 1, 1, 1, 0, 0, 1))
x <- c(15, 88, 90, 60, 24, 30, 26, 57, 69, 18)
z <- as.integer(c(3, 2, 2, 1, 3, 3, 2, 1, 1, 3))
example_data <- data.frame(y, x, z)
# call QAS.func on the example data
result <- QAS.func(y ~ x + z, data = example_data, weights = NULL, seed = NULL, tau = NULL)