# bigReg: Generalized Linear Models (GLM) for Large Data Sets

bigReg fits generalized linear models (GLMs) to very large data sets. Data is created with the data_frame() function and extended with object$append(data); the data_frame and data_matrix objects store large data on disk. Values are stored as doubles in binary format. Character columns are converted to factors and stored as numeric (binary) data, with the look-up table kept in a separate .meta_data file in the same folder. The data is stored in blocks, and the GLM fitting algorithm is modified to run as a MapReduce-like procedure over those blocks. The functions bglm(), summary() and bglm_predict() are available for fitting and post-processing models. The package requires Armadillo to be installed on your system. It probably won't work on Windows, because multi-core processing uses mclapply(), which relies on forking the R process as available on Unix/Linux-type operating systems.
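To make the MapReduce-like idea concrete, here is a minimal sketch in base R of iteratively reweighted least squares (IRLS) for a Poisson GLM, where each block contributes its own X'WX and X'Wz and the contributions are summed before solving. It is an illustration under stated assumptions, not bigReg's implementation: the names map_block and reduce_blocks, the simulated data, and the in-memory list of blocks (standing in for blocks stored on disk) are all hypothetical.

```r
# Block-wise IRLS for a Poisson GLM with log link, base R only.
# Illustrative sketch; map_block/reduce_blocks are hypothetical names.

set.seed(1)
n <- 2000
x <- cbind(1, rnorm(n), runif(n))            # design matrix with intercept
beta_true <- c(0.5, 0.3, -0.2)
y <- rpois(n, exp(x %*% beta_true))

# Split the rows into 4 blocks (stand-in for blocks stored on disk)
block_id <- rep(1:4, length.out = n)
blocks <- lapply(split(seq_len(n), block_id),
                 function(i) list(x = x[i, , drop = FALSE], y = y[i]))

# Map step: one block's contribution to X'WX and X'Wz
map_block <- function(block, beta) {
  eta <- drop(block$x %*% beta)              # linear predictor
  mu  <- exp(eta)                            # inverse log link
  w   <- mu                                  # IRLS weights for Poisson/log
  z   <- eta + (block$y - mu) / mu           # working response
  list(xwx = crossprod(block$x, w * block$x),
       xwz = crossprod(block$x, w * z))
}

# Reduce step: add two contributions together
reduce_blocks <- function(a, b) {
  list(xwx = a$xwx + b$xwx, xwz = a$xwz + b$xwz)
}

beta <- rep(0, ncol(x))
for (iter in 1:25) {
  # lapply could be swapped for parallel::mclapply on Unix-like systems
  parts <- lapply(blocks, map_block, beta = beta)
  total <- Reduce(reduce_blocks, parts)
  beta_new <- drop(solve(total$xwx, total$xwz))
  converged <- max(abs(beta_new - beta)) < 1e-10
  beta <- beta_new
  if (converged) break
}

# Reference fit on the full data for comparison
fit_ref <- glm(y ~ x[, 2] + x[, 3], family = poisson())
```

Because X'WX and X'Wz are sums over rows, only one block needs to be in memory at a time, and the per-block work is independent within each iteration. Comparing beta with coef(fit_ref) is a quick way to check the block-wise result against glm().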

- Author: Chibisi Chima-Okereke <chibisi@active-analytics.com>
- Date of publication: 2016-07-25 19:16:58
- Maintainer: Chibisi Chima-Okereke <chibisi@active-analytics.com>
- License: GPL (>= 2)
- Version: 0.1.2

## Man pages

- asInteger: converts numeric vector to integer
- bglm: Function to carry out generalized linear regression on a...
- bglm_predict: predict function for bglm object
- binomial_: binomial family function
- blm: Function to carry out linear regression on a data_frame data...
- CreateFactor: creates factor from numeric vector and character vector as...
- data_frame: function to create a data_frame object
- data_matrix: function to create a data_matrix object
- family_: family function
- Gamma_: Gamma family function
- gaussian_: gaussian family function
- inverse.gaussian_: inverse.gaussian family function
- load_data_frame: function to load data_frame object
- load_data_matrix: function to load data_matrix object
- myIn: finds whether x is in y
- mySeq: mySeq function to sequence integers
- plasma: plasma data from the HSAUR package
- poisson_: poisson family function
- print.bglm: print function for the bglm object
- print.blm: print function for the blm object
- print.data_frame: print function for a data_frame
- print.data_matrix: print function for a data_matrix
- print.summary.bglm: Function to print the summary object from the bglm object
- print.summary.blm: Function to print the summary object from the blm object
- process_bglm_block: Function to print the summary object from the blm object
- quasi_: quasi family function
- quasibinomial_: quasibinomial family function
- quasipoisson_: quasipoisson family function
- r_bind: row binding for benchmarking ...
- read_df_block: read data frame block from file
- read_df_blocks: read multiple blocks of data frames from file
- read_matrix_block: read matrix block from file
- read_matrix_blocks: read matrix blocks from file
- readNumericVector: reads numeric vector from file
- sum_bglm_block: The reduction function for the algorithm
- summary.bglm: summary function for the bglm object
- summary.blm: summary function for the blm object
- SVD: Singular value decomposition of the aggregated list from...
- write_numeric_vector: writes numeric vector to file
- writeNumericVector: writes numeric vector to file
- XWXMatrix: Calculation of iterative regression components
- XWXMatrixW: Calculation of iterative regression components
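
Several of the entries above read and write numeric vectors and blocks in binary files. As a rough illustration of that storage idea, the sketch below uses only base R (writeBin, readBin, seek) to store doubles, read back a single block, and keep a factor look-up table for a character column. The file layout, the helper name read_block, and the example values are assumptions made for illustration; they are not the package's own functions or on-disk format.

```r
# Store doubles in a binary file and read one block back, base R only.
# read_block and the file layout are hypothetical, not bigReg's format.

path <- tempfile(fileext = ".bin")
x <- c(1.5, 2.25, -3, 4, 5.125, 6)

con <- file(path, "wb")
writeBin(x, con, size = 8)                   # 8-byte doubles
close(con)

# Read block number `block` of `block_size` doubles by seeking past earlier blocks
read_block <- function(path, block, block_size) {
  con <- file(path, "rb")
  on.exit(close(con))
  seek(con, where = (block - 1) * block_size * 8, origin = "start")
  readBin(con, what = "double", n = block_size, size = 8)
}
second <- read_block(path, block = 2, block_size = 3)

# Character column: store numeric codes, keep the levels as a look-up table
colour   <- c("red", "blue", "red", "green")
f        <- factor(colour)
codes    <- as.numeric(f)                    # what would be written as doubles
lookup   <- levels(f)                        # what a metadata file would hold
restored <- lookup[codes]                    # decode codes back to strings
```

Fixed-width doubles make the byte offset of any block easy to compute, which is what allows one block to be read without loading the rest of the file.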

