dbglm: Fast generalized linear model in a database

Description

Fast generalized linear model in a database

Usage

Arguments

...

This argument is required for S3 method extension.

formula

A model formula. It can include interactions, but the only transformation allowed is factor().

family

Model family

tbl

An object inheriting from tbl. Will typically be a database-backed lazy tbl from the dbplyr package.

sd

Experimental: in the update step, compute the standard deviation of the score as well as its mean, and use it to improve the information matrix estimate.

weights

Weights are not supported.

subset

To analyze a subset, apply filter() to the data before fitting.
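
As an illustration of the tbl and subset arguments, here is a minimal sketch that builds a database-backed lazy tbl and restricts it with filter(). It assumes the DBI, RSQLite, dplyr and dbplyr packages are installed; the in-memory SQLite database, the mtcars table and the formula in the commented-out call are stand-ins chosen for this example, not part of the package documentation.

```r
library(dplyr)

# Assumption: an in-memory SQLite database stands in for a real backend.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, mtcars, "mtcars")

# A lazy tbl: nothing is pulled into R until it is collected.
cars_tbl <- tbl(con, "mtcars")

# Subsetting happens on the data, not through a subset argument.
cars_sub <- filter(cars_tbl, cyl > 4)

# Count the rows that survive the filter (this runs in the database).
n_rows <- nrow(collect(cars_sub))

# The filtered tbl would then be passed as the tbl argument, for example:
# fit <- dbglm(am ~ mpg + wt, family = binomial(), tbl = cars_sub)

DBI::dbDisconnect(con)
```

mtcars is far too small for the approximation to be useful; it appears here only to show how the lazy tbl is prepared.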

Details

For a dataset of size N, the subsample is of size N^(5/9). Unless N is large, the approximation won't be very good. With small N it is also quite likely that, e.g., some factor levels will be missing from the subsample.
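
To give a feel for the N^(5/9) rule, the sketch below tabulates the implied subsample size for a few values of N and estimates how likely a rare factor level is to be absent from the subsample. The helper names, the rounding up to a whole row, and the simple-random-sampling assumption behind the (1 - p)^m formula are assumptions made for this example; the package may round or sample differently.

```r
# Subsample size implied by the N^(5/9) rule (rounding up is an assumption).
subsample_size <- function(n) ceiling(n^(5/9))

n <- c(1e3, 1e4, 1e6, 1e8)
sizes <- data.frame(
  N         = n,
  subsample = subsample_size(n),
  fraction  = subsample_size(n) / n
)
print(sizes)

# Chance that a factor level with population share p never appears in a
# simple random subsample of m rows (treated as independent draws).
p_level_missing <- function(p, m) (1 - p)^m

# A level held by 2% of rows, with N = 1000:
p_missing <- p_level_missing(0.02, subsample_size(1e3))
print(p_missing)
```

With N = 1000 the subsample has fewer than 50 rows, so a level held by 2% of the data has a sizeable chance (roughly 0.39 under these assumptions) of not being sampled at all.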

Value

A list with elements

tildebeta

coefficients from subsample

hatbeta

final coefficient estimate

tildeV

variance matrix from subsample

hatV

final variance matrix estimate
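
One way the returned list might be used is to build a Wald-style coefficient table from hatbeta and hatV. The sketch below is not part of the package: the fit object is a hand-made stand-in with invented numbers, and it assumes hatbeta is a named vector (or one-column matrix) and that hatV is its variance matrix.

```r
# Stand-in for a dbglm() result; the numbers are made up for illustration.
fit <- list(
  hatbeta = c(`(Intercept)` = -1.20, x = 0.35),
  hatV    = matrix(c(0.010, -0.002,
                     -0.002, 0.004), nrow = 2)
)

beta <- drop(fit$hatbeta)        # coefficients as a plain vector
se   <- sqrt(diag(fit$hatV))     # standard errors from the variance matrix
z    <- beta / se                # Wald z statistics

coef_table <- cbind(
  estimate  = beta,
  std_error = se,
  z         = z,
  p_value   = 2 * pnorm(-abs(z)) # two-sided normal p-values
)
print(coef_table)
```

The same calculation applied to tildebeta and tildeV gives the subsample-only table, which can be compared with the final estimates.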

References

http://notstatschat.tumblr.com/post/171570186286/faster-generalised-linear-models-in-largeish-data


dbglm documentation built on June 23, 2021, 9:07 a.m.