null.data: A Data Set with lots of 'NA' values
In pivotalsoftware/PivotalR: A Fast, Easy-to-Use Tool for Manipulating Tables in Databases and a Wrapper of MADlib


An example data.frame that is used by the examples in this user manual.

data(null.data)

This data set has 104 columns and 2000 rows.

This data set contains many NA values. Using as.db.data.frame, one can load the data set into the connected database, where all NA values are converted into NULL values.

The MADlib wrapper functions, such as madlib.lm and madlib.glm, throw an error if the data contains NULL values. The data must therefore be cleaned before using the regression functions supplied by MADlib.
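The clean-up idea can be sketched locally with base R alone. This is only an in-memory analogy under stated assumptions: `toy` is a hypothetical data.frame invented for illustration (it is not part of null.data), and none of the calls below touch a database or PivotalR.

```r
## Hypothetical toy data standing in for a few columns with missing values
toy <- data.frame(
  y  = c(1.0, 2.1, NA, 4.2, 5.1, 5.9),
  x1 = c(0.5, NA, 1.5, 2.0, 2.5, 3.1),
  x2 = c(10, 20, 30, 40, 50, 55)
)

## Count the missing values in each column
na_counts <- colSums(is.na(toy))

## Keep only the rows that have no NA in any column
clean <- toy[complete.cases(toy), ]

dim(toy)
dim(clean)
```

In the database setting the same filtering has to be expressed on the db.data.frame object, as the Examples section does with is.na, because the NA values have already become NULL values in the table.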

Lazy data loading is enabled in this package, so the user does not need to run data(null.data) explicitly to load the data. It is loaded whenever it is used.

## Not run: 


## set up the database connection
## Assume that .port is port number and .dbname is the database name
cid <- db.connect(port = .port, dbname = .dbname, verbose = FALSE)

## create a table from the example data.frame "null.data"
delete("null_data", conn.id = cid)
x <- as.db.data.frame(null.data, "null_data", conn.id = cid, verbose = FALSE)

## ERROR, because of NULL values
fit <- madlib.lm(sf_mrtg_pct_assets ~ ris_asset + lncrcd + lnauto +
                 lnconoth + lnconrp + intmsrfv + lnrenr1a + lnrenr2a +
                 lnrenr3a, data = x)

## select columns
y <- x[,c("sf_mrtg_pct_assets","ris_asset", "lncrcd","lnauto",
          "lnconoth","lnconrp","intmsrfv","lnrenr1a","lnrenr2a",
          "lnrenr3a")]

dim(y)

## remove NULL values
for (i in 1:10) y <- y[!is.na(y[i]),]

dim(y)

fit <- madlib.lm(sf_mrtg_pct_assets ~ ., data = y)

fit

db.disconnect(cid, verbose = FALSE)

## End(Not run)

pivotalsoftware/PivotalR documentation built on March 18, 2021, 9:37 a.m.
