An example data.frame that is used by the examples in this user manual.
This data has 104 columns and 2000 rows.
This data set has many NA values. Using as.db.data.frame, one can put the data set into the connected database; all the NA values are then converted into NULL values.
The MADlib wrapper functions such as madlib.lm and madlib.glm will throw an error if there are NULL values in the data, so one needs to clean up the data before using the regression functions supplied by MADlib.
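The clean-up step can be tried on an ordinary local data.frame before any database is involved. The following is a minimal sketch in base R only: `toy` is a hypothetical stand-in with made-up values, not the real null.data, and the column names are assumptions chosen for illustration. It counts the missing values per column and then keeps only the complete rows, which is the same idea the in-database example on this page applies column by column.

```r
# Hypothetical toy data.frame standing in for null.data (made-up values)
toy <- data.frame(
  y  = c(1.2, NA, 3.1, 4.8, 5.0),
  x1 = c(10, 20, NA, 40, 50),
  x2 = c(0.5, 0.7, 0.9, 1.1, 1.3)
)

# Count the missing values in each column
na_counts <- colSums(is.na(toy))

# Keep only the rows that have no NA in any column
clean <- toy[complete.cases(toy), ]

print(na_counts)
print(dim(clean))
```

Filtering on all model columns at once avoids repeated passes over the data, but it applies only to local data.frames here; whether `complete.cases` works on a database-backed object is not something this sketch assumes.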
Lazy data loading is enabled in this package, so the user does not need to explicitly run data(null.data) to load the data. It is loaded whenever it is used.
## Not run:
## set up the database connection
## Assume that .port is port number and .dbname is the database name
cid <- db.connect(port = .port, dbname = .dbname, verbose = FALSE)
## create a table from the example data.frame "null.data"
delete("null_data", conn.id = cid)
x <- as.db.data.frame(null.data, "null_data", conn.id = cid, verbose = FALSE)
## ERROR, because of NULL values
fit <- madlib.lm(sf_mrtg_pct_assets ~ ris_asset + lncrcd + lnauto +
                 lnconoth + lnconrp + intmsrfv + lnrenr1a + lnrenr2a +
                 lnrenr3a, data = x)
## select columns
y <- x[, c("sf_mrtg_pct_assets", "ris_asset", "lncrcd", "lnauto",
           "lnconoth", "lnconrp", "intmsrfv", "lnrenr1a", "lnrenr2a",
           "lnrenr3a")]
dim(y)
## remove NULL values
for (i in 1:10) y <- y[!is.na(y[i]),]
dim(y)
fit <- madlib.lm(sf_mrtg_pct_assets ~ ., data = y)
fit
db.disconnect(cid, verbose = FALSE)
## End(Not run)