Function for obtaining a random sample of records from a very large
table stored in a database management system, without having to load
the full table into memory. It targets situations where the full data
does not fit in the computer's memory, so the standard sample
function cannot be used.
sampleDBMS(dbConn, tbl, percORn, mxPerc=0.5)
dbConn | A database connection object (a DBI connection object).
tbl | A string containing the name of the (large) table in the database from which you want to draw a random sample of records.
percORn | Either the percentage of the table's rows or the actual number of rows that the sample should have.
mxPerc | A maximum threshold for the percentage of the table's rows that the sample is allowed to have (defaults to 0.5).
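To make the two readings of percORn concrete, here is a minimal sketch in plain R. It is not the package's code: the helper name resolveSampleSize, the rule that values below 1 are fractions of the table while values of 1 or more are row counts, and the choice to stop with an error when mxPerc is exceeded are all assumptions made for illustration.

```r
# Hypothetical helper (not part of the package): turn percORn into a row count.
# Assumption: percORn < 1 is a fraction of the table; otherwise it is a row count.
resolveSampleSize <- function(percORn, nRows, mxPerc = 0.5) {
  stopifnot(is.numeric(percORn), length(percORn) == 1, percORn > 0, nRows > 0)
  # Express the request as a fraction of the table in both cases
  perc <- if (percORn < 1) percORn else percORn / nRows
  # Assumed behaviour: refuse samples larger than the mxPerc threshold
  if (perc > mxPerc)
    stop("requested sample exceeds mxPerc of the table")
  as.integer(round(perc * nRows))
}

resolveSampleSize(0.1, 1e6)     # a fraction of the rows
resolveSampleSize(10000, 1e6)   # an explicit number of rows
```

Under these assumptions, asking for 90% of a table with the default mxPerc would be rejected, which is in keeping with the function's purpose of keeping the sample small enough to fit in memory.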
This function can be used to draw a random sample of records from a very
large table of a database management system. This is particularly
useful when you cannot afford to load the full table into memory in
order to obtain the sample with R functions like sample.
The function obtains the sample of rows without loading the full data into memory: only the final sample is loaded into main memory.
The function assumes you have already established and opened a connection to the database, and it receives the DBI connection object as an argument.
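The general idea of sampling inside the database can be sketched as follows. This is an illustration only, not the implementation of sampleDBMS: it assumes the RSQLite package is installed, uses an in-memory SQLite database as a stand-in for the real DBMS, and defines a hypothetical helper named sampleInDB.

```r
library(DBI)

# Assumption: an in-memory SQLite database (RSQLite package) stands in for the
# real DBMS, and a small table stands in for the "very large" one.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "largeTable", data.frame(id = 1:1000, x = rnorm(1000)))

# Hypothetical helper: let the database pick n random rows and return only those.
sampleInDB <- function(dbConn, tbl, n) {
  # ORDER BY RANDOM() is SQLite syntax; MySQL, for example, spells it RAND().
  qry <- paste("SELECT * FROM", dbQuoteIdentifier(dbConn, tbl),
               "ORDER BY RANDOM() LIMIT", as.integer(n))
  dbGetQuery(dbConn, qry)
}

d <- sampleInDB(con, "largeTable", 50)
dim(d)   # 50 rows, 2 columns
dbDisconnect(con)
```

Only the 50 selected rows cross the connection into R. The random ordering still makes the database server scan the whole table, so this sketch trades server-side work for a small memory footprint on the R side.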
A data frame containing the sampled records.
Luis Torgo ltorgo@dcc.fc.up.pt
Torgo, L. (2016) Data Mining using R: learning with case studies, second edition, Chapman & Hall/CRC (ISBN-13: 978-1482234893).
## A simple example over a table on a MySQL database
## Not run:
library(DBI)
library(RMySQL)
drv <- dbDriver("MySQL") # Loading the MySQL driver
con <- dbConnect(drv, dbname="myDB",
                 username="myUSER", password="myPASS",
                 host="localhost")
d <- sampleDBMS(con, "largeTable", 10000) # Draw a sample of 10000 rows
dbDisconnect(con) # Close the connection when done
## End(Not run)