etl | R Documentation
Initialize an etl object
etl(x, db = NULL, dir = tempdir(), ...)
## Default S3 method:
etl(x, db = NULL, dir = tempdir(), ...)
## S3 method for class 'etl'
summary(object, ...)
is.etl(object)
## S3 method for class 'etl'
print(x, ...)
x | the name of the
db | a database connection that inherits from
dir | a directory to store the raw and processed data files
... | arguments passed to methods (currently ignored)
object | an object for which a summary is desired.
A constructor function that instantiates an etl object.
An etl object extends a src_dbi object.
It also has attributes for:
the name of the etl package corresponding to the data source
the directory where the raw and processed data are stored
the directory where the raw data files are stored (raw_dir)
the directory where the processed data files are stored (load_dir)
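As a rough illustration of how such attributes can be read, the sketch below builds a stand-in object in base R and queries it with attr(). The object is a mock and is not created by etl(). The attribute names pkg and dir and the subdirectory names "raw" and "load" are assumptions made for the example; only raw_dir and load_dir are named on this page.

```r
# Hypothetical stand-in for an etl object: an empty list carrying the
# kinds of attributes described above. A real etl object comes from etl().
base_dir <- file.path(tempdir(), "etl_demo")
mock <- structure(
  list(),
  pkg = "mtcars",                         # assumed attribute name
  dir = base_dir,                         # assumed attribute name
  raw_dir = file.path(base_dir, "raw"),   # subdirectory name is an assumption
  load_dir = file.path(base_dir, "load")  # subdirectory name is an assumption
)

# Attributes are read with attr()
attr(mock, "pkg")
attr(mock, "raw_dir")
attr(mock, "load_dir")
```

If a real etl object stores its directories the same way, the same attr() calls should apply to it.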
Just like any src_dbi object, an etl object is a data source backed by an SQL database. However, an etl object has additional functionality based on the presumption that the SQL database will be populated from data files stored on the local hard disk. The ETL functions documented in etl_create provide the necessary functionality for extracting data from the Internet to raw_dir, transforming those data and placing the cleaned up data (usually in CSV format) into load_dir, and finally loading the clean data into the SQL database.
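The three phases described above can be mimicked in base R to make the flow of files concrete. This is only a sketch and does not use the package's own functions: the built-in mtcars data frame stands in for a download, a data frame stands in for the SQL database, and the directory and column names (etl_sketch, raw, load, make_model) are hypothetical.

```r
# Hypothetical directory layout: one folder for raw files, one for cleaned files
base_dir <- file.path(tempdir(), "etl_sketch")
raw_dir  <- file.path(base_dir, "raw")
load_dir <- file.path(base_dir, "load")
dir.create(raw_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(load_dir, recursive = TRUE, showWarnings = FALSE)

# "Extract": write the built-in mtcars data as the raw file
# (a stand-in for downloading data from the Internet)
raw_file <- file.path(raw_dir, "mtcars.csv")
write.csv(mtcars, raw_file)

# "Transform": read the raw file, name the row-label column,
# and write the cleaned CSV into load_dir
raw <- read.csv(raw_file)
names(raw)[1] <- "make_model"
clean_file <- file.path(load_dir, "mtcars.csv")
write.csv(raw, clean_file, row.names = FALSE)

# "Load": read the cleaned CSV back into a data frame
# (a stand-in for writing it to an SQL table)
loaded <- read.csv(clean_file)
dim(loaded)
```

Keeping the raw and cleaned files in separate directories means the transform step can be rerun without repeating the extract step.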
For etl, an object of class etl_x (for example, etl_mtcars when x is "mtcars") and etl that inherits from src_dbi.
For is.etl, TRUE or FALSE, depending on whether object has class etl.
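The class structure described above can be sketched with a mock object in base R. The helper is_etl_sketch() is hypothetical and only illustrates one way a check like is.etl could be written with inherits(); it is not the package's implementation.

```r
# Mock object carrying the class vector described above (not a real etl object)
mock <- structure(list(), class = c("etl_mtcars", "etl", "src_dbi"))

# Hypothetical helper: TRUE if the object has class "etl"
is_etl_sketch <- function(object) inherits(object, "etl")

is_etl_sketch(mock)           # TRUE
is_etl_sketch("hello world")  # FALSE

# The mock also inherits from src_dbi
inherits(mock, "src_dbi")     # TRUE
```

Because the most specific class comes first in the class vector, S3 methods written for etl_mtcars would take precedence over those written for etl.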
See also: etl_create
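# Load etl; dplyr supplies %>%, tbl(), group_by() and summarize() used below
library(etl)
library(dplyr)

# Instantiate the etl object
cars <- etl("mtcars")
str(cars)
is.etl(cars)
summary(cars)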
## Not run:
# connect to a PostgreSQL server
if (require(RPostgreSQL)) {
  db <- src_postgres("mtcars", user = "postgres", host = "localhost")
  cars <- etl("mtcars", db)
}
## End(Not run)
# Do it step-by-step
cars %>%
  etl_extract() %>%
  etl_transform() %>%
  etl_load()
src_tbls(cars)
cars %>%
  tbl("mtcars") %>%
  group_by(cyl) %>%
  summarize(N = n(), mean_mpg = mean(mpg))

# Do it all in one step
cars2 <- etl("mtcars")
cars2 %>%
  etl_update()
src_tbls(cars2)
# generic summary function provides information about the object
cars <- etl("mtcars")
summary(cars)

# is.etl() tests whether an object has class etl
cars <- etl("mtcars")
# returns TRUE
is.etl(cars)
# returns FALSE
is.etl("hello world")

# create the database, then print the etl object
cars <- etl("mtcars") %>%
  etl_create()
cars