collect forces computation of a query and pulls the result down into a local data frame.
x | A
n | The number of rows to pull down
batch | Size of batches in which to fetch the data (passed to
quiet | Whether to print progress updates as the data is being fetched
... | Additional arguments passed to
By default, collect pulls down up to 100,000 rows of the result table. If exactly 100,000 rows come back, a warning message is printed, since the result may have been cut off at the limit. To pull down all of the rows, specify n = Inf.
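A minimal sketch of the row limit described above. It assumes remote_tbl is a query object that this package's collect accepts (how that object is created is not covered on this page). The helper names collect_all and collect_capped are hypothetical; only the n argument comes from this documentation.

```r
# Hypothetical helper: pull every row of the result into memory.
# `remote_tbl` is assumed to be a query object accepted by collect().
collect_all <- function(remote_tbl) {
  # n = Inf lifts the default 100,000-row limit
  collect(remote_tbl, n = Inf)
}

# Hypothetical helper: pull at most `max_rows` rows, e.g. for a quick look.
collect_capped <- function(remote_tbl, max_rows = 1000) {
  collect(remote_tbl, n = max_rows)
}
```

Because n = Inf removes the limit entirely, it is safest to use collect_all only on results you already know to be small.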
When using collect, keep in mind how large the data set you are pulling down is, in terms of both the number of rows and the number of columns.
collect works by calling hive_query, so if necessary you can specify the batch argument, as you would in hive_query, to avoid out-of-memory errors, and the quiet argument to control whether update messages are printed as the data is being pulled. In contrast to hive_query, the quiet parameter of collect is TRUE by default. Set quiet = FALSE if you want messages to be printed.
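As a rough illustration of the batch and quiet arguments, the sketch below wraps collect in a hypothetical helper named collect_in_batches. It assumes remote_tbl is a query object accepted by collect, and the batch size of 10000 is an arbitrary illustrative value, not a recommendation from this documentation.

```r
# Hypothetical helper: fetch a large result in smaller batches and
# print progress messages while the data is being pulled.
# `remote_tbl` is assumed to be a query object accepted by collect().
collect_in_batches <- function(remote_tbl, batch_size = 10000) {
  collect(
    remote_tbl,
    n = Inf,             # pull all rows
    batch = batch_size,  # smaller batches may help avoid out-of-memory errors
    quiet = FALSE        # collect is quiet by default; turn messages on
  )
}
```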
To the extent that you can leave the data in Hive, it is best to do so. collect should only be called once you have a data set that has been filtered, aggregated, and narrowed down to the columns you need, such as a modeling data set.
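The sketch below shows one way the advice above might look in practice: filter, narrow, and aggregate first, then call collect last. It assumes the remote table supports dplyr verbs, which this page does not confirm. The helper name collect_region_totals and the columns region, order_date, and amount are made up for illustration.

```r
library(dplyr)

# Hypothetical helper: reduce the data before pulling it down.
# Assumes `remote_tbl` supports dplyr verbs; column names are made up.
collect_region_totals <- function(remote_tbl) {
  remote_tbl %>%
    filter(order_date >= "2020-01-01") %>%        # keep only the rows you need
    select(region, amount) %>%                    # keep only the columns you need
    group_by(region) %>%
    summarise(total_amount = sum(amount, na.rm = TRUE)) %>%  # aggregate
    collect()                                     # pull down the small result
}
```

With this ordering, only one row per region is pulled into local memory rather than the full table.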
collect returns a tibble containing the result.