This is the "old" Hive querying function. It is thoroughly deprecated, pending Andrew moving the Hive server onto a dedicated and more powerful machine.
query | A query, or the location of a .hql file containing a query.
file | A file name. If provided, the results of the query are written straight to that file and a boolean TRUE is returned. If not provided (it is NULL by default), the results of the query are returned as a data.frame.
dt | Whether to return the results as a data.table.
... | Other arguments to pass to read.delim.
the deprecated hive querying function
a data.frame containing the results of the query, or a boolean TRUE if the user has chosen to write straight to file.
hive_query works by running the query you provide through the CLI via a system() call. As a result, single escapes for meaningful characters (such as quotes) within the query will not work: R will interpret them only as escaping that character within R. Double escaping (\\) is thus necessary, in the same way that it is for regular expressions.
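As a rough illustration of the escaping point, the sketch below builds a query string in base R and inspects what actually leaves R. The uri_host column and the pattern are assumptions made up for the example, and it does not cover any further escaping the shell or Hive may need; the call to hive_query is left commented out because it requires a working Hive setup.

```r
# "\\." in R source is two characters: a backslash followed by a dot.
# A single "\." would be rejected by the R parser as an unrecognised escape.
pattern <- "wikipedia\\.org"

# Table and column names here are illustrative assumptions.
query <- paste0(
  "SELECT dt FROM webrequests ",
  "WHERE uri_host RLIKE '", pattern, "' ",
  "LIMIT 10;"
)

# cat() shows the string as it is handed on, with one literal backslash.
cat(query, "\n")

# The doubled backslash in the source collapsed to a single character.
stopifnot(nchar("\\.") == 2)

# results <- hive_query(query)  # needs the package and access to the cluster
```

Printing with cat() rather than print() is the quickest way to check a query, since print() re-escapes the backslash and hides what the CLI will receive.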
The webrequests table is documented on Wikitech, which also provides a set of example queries.
When it comes to manipulating the rows with Java before they get to you, Nuria has written a brief tutorial on loading UDFs which should help if you want to engage in that; the example provided is a user agent parser, allowing you to get the equivalent of ua_parse's output further upstream.
log_strptime for converting the "dt" column in the webrequests table to POSIXlt, and mysql_query and global_query for querying our MySQL databases.