query_hive: Query Hadoop cluster with Hive
In wikimedia/wikimedia-discovery-wmf: R Tools For Wikimedia Foundation's Analysts And Data Scientists

Description Usage Arguments Value Escaping Handling our hadoop/hive setup See Also Examples

Queries Hive

query_hive(
  query,
  heap_size = 1024,
  use_nice = TRUE,
  use_ionice = TRUE,
  use_beeline = FALSE,
  debug = FALSE
)

`query`	a Hive query
`heap_size`	Value for `HADOOP_HEAPSIZE`. The default is 1024 (alternatives: 2048 or 4096).
`use_nice`	Whether to use `nice` for less greedy CPU usage in a multi-user environment. The default is `TRUE`.
`use_ionice`	Whether to use `ionice` for less greedy I/O in a multi-user environment. The default is `TRUE`.
`use_beeline`	Whether to use `beeline` to connect with Hive instead of `hive`. The default is `FALSE`.
`debug`	Whether to print the query and any messages/info which could be useful for debugging. The default is `FALSE`.

A data.frame containing the results of the query, or `TRUE` if the user has chosen to write straight to file.

query_hive works by running the query you provide through the CLI via a system() call. As a result, single escapes for meaningful characters (such as quotes) within the query will not work: R will interpret them only as escaping that character within R. Double escaping (\\) is thus necessary, in the same way that it is for regular expressions.
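As a minimal sketch of the double escaping described above, the snippet below only builds and inspects a query string in R; it does not contact Hive. The query text and the `uri_path` pattern are illustrative assumptions, not taken from the package, and whether the shell or Hive needs further escaping for a given query is not covered here.

```r
# Illustrative query string; the column and pattern are assumptions for this sketch.
# In R source code, "\\" denotes ONE literal backslash in the resulting string.
query <- "SELECT uri_path FROM wmf.webrequest WHERE uri_path RLIKE '\\.php$' LIMIT 10;"

# cat() prints the string as R holds it, i.e. with a single backslash before ".php".
cat(query, "\n")

# Writing "\." with a single backslash would not even parse in R
# ("unrecognized escape"), which is why the backslash itself must be escaped.
single_backslash <- "\\"
nchar(single_backslash)  # one character
```

Printing the string with cat() before passing it on is a cheap way to check what will actually leave R.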

The webrequest table is documented on Wikitech, which also provides a set of example queries. If you want to manipulate the rows with Java before they reach you, Nuria has written a brief tutorial on loading UDFs which should help.

See lubridate::ymd_hms() for converting the "dt" column in the webrequest table to a proper datetime, and mysql_read() and global_query() for querying our MySQL databases.
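A small sketch of the "dt" conversion, assuming the lubridate package is installed and that "dt" arrives as ISO 8601 strings. The data frame below is made-up sample data standing in for a query result, not real output:

```r
library(lubridate)

# Hypothetical stand-in for a query result; the values are invented for illustration.
results <- data.frame(
  dt = c("2021-02-07T00:19:00Z", "2021-02-07T13:45:30Z"),
  stringsAsFactors = FALSE
)

# ymd_hms() parses the character column into POSIXct date-times (UTC by default).
results$dt <- ymd_hms(results$dt)

# Once parsed, date-time helpers such as hour() work on the column.
class(results$dt)
hour(results$dt)
```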

## Not run: 
query_hive("USE wmf; DESCRIBE webrequest;")

## End(Not run)

wikimedia/wikimedia-discovery-wmf documentation built on Feb. 7, 2021, 12:19 a.m.
