mysql: Work with MySQL databases

Description

Read from, write to, and check for the existence of tables in the MySQL databases in the Wikimedia cluster. Assumes that a validly formatted configuration file is present.
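
As a rough illustration of what such a configuration file might look like, the fragment below follows the standard MySQL option-file layout. This is an assumption: the page does not specify the format, and the user name and password are placeholders.

```ini
# Hypothetical credentials file; all values are placeholders
[client]
user = example_user
password = example_password
```

A file along these lines would be passed by path through the default_file argument of mysql_connect().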

Usage

mysql_connect(
  database,
  use_x1 = FALSE,
  default_file = NULL,
  hostname = NULL,
  port = NULL
)

mysql_read(query, database = NULL, use_x1 = NULL, con = NULL)

mysql_exists(database, table_name, use_x1 = NULL, con = NULL)

mysql_write(x, database, table_name, use_x1 = NULL, con = NULL, ...)

mysql_close(con)

mysql_disconnect(con)

Arguments

database

name of the database to query; optional if passing a con

use_x1

logical flag; use if querying an extension-related table that is hosted on x1 (e.g. echo_* tables); default FALSE

default_file

name of a config file containing username and password to use when connecting

hostname

name of the machine to connect to. The default depends on the database being queried: for the log database, the connection goes to "db1108.eqiad.wmnet"; for a MediaWiki ("content") database, connection_details() returns the appropriate shard host name and port based on the stored mapping (call update_shardmap() beforehand to make sure the latest mapping is used)

query

SQL query

con

MySQL connection returned by mysql_connect(); optional, since a temporary connection is opened if none is provided

table_name

name of the table to check for (mysql_exists()) or to write to (mysql_write())

x

a data.frame to write

...

additional arguments to pass to dbWriteTable
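
The examples further down only cover reading, so here is a minimal sketch of how mysql_exists() and mysql_write() might be combined. It assumes that mysql_exists() returns a single logical value and that the target database accepts writes; the database name "staging", the table name, and the data frame are hypothetical. The append argument is a standard dbWriteTable argument forwarded through `...`.

```r
## Not run: 
# Hypothetical data frame, database, and table names, for illustration only
daily_counts <- data.frame(
  date = as.Date("2021-02-01") + 0:2,
  events = c(10L, 12L, 9L)
)

# One connection shared by the existence check and the write
con <- mysql_connect("staging")

# Assumption: mysql_exists() returns TRUE when the table is present
if (mysql_exists("staging", "daily_counts", con = con)) {
  # append = TRUE is forwarded through ... to dbWriteTable
  mysql_write(daily_counts, "staging", "daily_counts", con = con, append = TRUE)
} else {
  mysql_write(daily_counts, "staging", "daily_counts", con = con)
}

mysql_disconnect(con)

## End(Not run)
```

Passing con to both calls means only one connection is opened for the check and the write.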

See Also

query_hive() or global_query()

Examples

## Not run: 
# Connection details (which shard to connect to) are fetched automatically:
mysql_read("SELECT * FROM image LIMIT 100", "commonswiki")
mysql_read("SELECT * FROM wbc_entity_usage LIMIT 100", "wikidatawiki")

# Echo extension tables are on the x1 host:
mysql_read("SELECT *
  FROM echo_event
  LEFT JOIN echo_notification
    ON echo_event.event_id = echo_notification.notification_event
  LIMIT 10;",
"enwiki", use_x1 = TRUE)

# When querying multiple databases on the same shard,
# a shared connection may be used:
con <- mysql_connect("frwiki")
results <- purrr::map(
  c("frwiki", "jawiki"),
  mysql_read,
  query = "SELECT...",
  con = con
)
mysql_disconnect(con)

## End(Not run)

wikimedia/wikimedia-discovery-wmf documentation built on Feb. 7, 2021, 12:19 a.m.