sampled_logs
reads in and parses data from the 1:1000 sampled RequestLogs
on stat1002.
file | either the full name of a sampled log file, or the year/month/day of the log file you want, provided as YYYYMMDD |
It does what it says on the tin: pass in a date (formatted as '20140601' or equivalent) and it will retrieve the sampled RequestLogs for that day. One caveat worth noting is that the daily dumps are not cut off at precisely the stroke of midnight; for the example date, you can expect to see some logs from 20140602 and to be missing some from the 1st, which will be in the 20140531 file. Slight fuzziness around date ranges may be necessary to get all the traffic you want.
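One way to allow for that fuzziness is to read the files either side of the day you want and filter afterwards. The following is a minimal sketch rather than part of the package: neighbouring_days is a hypothetical helper, and reading one day either side is an assumed safety margin. The commented-out call assumes you are on stat1002 with sampled_logs and data.table available.

```r
# Hypothetical helper: build the YYYYMMDD strings for a target day
# plus the day before and the day after.
neighbouring_days <- function(day) {
  target <- as.Date(day, format = "%Y%m%d")
  format(target + c(-1, 0, 1), "%Y%m%d")
}

days <- neighbouring_days("20140601")

# On stat1002 (assumption: sampled_logs and data.table are loaded),
# the three daily files could then be read and stacked with:
# logs <- data.table::rbindlist(lapply(days, sampled_logs))
# and afterwards restricted to the day of interest via the timestamp column.
```

Padding by a full day on each side reads more data than strictly necessary, but it avoids having to guess exactly where each dump was cut.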
It does not return all the fields from the log file, merely the most useful ones - namely timestamp, ip_address, status_code, url, mime_type, referer, x_forwarded, user_agent, lang and x_analytics.
a data.table containing the useful columns from the sampled logs of the day you asked for.
Oliver Keyes <okeyes@wikimedia.org>
log_strptime for handling the log timestamp format, parse_uuids for parsing out app UUIDs from URLs, log_sieve for filtering the sampled logs to "pageviews", and hive_query for querying the unsampled RequestLogs.