parse_logs: Parse Log Files


View source: R/templates.R

Description

Parse a log file using a provided template and a set of classes.

Usage

parse_logs(text, template, classes = list(), ...)

parse_logs_file(text_file, config_file, formatters = list(), ...)

Arguments

text

Character vector; each element a log record

template

Template string

classes

A named list of parsers or regex strings for use within the template string

...

Other arguments passed on to regexpr for matching regular expressions.

text_file

Filename (or readable connection) containing log text

config_file

Filename (or readable connection) containing template file

formatters

Named list of formatter functions used to format classes

Details

template should only be a template string, such as 'ip ip_address [date access_date]...'.

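config_file should be a YAML file or connection with the following fields: template (the template string) and classes (a named list of classes), as written out in the Examples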

text should be a character vector, with each element representing a log record

text_file should be a file or connection that can be split (with readLines) into a character vector of records

classes should be a named list of parser objects, where names match names of classes in template string, or a similarly named list of regex strings for coercing into parsers
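To make the relationship between the template string and classes concrete, here is a minimal base R sketch of how a template could be turned into a single regular expression with one capture group per field. It is an illustration only and not tabulog's implementation: template_to_regex and the three regex strings are hypothetical, and the sketch assumes that class regexes contain no capturing groups of their own.

```r
# Hypothetical helper, not part of tabulog: build one regex from a template.
template_to_regex <- function(template, classes) {
  placeholder <- "\\{\\{\\s*(\\w+)\\s+(\\w+)\\s*\\}\\}"
  m <- gregexpr(placeholder, template, perl = TRUE)
  tokens <- regmatches(template, m)[[1]]                   # the {{class name}} parts
  literals <- regmatches(template, m, invert = TRUE)[[1]]  # the text between them

  # Split every placeholder into its class name and its field name
  parts <- regmatches(tokens, regexec(placeholder, tokens, perl = TRUE))
  class_names <- vapply(parts, `[`, character(1), 2)
  field_names <- vapply(parts, `[`, character(1), 3)
  stopifnot(all(class_names %in% names(classes)))

  # Quote literal text with PCRE \Q...\E so characters like '[' match themselves
  quoted <- ifelse(nzchar(literals), paste0("\\Q", literals, "\\E"), "")
  groups <- paste0("(", unlist(classes[class_names]), ")")
  pattern <- paste0("^", quoted[1], paste0(groups, quoted[-1], collapse = ""), "$")
  list(pattern = pattern, fields = field_names)
}

# Assumed regex strings for illustration; tabulog's default classes may differ
classes <- list(
  ip = "[0-9]{1,3}(?:\\.[0-9]{1,3}){3}",
  date = "[^]]+",
  int = "[0-9]+"
)
template <- "{{ip ipAddress}} - [{{date accessDate}}] {{int status}}"
logs <- c(
  "192.168.1.10 - [26/Jul/2019:11:41:10 -0500] 200",
  "192.168.1.11 - [26/Jul/2019:11:41:21 -0500] 404"
)

compiled <- template_to_regex(template, classes)
matches <- regmatches(logs, regexec(compiled$pattern, logs, perl = TRUE))

# Drop the full match, keep the captured fields; unmatched records become NA
n_fields <- length(compiled$fields)
rows <- lapply(matches, function(x) {
  if (length(x)) x[-1] else rep(NA_character_, n_fields)
})
parsed <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
names(parsed) <- compiled$fields
print(parsed)
```

Every column is still character at this point; converting each field to a proper type is the job of the formatters.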

formatters should be a named list of functions, where names match names of classes in template string, for properly formatting fields once they have been captured
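The role of formatters can be sketched in base R as well. The snippet below assumes the fields have already been captured as character columns and applies one function per class name; captured and field_classes are hypothetical stand-ins for whatever tabulog tracks internally, so read it as a picture of the idea rather than the package's code.

```r
# Captured fields as raw strings (hypothetical intermediate result)
captured <- data.frame(
  accessDate = c("26/Jul/2019:11:41:10 -0500", "26/Jul/2019:11:41:21 -0500"),
  status = c("200", "404"),
  stringsAsFactors = FALSE
)

# Which class each field was declared with in the template (assumed mapping)
field_classes <- c(accessDate = "date", status = "int")

# Formatters are named by class, not by field
formatters <- list(
  date = function(x) as.POSIXct(x, format = "%d/%b/%Y:%H:%M:%S %z", tz = "UTC"),
  int = as.integer
)

# Apply the formatter of each field's class; fields without one stay character
for (field in names(captured)) {
  f <- formatters[[field_classes[[field]]]]
  if (!is.null(f)) captured[[field]] <- f(captured[[field]])
}
str(captured)
```

Keying formatters by class rather than by field means a single date formatter covers every field declared with the date class.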

Value

A data.frame with one column for each field identified in the template string. For each record in the passed text, the fields are extracted and formatted using the parser objects in default_classes() and classes.

Examples

library(tabulog)

# Template string with three fields
template <- '{{ip ipAddress}} - [{{date accessDate}}] {{int status }}'

# Two simple log records
logs <- c(
  '192.168.1.10 - [26/Jul/2019:11:41:10 -0500] 200',
  '192.168.1.11 - [26/Jul/2019:11:41:21 -0500] 404'
)

# A formatter for the date field
myFormatters <- list(date = function(x) lubridate::as_datetime(x, format = '%d/%b/%Y:%H:%M:%S %z'))
# A parser class for the date field
date_parser <- parser(
  '[0-3][0-9]\\/[A-Z][a-z]{2}\\/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2}[ ][\\+|\\-][0-9]{4}',
  myFormatters$date,
  'date'
)

# Parse the logs from raw data
parse_logs(logs, template, list(date=date_parser))

# Write the logs and the template to files, then parse
logfile <- tempfile()
templatefile <- tempfile()
writeLines(logs, logfile)
yaml::write_yaml(list(template=template, classes=list(date=date_parser)), templatefile)
parse_logs_file(logfile, templatefile, myFormatters)
file.remove(logfile)
file.remove(templatefile)

tabulog documentation built on Aug. 9, 2019, 5:07 p.m.