In kevinushey/rex: Friendly Regular Expressions

Parsing server log files is a common task in server administration. 1,2 Historically, R has not been well suited to this task, which was better handled by a scripting language such as Perl. Rex, however, makes it easy and lets you perform both the data cleaning and the analysis in R!

Common server logs consist of space-separated fields, with the timestamp enclosed in square brackets and the request line enclosed in double quotes.

198.214.42.14 - - [21/Jul/1995:14:31:46 -0400] "GET /images/ HTTP/1.0" 200 17688

lahal.ksc.nasa.gov - - [24/Jul/1995:12:42:40 -0400] "GET /images/USA-logosmall.gif HTTP/1.0" 200 234
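The field layout can be made concrete with a minimal base R sketch that splits one of the sample lines into its parts. The field names (`host`, `ident`, `user`, and so on) follow the usual Common Log Format labels and are assumptions of this example; they are not names used by rex or by this vignette.

```r
# One of the sample lines shown above.
line <- '198.214.42.14 - - [21/Jul/1995:14:31:46 -0400] "GET /images/ HTTP/1.0" 200 17688'

# Seven fields: three plain tokens, a bracketed timestamp, a quoted request,
# then the status code and the response size in bytes.
pattern <- '^(\\S+) (\\S+) (\\S+) \\[([^\\]]*)\\] "([^"]*)" (\\S+) (\\S+)$'

# regexec() returns the full match followed by each capture group.
m <- regmatches(line, regexec(pattern, line, perl = TRUE))[[1]]

# Drop the full match and label the groups (labels are assumptions of this sketch).
fields <- setNames(m[-1], c("host", "ident", "user", "time", "request", "status", "bytes"))
print(fields)
```

Only the timestamp and the request fields matter for the analysis in this vignette, which is why the rex pattern used later anchors on the brackets and the opening quote rather than matching the whole line.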

The logs used in this vignette come from two months of all HTTP requests to the NASA Kennedy Space Center WWW server in Florida and are freely available for use. 3

library(rex)
library(dplyr)
library(knitr)
library(ggplot2)
library(magrittr)

parsed <- scan("NASA.txt", what = "character", sep = "\n") %>%
  re_matches(
    rex(

      # Get the time of the request
      "[",
        capture(name = "time",
          except_any_of("]")
        ),
      "]",

      space, double_quote, "GET", space,

      # Get the filetype of the request if requesting a file
      maybe(
        non_spaces, ".",
        capture(name = "filetype",
          except_some_of(space, ".", "?", double_quote)
        )
      )
    )
  ) %>%
  mutate(filetype = tolower(filetype),
         time = as.POSIXct(time, format="%d/%b/%Y:%H:%M:%S %z"))

This gives us a nicely formatted data frame of the time and filetypes of the requests.

kable(head(parsed, n = 10))
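To see what the pattern captures without downloading the full log, the same rex expression can be applied to the two sample lines quoted earlier. This is a small sketch rather than part of the original analysis; the variable names `sample_lines`, `pattern`, and `sample_parsed` are introduced here for illustration, and the time conversion assumes an English or C locale so that `%b` recognises `Jul`.

```r
library(rex)

# The two sample log lines shown earlier in this vignette.
sample_lines <- c(
  '198.214.42.14 - - [21/Jul/1995:14:31:46 -0400] "GET /images/ HTTP/1.0" 200 17688',
  'lahal.ksc.nasa.gov - - [24/Jul/1995:12:42:40 -0400] "GET /images/USA-logosmall.gif HTTP/1.0" 200 234'
)

# Same pattern as above: the bracketed timestamp, then an optional file extension.
pattern <- rex(
  "[", capture(name = "time", except_any_of("]")), "]",
  space, double_quote, "GET", space,
  maybe(
    non_spaces, ".",
    capture(name = "filetype", except_some_of(space, ".", "?", double_quote))
  )
)

# re_matches() returns a data frame with one column per named capture.
sample_parsed <- re_matches(sample_lines, pattern)
print(sample_parsed)

# Convert the raw timestamp text to POSIXct; %z reads the "-0400" offset.
# Assumes a locale with English month abbreviations.
sample_times <- as.POSIXct(sample_parsed$time,
                           format = "%d/%b/%Y:%H:%M:%S %z", tz = "UTC")
print(sample_times)
```

Only the second line requests a file with an extension, so it is the only row where `filetype` holds a value such as `gif`. Rows without a usable value are why the plots below wrap the data in `na.omit()`.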

We can also easily generate a histogram of the filetypes, or a plot of requests over time.

ggplot(na.omit(parsed)) + stat_count(aes(x=filetype))
ggplot(na.omit(parsed)) + geom_histogram(aes(x=time)) + ggtitle("Requests over time")

kevinushey/rex documentation built on March 14, 2024, 4:17 a.m.
