WebAnalytics-package: Tools for web server log performance reporting


Tools for web server log performance reporting

Description

The WebAnalytics package is a simple, low-impact way of getting detailed insights into the performance of a web application and of identifying opportunities for remediation. It generates detailed analytical reports on application response time from web server logs.

The objective of the package is to extract the maximum value from web server log data and to use that information to identify problems and potential areas for remediation. It enables you to read web server log files easily; to generate histograms, scatter plots and tabular reports of response times, both overall and per URL; to generate diagnostic plots; and to generate a LaTeX document that can then be formatted as a PDF. The package supplies scripts and templates for that document generation.

Details

Package: WebAnalytics
Type: Package
Date: 2023-10-04
License: GPL 3

This code was used for many years in a performance consulting and troubleshooting context, dealing with systems that were not set up with comprehensive monitoring infrastructure. Sometimes the systems did have monitoring infrastructure but it did not generate useful measures (percentiles are difficult to calculate on streamed data, and sometimes too little data is retained for longitudinal analysis), or it rolled performance metrics up into mean values over long intervals, destroying the short-term information in the logs. For some systems the diagnostic plots were interesting by themselves.

It is not a debugging tool; it indicates where problems are and where there are unexpected behaviours: the tables and histograms identify multiple code paths that developers may not be aware of, the diagnostic plots indicate contention, and the scatter plots indicate short-term variations in response time that suggest some kind of problem. All of these enable potential fixes to be worked on and, once those fixes are developed, enable direct measurement of their impact using the baselining graphs and tables.

A sample PDF report can be generated in the current directory, with work files saved under the R tempdir, using the following code fragment:

library(WebAnalytics)
# keep intermediate work files under the R tempdir
filesDir = paste0(tempdir(), "/ex")
configVariableSet("config.workdir", filesDir)
# copy the report template and supporting files into the current directory
workingDirectoryPopulate(".")
# generate the sample PDF report
pdfGenerate()

The generated report provides the following:

Response Time Overview

  • Detailed Response Time Percentiles

  • Response Time Change over baseline workload (if a baseline log is supplied and the baseline is read)

  • Request/Response Size Percentile Breakdown

  • Response Times by Time - Scatter Plot

  • Response Time Histogram

  • Request Status by Hour

  • Top Transactions by 95th percentile response time

  • Top Transactions by aggregate response time

  • Top Transactions by error rate

This section addresses questions such as

  • How many static, dynamic and monitoring requests are there in the logs?

  • How much of total system processing time is accounted for by static, dynamic and monitoring requests?

  • How much static, dynamic and monitoring data transfer is there?

  • How many requests per hour are made and in what hours?

  • What are the transactions with the highest 95th percentile response times?

  • What are the transactions that account for the most aggregate wait time in the system?

The 95th percentile and aggregate wait time tables are useful for identifying those transactions that could repay some performance optimisation. Anything high in both lists is worth investigating.

Transaction Data for each URL

  • response time percentiles

  • response time scatter plot by time of day

  • response time histogram

  • error rate by hour

  • and variances over a baseline dataset (useful for comparing before and after release performance)

This addresses questions such as

  • What is the clock time distribution of requests and response times for a URL?

  • How many distinct groups of response times are there for a URL?

  • How have these metrics changed relative to a baseline set of log data?

Browser Mix Percentages

  • Browser family percentages

  • Browser family and version percentages

These percentages are useful for identifying which browsers and versions need to be tested.

Diagnostic Charts

  • 95th percentile response time by request rate

  • Dynamic Content Response time by degree of request concurrency

  • Static Content Redirect time by degree of request concurrency

  • Static Content (successful requests) time by request concurrency

  • Static Content (successful requests) time by outbound data rate

These plots mostly address the scalability of the system.

Percentile Comparison of transaction mix with baseline reporting period

  • Input Data stats

  • Transaction Counts and percentages by URL

  • Transaction Waits and percentages by URL

These are primarily used for calibrating test workloads, to ensure that the transaction mix is similar to the production workload or the planned workload.

Server and Session Analysis

  • Server Request Counts

  • Session Request Counts

  • Unique Sessions by Hour

A function workingDirectoryPopulate is provided to populate a working directory with all needed supporting files and a sample R report file which can be edited as needed. The working directory contains:

  • sampleRfile.R - sample report template

  • sample.config - configuration file for the report

  • logo.eps - a 2cm by 2cm logo graphic (a placeholder) in EPS format

  • makerpt.ps1 - PowerShell script to run the report and process the output with xelatex

  • makerpt.sh - bash script to run the report and process the output with xelatex

  • WebAnalytics.cls - the report LaTeX class

The copies placed by workingDirectoryPopulate are already configured to generate a sample PDF report from the test data supplied with the package.

The supplied configuration file sample.config, read by the report script, provides enough flexibility for most purposes. Switches are provided to turn different sections of the report on or off. Edit sample.config to update the list of column names and data types (documented in logFileRead), or use the IIS log utility function logFileFieldsGetIIS. The assumed directory structure is a data directory, identified by config.current.dataDir, with multiple log directories under it (config.current.dirNames). This applies to both the current data and the baseline logs. The default behaviour of the script is to read the lexically last file name with a .log extension from each log directory, checking that the log names are the same in each directory. This is consistent with a structure in which logs are regularly copied into a log directory for processing, or where some pre-processing is required, for example where the log is being written with a varying number of fields as a result of some other configuration change by network or admin teams. Additional functions are provided to select all or some files: logFileNamesGet, logFileNamesGetAll, logFileNamesGetLast, and logFileNamesGetLastMatching; these can be substituted in the report template as needed.
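As a hedged illustration of file selection (the data directory path and directory names here are hypothetical, and the argument names mirror the logFileNamesGetLast call in the Examples section; check each function's help page for its exact signature):

library(WebAnalytics)
# read the lexically last *.log file from each log directory
lastLogs = logFileNamesGetLast(dataDirectory="/data/weblogs",
  directoryNames=c("server1", "server2"),
  fileNamePattern="*[.]log")
# read the last file whose name matches a more specific pattern
matchingLogs = logFileNamesGetLastMatching(dataDirectory="/data/weblogs",
  directoryNames=c("server1", "server2"),
  fileNamePattern="access.*[.]log")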

There are multiple ways to run xelatex on the generated template. A bash script and a PowerShell script are provided to do that if you already have LaTeX installed. Run the sample script and config file that are created in that directory using the command . ./makerpt.sh sample or powershell -f makerpt.ps1 sample to generate a sample PDF from the test data supplied as part of the package. If you do not have a LaTeX installation, the R package tinytex can be used to install LaTeX, and the function pdfGenerate is provided in this package to do the PDF generation from within R.
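A minimal sketch of that route, assuming no existing TeX installation and that network access is available for the one-time tinytex setup:

# one-time setup: install the tinytex package and a small TeX distribution
install.packages("tinytex")
tinytex::install_tinytex()

# then generate the PDF from within R
library(WebAnalytics)
pdfGenerate()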

The package uses the CRAN package brew to produce the LaTeX source from a Brew template and comes with its own LaTeX document class and a blank logo graphic, both of which can be tailored as needed.

The generated LaTeX document has been tested with xelatex and is known not to work with plain LaTeX because of font issues.

The package requires Apache or IIS log files to contain elapsed times in addition to timestamps, HTTP verbs, HTTP response codes and URLs. In Apache the elapsed time is provided by the %D format specifier (microseconds) or %T (seconds) in a log format specification string. In IIS the time-taken field must be added to the log format. If supplied, the request and response sizes are also used by the report. For WebSphere applications, adding the JSESSIONID cookie to the log enables server-level session statistics (the server ID is parsed out of the WebSphere JSESSIONID cookie value; if the JSESSIONID cookie is not of the format serverID:sessionID, the server distribution will be represented as a single server). To get session-level information without the cookie being present, it might be possible to use the client IP address (depending on the structure of the network), in which case adding

b$jsessionid = b$userip
b$serverid = 1

to the config.fix.data function in the sample configuration file will provide some useful information.
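For the Apache elapsed-time requirement above, a minimal LogFormat sketch (the format name and log path are illustrative, not prescribed by the package):

# httpd.conf: combined log format plus elapsed time in microseconds (%D)
LogFormat "%h %l %u %t \"%r\" %>s %b %D" combined_elapsed
CustomLog "logs/access.log" combined_elapsed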

The config.fix.data function is used to classify URLs as dynamic (the URL is retained), static, or monitoring. The report script depends on the literal values used for this classification, and the function must use those literals to identify static and monitoring requests.
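A purely hypothetical sketch of such a function (the column name url and the URL patterns are assumptions; the literal values must match those expected by the supplied report script):

config.fix.data = function(b) {
  # fixed assets: replace the URL with the literal used by the report script
  b$url[grepl("[.](css|js|png|gif|jpg|ico)$", b$url)] = "static"
  # health-check endpoints (the path here is an assumption)
  b$url[grepl("^/healthcheck", b$url)] = "monitoring"
  # everything else is treated as dynamic and its URL is retained
  b
}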

Author(s)

Maintainer: Greg Hunt <greg@firmansyah.com>

Examples


## Not run: 
# directory containing the log data; the sample data shipped with the
# package is assumed to be in extdata (adjust to your own log directory)
datd = system.file("extdata", package = "WebAnalytics")
# find the lexically last *.log file in the directory 
logFileName = logFileNamesGetLast(dataDirectory=datd, 
  directoryNames=c(".", "."), 
  fileNamePattern="*[.]log")[[1]]

# get the columns from an IIS log
cols = logFileFieldsGetIIS(logFileName)

# read the log file as the current data 
logdf = logFileListRead(logFileName, 
          readFunction=logFileRead, 
          columnList=cols)
          
# read a baseline data set (the same file is re-read here for illustration)
logbasedf = logFileListRead(logFileName, 
          readFunction=logFileRead, 
          columnList=cols)
  
# compare percentage counts and delays between 
#   baseline and current, useful for load test calibration 
plotWriteFilenameToLaTexFile(
  plotSaveGG(
    # convert elapsed time to seconds
    percentileBaselinePrint(logdf$elapsed/1000, 
              logbasedf$elapsed/1000,    
              columnNames = c("Delta", "Current", "Baseline", "Percentile"))
    , "xxx")
    )

## End(Not run)

