r7snr
: Tools to work with Rapid7 scans.io Sonar Data
The following functions are implemented:
snr_parse_response
: Parse Sonar HTTP study encoded data
response.devtools::install_github("hrbrmstr/r7snr")
options(width=120)
This R package will let you work directly with the gzip'd JSON from Rapid7 scans.io Sonar HTTP studies.
Be warned that these are HUGE files and it's very likely this won't fit into memory on any system.
library(r7snr) # current verison packageVersion("r7snr") library(r7snr) library(jsonlite) library(purrr) library(dplyr) library(purrr)
For this example, we'll grab the first 10 records from the 2016-07-05 HTTP study. It may report a message of "gzcat: error writing to output: Broken pipe" that we can ignore
system("curl --silent 'https://scans.io/data/rapid7/sonar.http/20160705-http.gz' | gzcat | head -100", intern=TRUE) %>% map_df(fromJSON) -> http_scan_records
We can take a look a the scan records to find that we have the:
glimpse(http_scan_records)
We can examine one of them:
str(snr_parse_response(http_scan_records$data[10]))
Or, we can turn them all into a data frame:
map_df(1:nrow(http_scan_records), function(i) { x <- http_scan_records[i,] resp <- snr_parse_response(x$data)[[1]] data_frame( vhost=x$vhost, host=x$host, port=x$port, ip=x$ip, status=resp$status, version=resp$version, body=resp$body, headers=list(resp$headers, stringsAsFactors=FALSE) ) }) -> parsed_records
Now we have a fairly usable data frame.
glimpse(parsed_records)
We can now see the server types for what we've read in. Since that is not a
required header, it may not be there so we have to handle NULL
values.
map(parsed_records$headers, "server") %>% map_chr(function(x) ifelse(length(x)>0, x, NA)) %>% table(exclude=FALSE) %>% as.data.frame(stringsAsFactors=FALSE) %>% setNames(c("type", "count")) %>% arrange(desc(count))
If you're going to use this to process the Sonar HTTP studies, it's suggested you will want to
use the jqr
package to filter the downloaded gzip'd JSON file (which does mean learning the jq
filter syntax).
An alternative to using this R package is to follow the example on the Sonar Wiki and generate a HUGE JSON file from the results (NOTE: this will be over 1TB when some of our newer scans start to be posted). Then, use either jqr
or jq
on the command-line to extract fields you want/need and then process the resultant JSON in R.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.