urlscan

Analyze Websites and Resources They Request

Description

The <urlscan.io> service provides an ‘API’ for analyzing websites and the resources they request. Much like the ‘Inspector’ in your browser, <urlscan.io> lets you look at the individual resources that are requested when a site is loaded. Tools are provided to search public <urlscan.io> scan submissions/results and to submit URLs for scanning.
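
To make the search side of the service more concrete, here is a minimal sketch of how a search query might be turned into a request URL for the public urlscan.io search endpoint. The helper `build_search_url()` is hypothetical and is not part of this package; it only builds the URL and makes no network call. How the package's own functions construct their requests internally is not shown here.

```r
# Hypothetical helper (NOT part of the urlscan package): build a request URL
# for the public urlscan.io search endpoint without touching the network.
build_search_url <- function(query) {
  stopifnot(is.character(query), length(query) == 1L, nzchar(query))
  base <- "https://urlscan.io/api/v1/search/"
  # Percent-encode reserved characters such as ":" so the whole query
  # travels as a single `q` parameter.
  q <- utils::URLencode(query, reserved = TRUE)
  paste0(base, "?q=", q)
}

url <- build_search_url("domain:r-project.org")
print(url)
```

The resulting URL could be fetched with any HTTP client; the package functions shown under Usage wrap that step for you.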

What’s Inside The Tin

The following functions are implemented:

Installation

devtools::install_git("https://git.sr.ht/~hrbrmstr/urlscan")
# or
devtools::install_gitlab("hrbrmstr/urlscan")
# or
devtools::install_github("hrbrmstr/urlscan")

Usage

library(urlscan)
library(tidyverse) # for demos

# current version
packageVersion("urlscan")
## [1] '0.2.0'
x <- urlscan_search("domain:r-project.org")

as_tibble(x$results$task) %>% 
  bind_cols(as_tibble(x$results$page)) %>% 
  mutate(
    time = anytime::anytime(time),
    id = x$results$`_id`
  ) %>%
  arrange(desc(time)) %>% 
  select(url, country, server, ip, id) -> xdf

ures <- urlscan_result(xdf$id[2], include_dom = TRUE, include_shot = TRUE)

ures
##             URL: https://cran.r-project.org/
##         Scan ID: cdc2b957-548c-447a-a1b2-bebd6a734aec
##       Malicious: FALSE
##      Ad Blocked: FALSE
##     Total Links: 0
## Secure Requests: 9
##    Secure Req %: 100%
magick::image_write(ures$screenshot, "img/shot.png")
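
The search-result reshaping above depends on a live API call. As a rough illustration of what that pipeline does, the sketch below applies the same steps (bind the `task` and `page` columns, parse `time`, attach `_id`, sort newest first, keep five columns) to a small mock object using base R only. The mock values and the simplified timestamp format are assumptions made for this example and are not real scan results.

```r
# Mock stand-in for the object returned by urlscan_search(); all values
# below are invented for illustration only.
x <- list(
  results = list(
    task = data.frame(
      url  = c("https://example.org/a", "https://example.org/b"),
      time = c("2019-05-01T10:00:00Z", "2019-05-02T09:30:00Z"),
      stringsAsFactors = FALSE
    ),
    page = data.frame(
      country = c("US", "DE"),
      server  = c("nginx", "Apache"),
      ip      = c("192.0.2.1", "192.0.2.2"),
      stringsAsFactors = FALSE
    ),
    `_id` = c("id-a", "id-b")
  )
)

# Bind task and page columns side by side, as bind_cols() does above
xdf <- cbind(x$results$task, x$results$page)

# Parse the timestamp (assumed format) and attach the scan id
xdf$time <- as.POSIXct(xdf$time, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
xdf$id <- x$results$`_id`

# Newest scans first, keeping the same columns as the select() call
xdf <- xdf[order(xdf$time, decreasing = TRUE),
           c("url", "country", "server", "ip", "id")]
print(xdf)
```

With a real search result, `xdf$id` then holds the scan identifiers that can be passed to `urlscan_result()`.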

urlscan Metrics

| Lang | # Files |  (%) | LoC |  (%) | Blank lines |  (%) | # Lines |  (%) |
| :--- | ------: | ---: | --: | ---: | ----------: | ---: | ------: | ---: |
| R    |      10 | 0.91 | 157 | 0.89 |          51 | 0.69 |     130 | 0.76 |
| Rmd  |       1 | 0.09 |  20 | 0.11 |          23 | 0.31 |      40 | 0.24 |

Code of Conduct

Please note that the ‘urlscan’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.



hrbrmstr/urlscan documentation built on May 15, 2019, 3:30 p.m.