hrbrmstr/htmlunit: Tools to Scrape Dynamic Web Content via the 'HtmlUnit' Java Library

'HtmlUnit' is a "'GUI'-Less browser for 'Java' programs". It models 'HTML' documents and provides an 'API' that allows one to invoke pages, fill out forms, click links and more, just as one does in a "normal" browser. The library has fairly good and constantly improving 'JavaScript' support and is able to work even with quite complex 'AJAX' libraries, simulating 'Chrome', 'Firefox' or 'Internet Explorer' depending on the configuration used. It is typically used for testing purposes or to retrieve information from web sites. Tools are provided to work with this library at a higher level than provided by the exposed 'Java' libraries in the 'htmlunitjars' package.

Package details

Maintainer: Bob Rudis
License: Apache License 2.0 | file LICENSE
Package repository: GitHub (hrbrmstr/htmlunit)
Installation: Install the latest version of this package by entering the following in R:
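
A minimal sketch of the installation step, assuming the package is installed from the hrbrmstr/htmlunit GitHub repository using the 'remotes' package. The choice of 'remotes' is an assumption and is not confirmed by this page:

```r
# Assumption: 'remotes' is the installer used; this page does not name one.
install.packages("remotes")

# Install the package from the GitHub repository hrbrmstr/htmlunit.
remotes::install_github("hrbrmstr/htmlunit")
```

Because the description says these tools sit on top of the 'Java' libraries in the 'htmlunitjars' package, a working 'Java' installation is likely needed as well.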
hrbrmstr/htmlunit documentation built on March 3, 2019, 11:39 p.m.