till-tietz/parsel: Parallel Dynamic Web-Scraping Using 'RSelenium'

A system to increase the efficiency of dynamic web-scraping with 'RSelenium' by leveraging parallel processing. You provide a function wrapper for your 'RSelenium' scraping routine with a set of inputs, and 'parsel' runs it in several browser instances. Chunked input processing as well as error catching and logging ensures seamless execution and minimal data loss, even when unforeseen 'RSelenium' errors occur. You can additionally build safe scraping functions with minimal coding by utilizing constructor functions that act as wrappers around 'RSelenium' methods.
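The pattern described above (split the inputs into chunks, run each chunk in a separate worker, and catch errors so that one failure does not discard the rest) can be sketched in base R. This is only an illustration under stated assumptions, not 'parsel''s implementation or API: `scrape_one()` and `safe_scrape()` are hypothetical stand-ins for an 'RSelenium' routine, and the workers come from the base `parallel` package rather than browser instances.

```r
# Illustrative sketch only: base R, not parsel's internals or API.
library(parallel)

# Hypothetical stand-in for an RSelenium scraping routine.
# It fails on one input to simulate an unforeseen browser error.
scrape_one <- function(x) {
  if (x == 3) stop("simulated browser error")
  x^2
}

# Wrap the routine so a failure is returned as a value instead of
# aborting the whole run; the error message is kept for logging.
safe_scrape <- function(x) {
  tryCatch(
    list(input = x, result = scrape_one(x), error = NA_character_),
    error = function(e) {
      list(input = x, result = NA_real_, error = conditionMessage(e))
    }
  )
}

# Split the inputs into chunks of a fixed size.
inputs <- 1:6
chunk_size <- 2
chunks <- split(inputs, ceiling(seq_along(inputs) / chunk_size))

# Process the chunks on two worker processes.
cl <- makeCluster(2)
clusterExport(cl, c("scrape_one", "safe_scrape"), envir = environment())
out <- unlist(
  parLapply(cl, chunks, function(chunk) lapply(chunk, safe_scrape)),
  recursive = FALSE
)
stopCluster(cl)

# Collect results and logged errors, one entry per input.
results <- vapply(out, function(r) r$result, numeric(1))
errors <- vapply(out, function(r) r$error, character(1))
```

In this sketch the failing input yields `NA` in `results` and its message in `errors`, while every other input still returns a value, which is the kind of minimal data loss the description refers to.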


Package details

Maintainer
License: MIT + file LICENSE
Version: 0.3.0
URL: https://github.com/till-tietz/parsel
Package repository: GitHub
Installation: Install the latest version of this package by entering the following in R:
install.packages("remotes")
remotes::install_github("till-tietz/parsel")
till-tietz/parsel documentation built on Jan. 4, 2024, 8:55 p.m.