scrapeR: Web Page Content Scraper

View source: R/scrapeR.R

scrapeR R Documentation

Web Page Content Scraper

Description

The scrapeR function fetches and extracts text content from the specified web page. It handles HTTP errors and parses HTML efficiently.

Usage

scrapeR(url)

Arguments

url

A character string specifying the URL of the web page to be scraped.

Details

The function uses tryCatch to handle potential web scraping errors. It fetches the webpage content, checks for HTTP errors, and then parses the HTML content to extract text. The text from different HTML nodes like headings and paragraphs is combined into a single string.
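The approach described above can be sketched as follows. This is a minimal illustration, not the package's actual source; the set of node selectors (headings and paragraphs) is an assumption based on the description:

```r
# Sketch of the described approach (assumed selectors, not the exact source)
scrapeR_sketch <- function(url) {
  tryCatch({
    response <- httr::GET(url)
    # Return NA on HTTP errors (e.g., 404, 500)
    if (httr::http_error(response)) return(NA)
    # Parse the HTML and pull text from headings and paragraphs
    page <- rvest::read_html(response)
    nodes <- rvest::html_nodes(page, "h1, h2, h3, p")
    # Combine all node text into a single string
    paste(rvest::html_text(nodes), collapse = " ")
  }, error = function(e) NA)
}
```

The tryCatch wrapper ensures that network failures and parsing errors yield NA rather than stopping execution, which matches the behavior described in the Value section.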

Value

A character string containing the combined text from the specified HTML nodes of the web page. Returns NA if an error occurs or if the page content is not accessible.

Note

This function requires the httr and rvest packages. Ensure that these dependencies are installed and loaded in your R environment.
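Both dependencies are available on CRAN and can be installed and loaded as usual:

```r
# Install the required dependencies (one-time setup)
install.packages(c("httr", "rvest"))

# Load them before calling scrapeR
library(httr)
library(rvest)
```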

Author(s)

Mathieu Dubeau, Ph.D.

References

Refer to the rvest package documentation for underlying HTML parsing and extraction methods.

See Also

GET, read_html, html_nodes, html_text

Examples


# Scrape the text content of a page and store it as a character string
url <- "http://www.example.com"
scraped_text <- scrapeR(url)


scrapeR documentation built on Nov. 23, 2023, 5:06 p.m.
