request_foodb_compound_info_crawler: Scrape Compound Information from FooDB

View source: R/14_FOODB.R

request_foodb_compound_info_crawler    R Documentation

Scrape Compound Information from FooDB

Description

This function scrapes compound information from the FooDB website. It visits multiple pages of the compounds section, extracts the data from the HTML tables on each page, and combines the results into a single data frame.

Usage

request_foodb_compound_info_crawler(
  url = "https://foodb.ca/compounds",
  sleep = 1,
  pages = c(1:2838)
)

Arguments

url

A character string specifying the base URL of the FooDB compounds page. Default is '"https://foodb.ca/compounds"'.

sleep

A numeric value giving the number of seconds to pause between requests, so that the server is not overwhelmed. Default is '1'.

pages

A numeric vector of page numbers to scrape. Default is '1:2838', which covers all pages of the FooDB compounds section.

Details

The function iterates over 'pages' with 'purrr::map' and scrapes each page in turn. The table on each page is extracted with 'rvest::html_table', and the per-page tables are combined into a single data frame.
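
The iteration described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual source: the helper name 'scrape_pages_sketch', the 'read_page' argument, and the '?page=' query parameter used to address individual pages are all assumptions made for the example. It also assumes that the first table on each page is the compound table.

```r
library(purrr)
library(rvest)

# Hypothetical helper illustrating the map-and-combine pattern.
# `read_page` is an assumed hook so the page loader can be swapped out,
# for example to run the sketch on local HTML without network access.
scrape_pages_sketch <- function(url = "https://foodb.ca/compounds",
                                sleep = 1,
                                pages = 1:2,
                                read_page = rvest::read_html) {
  tables <- purrr::map(pages, function(page) {
    # Pause before each request to limit load on the server.
    Sys.sleep(sleep)
    # Assumption: pages are addressed with a `?page=` query parameter.
    page_url <- paste0(url, "?page=", page)
    html <- read_page(page_url)
    # html_table() on a whole document returns a list of tables;
    # this sketch keeps only the first one.
    rvest::html_table(html)[[1]]
  })
  # Stack the per-page tables into one data frame.
  do.call(rbind, tables)
}
```

Passing a custom 'read_page' that returns locally parsed HTML makes it possible to try the pattern without contacting FooDB. Row-binding with 'do.call(rbind, ...)' requires every page to yield the same columns with compatible types.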

Value

A data frame containing the combined table data from all the specified pages. Each row corresponds to one compound entry from the scraped pages.

Examples

## Not run: 
# Scrape the first 3 pages with a 2-second delay between requests:
data <- request_foodb_compound_info_crawler(
  pages = 1:3,
  sleep = 2
)
head(data)

## End(Not run)


tidymass/massdatabase documentation built on Oct. 18, 2024, 3:56 p.m.