subsetHTML: Function subsetHTML


Description

Extracts a coherent subset of code from HTML-code.

Usage

subsetHTML(html, tag = "div", pattern = NULL, edit = F, save = F,
  plot = F, filename = NULL, trim = T)

Arguments

html

A character element containing HTML-code.

tag

Character element specifying the HTML tag that delimits the subsets of interest. Defaults to "div".

pattern

Regular expression that further narrows down the subsets of interest. If NULL (default), the value of tag is used as the pattern.

edit

Logical value specifying whether the data.frame should be plotted/edited.

save

Logical value specifying whether the HTML-code should be saved to a csv-file.

plot

Logical value specifying whether to plot the frequency of each HTML-tag found in the html-object.

filename

Character value specifying the filename (only used if save is TRUE). If NULL (default), as.numeric(Sys.time()) is used.

trim

Logical value specifying whether to trim text. Defaults to T.
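
The default for filename described above can be illustrated with a minimal base R sketch. It only shows what as.numeric(Sys.time()) evaluates to (seconds since 1970-01-01 UTC); the ".csv" suffix and the variable name out_file are assumptions made for illustration and are not taken from the package source.

```r
# Mirror the documented default: no filename supplied
filename <- NULL

# Fall back to the current time as a number (seconds since the Unix epoch)
if (is.null(filename)) filename <- as.numeric(Sys.time())

# Assumption for illustration only: append a ".csv" suffix
out_file <- paste0(filename, ".csv")
print(out_file)
```

Because the value changes on every call, repeated runs with save = TRUE and no filename should not overwrite each other unless they happen within the same timestamp.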

Details

Extracts a coherent subset of code from HTML-code (as returned by JDDM::getHTML, for example).
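
To make the idea of a "coherent subset" more concrete, here is a minimal base R sketch of one way such an extraction could work: find the first opening tag that matches pattern, then walk forward until the nesting depth returns to zero. The helper extract_block is hypothetical and is not the implementation used by subsetHTML; it ignores edge cases such as comments, self-closing tags and tag names in upper case.

```r
# Hypothetical helper (not part of JDDM): return the first balanced
# <tag>...</tag> block whose opening tag matches `pattern`.
extract_block <- function(html, tag = "div", pattern = tag) {
  close_tag <- paste0("</", tag, ">")

  # Locate all opening and closing tags of the requested type
  m_open    <- gregexpr(paste0("<", tag, "\\b[^>]*>"), html, perl = TRUE)
  open_pos  <- as.integer(m_open[[1]])
  open_txt  <- regmatches(html, m_open)[[1]]
  close_pos <- as.integer(gregexpr(close_tag, html, fixed = TRUE)[[1]])

  # First opening tag whose text matches the pattern
  hit <- which(grepl(pattern, open_txt))[1]
  if (is.na(hit)) return(NA_character_)
  start <- open_pos[hit]

  # Track nesting depth from `start`: +1 per opening tag, -1 per closing tag
  keep_open  <- open_pos >= start
  keep_close <- close_pos > start
  pos   <- c(open_pos[keep_open], close_pos[keep_close])
  step  <- c(rep(1, sum(keep_open)), rep(-1, sum(keep_close)))
  ord   <- order(pos)
  depth <- cumsum(step[ord])

  # The block ends at the closing tag that brings the depth back to zero
  end_idx <- which(depth == 0)[1]
  if (is.na(end_idx)) return(NA_character_)
  end <- pos[ord][end_idx] + nchar(close_tag) - 1
  substr(html, start, end)
}

html <- paste0('<body><div class="a"><div>x</div></div>',
               '<div class="b"><div>inner</div>tail</div></body>')
result <- extract_block(html, tag = "div", pattern = 'class="b"')
print(result)
```

Counting nesting depth rather than stopping at the first closing tag is what keeps the returned fragment well-formed when the matched element contains further elements of the same type.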

Examples

subsetHTML(getHTML("https://jobs.meinestadt.de/nuernberg/suche?words=Wissenschaftlicher%20Mitarbeiter"),
  tag = "div", pattern = "class=\"m-resultListEntries__content\"")

AndreasFischer1985/JDDM documentation built on June 19, 2021, 11:02 a.m.