WebSource: Read Web Content and respective Link Content from feedurls.


View source: R/source.R

Description

WebSource is derived from Source. In addition to calling the base Source constructor, it retrieves the specified feedurls and pre-parses the content with the parser function. The fields $Content, $Feedurls, $Parser and $CurlOpts are then added to the Source object.
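
The flow described above can be illustrated with a small base-R sketch. It is not the package's implementation: sketch_web_source and split_items are hypothetical names, the feed text is passed in directly instead of being downloaded, and the class vector is an assumption made for illustration only.

```r
# Illustrative sketch only, NOT the tm.plugin.webmining source code.
# `split_items` and `sketch_web_source` are hypothetical helpers.

# Hypothetical parser: split raw feed text into a list of <item> chunks.
split_items <- function(raw) {
  # (?s) lets "." match newlines so multi-line items are captured.
  hits <- gregexpr("(?s)<item>.*?</item>", raw, perl = TRUE)
  as.list(regmatches(raw, hits)[[1]])
}

# Hypothetical constructor: takes already-retrieved feed text (no network
# access), pre-parses it, and attaches the fields named in the description.
sketch_web_source <- function(feedurls, raw_content, parser,
                              class = "WebXMLSource", curlOpts = list()) {
  # Parse each feed, then flatten the per-feed lists into one list of chunks.
  content <- unlist(lapply(raw_content, parser), recursive = FALSE)
  if (is.null(content)) content <- list()
  src <- list(Content = content, Feedurls = feedurls,
              Parser = parser, CurlOpts = curlOpts)
  # Assumed class vector; the real object may carry different classes.
  class(src) <- c(class, "WebSource", "Source")
  src
}

feed <- "<rss><item>a</item>\n<item>b</item></rss>"
src <- sketch_web_source("http://example.com/feed", feed, split_items)
length(src$Content)
```

Keeping the parser as a plain function argument means the same constructor can serve different feed formats by swapping in a different splitting function.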

Usage

WebSource(feedurls, class = "WebXMLSource", reader, parser,
  encoding = "UTF-8", curlOpts = curlOptions(followlocation = TRUE,
  maxconnects = 5, maxredirs = 20, timeout = 30, connecttimeout = 30,
  ssl.verifyhost = FALSE, ssl.verifypeer = FALSE), postFUN = NULL,
  retrieveFeedURL = TRUE, ...)

Arguments

feedurls

URLs of the feeds to be retrieved

class

class label to be assigned to Source object, defaults to "WebXMLSource"

reader

function to be used to read content, see also readWeb

parser

function to be used to split feed content into chunks, returns list of content elements

encoding

specifies default encoding, defaults to 'UTF-8'

curlOpts

a named list or CURLOptions object identifying the curl options for the handle. Call listCurlOptions() to list all available curl options.

postFUN

function saved in WebSource object and called to retrieve full text content from feed urls

retrieveFeedURL

logical; specifies whether the feedurls should be downloaded first.

...

additional parameters passed to WebSource object/structure
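
Since curlOpts also accepts a named list, the defaults shown under Usage can be overridden without constructing a CURLOptions object. The following is a minimal sketch, assuming the option names from the Usage signature; default_opts and strict_opts are hypothetical variable names, and the value 2L for ssl.verifyhost follows libcurl's convention for full host verification.

```r
# Sketch under stated assumptions: a plain named list mirroring the
# defaults shown in the Usage section (names copied from that signature).
default_opts <- list(followlocation = TRUE, maxconnects = 5L,
                     maxredirs = 20L, timeout = 30L, connecttimeout = 30L,
                     ssl.verifyhost = FALSE, ssl.verifypeer = FALSE)

# Override only the SSL settings so certificates are verified;
# all other entries keep their default values.
strict_opts <- utils::modifyList(default_opts,
                                 list(ssl.verifypeer = TRUE,
                                      ssl.verifyhost = 2L))

str(strict_opts)
```

A list built this way would be passed as curlOpts = strict_opts in the WebSource call.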

Value

WebSource

Author(s)

Mario Annau


mannau/tm.plugin.webmining documentation built on May 21, 2019, 11:24 a.m.