tm.plugin.webmining facilitates the retrieval of textual data through various web feed formats such as XML and JSON. Direct retrieval from HTML is also supported. Because most (news) feeds contain only a small fraction of the original text, tm.plugin.webmining goes a step further and also retrieves and extracts the text of the original source. Generally, the retrieval procedure can be described as a two-step process:
In the first step, all relevant meta feeds are retrieved, and all relevant meta data items are extracted from them.
In the second step, the relevant source content is retrieved. Using the boilerpipeR package, even the main content of HTML pages can be extracted.
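The two steps above can be sketched in base R as follows. This is only an illustrative toy under stated assumptions, not the package's implementation: the feed string, the `pages` lookup vector, and the helper `extract_tag` are all hypothetical stand-ins. A real run would fetch the feed and pages over HTTP, and could use boilerpipeR instead of the crude tag stripping shown here.

```r
# Toy stand-in for a feed document; a real feed would be fetched over HTTP.
feed <- paste0(
  "<rss><channel>",
  "<item><title>First story</title><link>http://example.org/a</link></item>",
  "<item><title>Second story</title><link>http://example.org/b</link></item>",
  "</channel></rss>"
)

# Toy stand-in for the web: maps each link to the HTML of the full article.
pages <- c(
  "http://example.org/a" = "<html><body><p>Full text of the first story.</p></body></html>",
  "http://example.org/b" = "<html><body><p>Full text of the second story.</p></body></html>"
)

# Hypothetical helper: return the inner text of every <tag>...</tag> element.
extract_tag <- function(x, tag) {
  pattern <- sprintf("<%s>(.*?)</%s>", tag, tag)
  hits <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
  sub(pattern, "\\1", hits, perl = TRUE)
}

# Step 1: extract the meta data items (title and link) from the feed.
items <- extract_tag(feed, "item")
meta <- data.frame(
  title = vapply(items, extract_tag, character(1), tag = "title", USE.NAMES = FALSE),
  link  = vapply(items, extract_tag, character(1), tag = "link", USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)

# Step 2: retrieve the source content behind each link and strip the markup.
html <- unname(pages[meta$link])
meta$content <- trimws(gsub("<[^>]+>", "", html))

print(meta)
```

The feed alone yields only titles and links; the full text becomes available only after the second lookup, which is why the procedure has two steps.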
Mario Annau mario.annau@gmail
WebCorpus
GoogleFinanceSource
GoogleNewsSource
NYTimesSource
ReutersNewsSource
YahooFinanceSource
YahooInplaySource
YahooNewsSource
## Not run:
library(tm.plugin.webmining)

# Each call builds a WebCorpus from one feed source
googlefinance <- WebCorpus(GoogleFinanceSource("NASDAQ:MSFT"))
googlenews <- WebCorpus(GoogleNewsSource("Microsoft"))
# nytimes_appid must be defined beforehand and hold your own New York Times API key
nytimes <- WebCorpus(NYTimesSource("Microsoft", appid = nytimes_appid))
reutersnews <- WebCorpus(ReutersNewsSource("businessNews"))
yahoofinance <- WebCorpus(YahooFinanceSource("MSFT"))
yahooinplay <- WebCorpus(YahooInplaySource())
yahoonews <- WebCorpus(YahooNewsSource("Microsoft"))
## End(Not run)