import_corpus: import_corpus

View source: R/corpus.R

Description

Import a corpus from a file.

Usage

import_corpus(paths, format, language, textcolumn = 1, encoding = NULL)

Arguments

paths

Path to one or more files, or to a directory (if format="txt") to import.

format

File format: can be "csv", "txt", "factiva", "europresse", "lexisnexis" or "alceste".

language

The language name or code (preferably an IETF language tag, see language), used in particular for stop words and stemming.

textcolumn

When format="csv", the column containing the text, given either as a column name (string) or as a column position (integer).

encoding

The character encoding of the file, or NULL to attempt automatic detection.

Value

A Corpus object.

Examples

file <- system.file("texts", "reut21578-factiva.xml", package="tm.plugin.factiva")
import_corpus(file, "factiva", language="en")
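
The following is a minimal sketch of the format="csv" case, showing textcolumn given as a column name and encoding set explicitly. The data frame, its column names (id, text) and the temporary file are hypothetical and exist only for illustration; the import call uses the arguments documented above and runs only if R.temis is installed. It has not been checked against the package internals.

```r
# Hypothetical data: two short documents with an id column and a text column.
docs <- data.frame(
  id = c("doc1", "doc2"),
  text = c("The first short document.", "A second, slightly longer document."),
  stringsAsFactors = FALSE
)

# Write the data to a temporary CSV file encoded as UTF-8.
csv_path <- tempfile(fileext = ".csv")
write.csv(docs, csv_path, row.names = FALSE, fileEncoding = "UTF-8")

# Import the file only when R.temis is available.
# textcolumn names the column holding the text; textcolumn = 2 would
# select the same column by position.
if (requireNamespace("R.temis", quietly = TRUE)) {
  corpus <- R.temis::import_corpus(csv_path, format = "csv", language = "en",
                                   textcolumn = "text", encoding = "UTF-8")
  print(corpus)
}
```

Passing encoding = NULL instead would leave the encoding to automatic detection, as described under Arguments.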

R.temis documentation built on May 13, 2021, 1:08 a.m.