data2corpus: Import to corpus

View source: R/data2corpus.R

data2corpus R Documentation

Import to corpus

Description

This function creates a corpus from the input data, i.e. a data frame with six columns (final, id, title, year, authors and abstract).

Usage

data2corpus(data_df, ids = NULL, suffix_name = NULL)

Arguments

data_df

A data frame which is expected to have (at least) the six columns final (integer 0/1 or factor), id, title, authors and abstract (character vectors) and year (integer).

ids

A character vector giving the name (if any) of the column of the data frame that holds the ids of the documents. Default is NULL. If not NULL, each document in the final corpus is named with the corresponding id.

suffix_name

A character vector giving the "name" of the data, to be used as a suffix in the document ids. Default is NULL.
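
As an illustration of the input described under data_df, a data frame with the six expected columns could be built as in the minimal sketch below. The object name toy_df and all of its values are made up for this example and are not part of the package.

```r
# Hypothetical toy input: names and values are illustrative only.
toy_df <- data.frame(
  final    = c(1L, 0L),
  id       = c("doc1", "doc2"),
  title    = c("A first title", "A second title"),
  year     = c(2019L, 2020L),
  authors  = c("Author A", "Author B"),
  abstract = c("Text of the first abstract.", "Text of the second abstract."),
  stringsAsFactors = FALSE
)

# Check that the six documented columns are all present.
expected_cols <- c("final", "id", "title", "year", "authors", "abstract")
missing_cols <- setdiff(expected_cols, names(toy_df))
stopifnot(length(missing_cols) == 0)
```

A check of this kind before calling the function may help catch a misspelled or missing column early.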

Value

A VCorpus object in which each document is created by merging the title and abstract columns of the given data frame.
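
The merging described here can be approximated with the tm package, as in the sketch below. It is not the package's actual implementation: the helper name sketch_corpus is hypothetical, and both the single-space separator between title and abstract and the underscore between id and suffix are assumptions made for illustration.

```r
library(tm)

# Hypothetical helper: NOT the package's data2corpus(), only an illustration.
sketch_corpus <- function(data_df, ids = NULL, suffix_name = NULL) {
  # Assumption: title and abstract are joined with a single space.
  texts <- paste(data_df[["title"]], data_df[["abstract"]])
  corpus <- VCorpus(VectorSource(texts))

  if (!is.null(ids)) {
    doc_names <- as.character(data_df[[ids]])
    # Assumption: the suffix is appended to each id with an underscore.
    if (!is.null(suffix_name)) {
      doc_names <- paste(doc_names, suffix_name, sep = "_")
    }
    names(corpus) <- doc_names
  }
  corpus
}

# Made-up toy data, for illustration only.
toy_df <- data.frame(
  id       = c("doc1", "doc2"),
  title    = c("A first title", "A second title"),
  abstract = c("Text of the first abstract.", "Text of the second abstract."),
  stringsAsFactors = FALSE
)

toy_corpus <- sketch_corpus(toy_df, ids = "id", suffix_name = "toy")
names(toy_corpus)
```

The actual function may combine the text or build the document names differently; the sketch only shows the general shape of the result, one document per row of the data frame.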

Examples

data2corpus(liu_4h28)
data2corpus(liu_4h28, 'id')
data2corpus(liu_4h28, 'id', 'Liu')

UBESP-DCTV/costumer documentation built on Feb. 1, 2023, 4:52 a.m.