readTextmeta: Read Corpora as CSV

View source: R/readTextmeta.R

Description

Reads CSV files and separates the text from the metadata. The result is a textmeta object.

Usage

readTextmeta(
  path,
  file,
  cols,
  dateFormat = "%Y-%m-%d",
  idCol = "id",
  dateCol = "date",
  titleCol = "title",
  textCol = "text",
  encoding = "UTF-8",
  xmlAction = TRUE,
  duplicateAction = TRUE
)

readTextmeta.df(
  df,
  cols = colnames(df),
  dateFormat = "%Y-%m-%d",
  idCol = "id",
  dateCol = "date",
  titleCol = "title",
  textCol = "text",
  xmlAction = TRUE,
  duplicateAction = TRUE
)

Arguments

path

character string with the path to the data files, or a data.frame to be used as the df argument of readTextmeta.df

file

character string with names of the CSV files

cols

character vector with columns which should be kept

dateFormat

character string with the date format in the files for as.Date

idCol

character string with column name of the IDs

dateCol

character string with column name of the Dates

titleCol

character string with column name of the Titles

textCol

character string with column name of the Texts

encoding

character string with encoding specification of the files

xmlAction

logical: whether removeXML should be applied to all columns of the CSV

duplicateAction

logical whether deleteAndRenameDuplicates should be applied to the created textmeta object

df

data.frame table which should be transformed to a textmeta object
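
The dateFormat argument is described as a format for as.Date, so a candidate format can be checked in base R before any files are read. The following is a minimal sketch; the sample date strings are made up for illustration and do not come from the package.

```r
# Hypothetical raw date strings as they might appear in a CSV date column
raw_iso <- c("2021-10-28", "2020-01-05")
raw_german <- c("28.10.2021", "05.01.2020")

# The default dateFormat "%Y-%m-%d" parses ISO-style dates
iso_dates <- as.Date(raw_iso, format = "%Y-%m-%d")

# Day.month.year strings need a matching format instead
german_dates <- as.Date(raw_german, format = "%d.%m.%Y")

# A format that does not match the strings yields NA rather than an error
mismatch <- as.Date(raw_german, format = "%Y-%m-%d")
```

If a format yields NA for every value, as in the last line, it most likely does not match the strings and dateFormat should be adjusted.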

Value

textmeta object
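
As a rough illustration of the data.frame variant, the sketch below builds a tiny table and passes it to readTextmeta.df, relying on the default column names shown under Usage. The table contents and the extra column resort are invented for this example, and the call is skipped if the tosca package is not installed.

```r
# Made-up corpus; the first four column names match the documented defaults
df <- data.frame(
  id = c("a1", "a2"),
  date = c("2021-10-01", "2021-10-02"),
  title = c("First title", "Second title"),
  text = c("Text of the first document.", "Text of the second document."),
  resort = c("politics", "sports"),  # hypothetical extra metadata column
  stringsAsFactors = FALSE
)

if (requireNamespace("tosca", quietly = TRUE)) {
  corpus <- tosca::readTextmeta.df(
    df,
    cols = colnames(df),
    dateFormat = "%Y-%m-%d"
  )
  # Inspect the result without assuming details of its internal layout
  print(class(corpus))
  str(corpus, max.level = 1)
}
```

The remaining arguments (idCol, dateCol, titleCol, textCol, xmlAction, duplicateAction) are left at their defaults here; they would need to be set explicitly if the table used other column names.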


tosca documentation built on Oct. 28, 2021, 5:07 p.m.