load_txt: Load text files.

Description

Loads text files and puts them into a list, one file per list element, with the file name as the corresponding list element name. This function only loads files with the extension .txt; files with other extensions in the directory are ignored. The output of this function can be passed to concord() and ngram_freq().

Usage

load_txt(pathway, encoding = "UTF-8", comment_char = NULL,
  recursive = FALSE)

Arguments

pathway

Pathway to a directory containing .txt files, or to a specific .txt file, for example, pathway = "/pathway/to/directory/" or pathway = "/pathway/to/file_name.txt", where "/pathway/to/directory/" and "/pathway/to/file_name.txt" are replaced by the actual pathway on the user's machine. If the .txt files to be loaded are in the working directory of the R session, the user can simply call load_txt(getwd()).

encoding

Specifies the type of encoding of the .txt files, with the options "UTF-8" (the default) and "latin1".

comment_char

Specifies the comment character, if any, that is at the beginning of lines that should be ignored when loading the files. If comment_char = NULL (the default), all lines are loaded. This argument can be used to eliminate headers from corpus files before using ngram_freq() or concord().
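The effect described for comment_char can be pictured with a small base R sketch. The helper drop_comment_lines() is hypothetical and written only for illustration; it is not taken from the package source, and load_txt() may implement the filtering differently.

```r
# Hypothetical helper, for illustration only (not part of the package):
# drop every line that begins with the comment character.
drop_comment_lines <- function(lines, comment_char = NULL) {
  # With comment_char = NULL, all lines are kept.
  if (is.null(comment_char)) {
    return(lines)
  }
  lines[!startsWith(lines, comment_char)]
}

# A made-up corpus file whose header lines start with "#".
lines <- c("# title: sample corpus file",
           "# source: made up for this example",
           "The first sentence.",
           "The second sentence.")

kept <- drop_comment_lines(lines, comment_char = "#")
all_lines <- drop_comment_lines(lines)
```

Here kept holds only the two sentence lines, while all_lines is identical to the input.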

recursive

Specifies whether the function should recursively search for .txt files, beginning at the directory given in pathway. This argument is ignored if pathway points to a specific .txt file.
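The difference between a recursive and a non-recursive search can be seen with base R's list.files(). This is only a sketch of the idea: whether load_txt() uses list.files() internally is an assumption, and the directory and file names below are invented for the example.

```r
# Build a throwaway directory tree: one .txt file at the top level
# and one in a subdirectory.
root <- tempfile("corpus_")
dir.create(file.path(root, "sub"), recursive = TRUE)
writeLines("top-level text", file.path(root, "top.txt"))
writeLines("nested text", file.path(root, "sub", "nested.txt"))

# Non-recursive search: only .txt files directly inside root.
flat <- list.files(root, pattern = "\\.txt$", recursive = FALSE)

# Recursive search: .txt files in root and in every subdirectory.
deep <- list.files(root, pattern = "\\.txt$", recursive = TRUE)
```

In this sketch flat contains only top.txt, while deep also includes the file inside sub.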

Value

Returns a list with each .txt file in a separate list element, with the .txt file name as the corresponding list element name.
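To make the shape of the return value concrete, the following base R sketch builds a comparable named list. The function sketch_load_txt() is a hypothetical stand-in, not the package's implementation. It assumes that each list element is a character vector with one string per line and that element names are file names without the directory part; the documentation above does not confirm either detail.

```r
# Hypothetical stand-in for illustration; not the package's own code.
sketch_load_txt <- function(pathway, encoding = "UTF-8", recursive = FALSE) {
  if (dir.exists(pathway)) {
    # Directory: keep only files whose names end in .txt.
    files <- list.files(pathway, pattern = "\\.txt$", full.names = TRUE,
                        recursive = recursive)
  } else {
    # Otherwise treat pathway as a single file.
    files <- pathway
  }
  out <- lapply(files, readLines, encoding = encoding, warn = FALSE)
  # Assumption: elements are named by file name only.
  names(out) <- basename(files)
  out
}

# Invented example data: one .txt file and one file that should be ignored.
corpus_dir <- tempfile("corpus_")
dir.create(corpus_dir)
writeLines(c("first line", "second line"), file.path(corpus_dir, "a.txt"))
writeLines("x,y", file.path(corpus_dir, "notes.csv"))

texts <- sketch_load_txt(corpus_dir)
```

Under these assumptions, texts is a list of length one whose single element is named a.txt; notes.csv is skipped because of its extension.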


ekbrown/corpling documentation built on May 16, 2019, 2:24 a.m.