View source: R/read.segments.R
read.segments | R Documentation |
Split texts by word count or specific characters. Input texts directly, or read them in from files.
read.segments(path = ".", segment = NULL, ext = ".txt", subdir = FALSE,
  segment.size = -1, bysentence = FALSE, end_in_quotes = TRUE,
  preclean = FALSE, text = NULL)
path |
Path to a folder containing files, or a vector of paths to files. If no folders or files are
recognized in |
segment |
Specifies how the text of each file should be segmented. If a character, split at that character; '\n' by default. If a number, texts will be broken into that many segments, each with a roughly equal number of words. |
ext |
The extension of the files you want to read in. '.txt' by default. |
subdir |
Logical; if |
segment.size |
Numeric; if specified, |
bysentence |
Logical; if |
end_in_quotes |
Logical; if |
preclean |
Logical; if |
text |
A character vector with text to be split, used in place of |
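The two word-count modes described for segment and segment.size can be sketched in base R. This is only an illustration of the idea, not the package's implementation: split_even and split_size are hypothetical helpers, and the way words are grouped here (simple contiguous chunks of whitespace-separated tokens) is an assumption.

```r
# Hypothetical helpers, not part of lingmatch: they only illustrate
# splitting by number of segments versus by words per segment.

# Collapse words into segments according to a group index.
collapse_groups <- function(words, group) {
  segments <- tapply(words, group, paste, collapse = " ")
  data.frame(
    segment = seq_along(segments),
    WC = as.integer(table(group)),
    text = unname(as.vector(segments)),
    stringsAsFactors = FALSE
  )
}

# Split into `n` segments with roughly equal word counts.
split_even <- function(text, n) {
  words <- strsplit(trimws(text), "\\s+")[[1]]
  collapse_groups(words, ceiling(seq_along(words) * n / length(words)))
}

# Split into segments of at most `size` words each.
split_size <- function(text, size) {
  words <- strsplit(trimws(text), "\\s+")[[1]]
  collapse_groups(words, ceiling(seq_along(words) / size))
}

even <- split_even("split this text into two segments", 2)
sized <- split_size("split this text into two segments", 4)
```

Here even holds two segments of three words each, while sized holds one segment of four words followed by one of two.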
A data.frame with columns for file names (input), segment number within file (segment), word count for each segment (WC), and the text of each segment (text).
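Because the result is an ordinary data.frame, it can be summarized with base R. The sketch below builds a small stand-in by hand with the four documented columns; the file names and texts are made up for illustration and do not come from the package.

```r
# Hand-built stand-in for a read.segments() result (values are invented).
segs <- data.frame(
  input = c("a.txt", "a.txt", "b.txt"),
  segment = c(1, 2, 1),
  WC = c(3, 4, 5),
  text = c(
    "one two three",
    "four five six seven",
    "eight nine ten eleven twelve"
  ),
  stringsAsFactors = FALSE
)

# Total word count per input file.
totals <- aggregate(WC ~ input, data = segs, FUN = sum)

# Identify segments with at least 4 words, without the text column.
long <- segs[segs$WC >= 4, c("input", "segment")]
```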
# split preloaded text
read.segments("split this text into two segments", 2)
## Not run:
# read in all files from the package directory
texts <- read.segments(path.package("lingmatch"), ext = "")
texts[, -4] # all columns except the segment text
# segment .txt files in dir in a few ways:
dir <- "path/to/files"
## into 1 line segments
texts_lines <- read.segments(dir)
## into 5 even segments each
texts_5segs <- read.segments(dir, 5)
## into 50 word segments
texts_50words <- read.segments(dir, segment.size = 50)
## into 1 sentence segments
texts_1sent <- read.segments(dir, segment.size = 1, bysentence = TRUE)
## End(Not run)