These methods implement word hyphenation, based on Liang's algorithm. For details, please refer to the documentation for the generic hyphen method in the sylly package.
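To make the idea behind pattern-based hyphenation more concrete, here is a minimal sketch in base R. It is not the sylly implementation: the names `parse_pattern`, `hyphenate_sketch` and `toy_patterns` are hypothetical, the pattern set is a handful of TeX-style patterns chosen for illustration, and the rule of keeping at least two letters on either side of a break is an assumption made for this example. Each pattern is a letter sequence with digits between letters; all patterns matching a substring of the dot-padded word are overlaid, the highest digit wins at each position, and odd values mark permitted breaks.

```r
# Toy pattern set for illustration only; real pattern files hold thousands of entries.
toy_patterns <- c("hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n")

# Split a pattern such as "hy3ph" into its letters ("hyph") and the
# digit values sitting between the letters (0 where no digit is given).
parse_pattern <- function(pat) {
  chars <- strsplit(pat, "")[[1]]
  letters_only <- character(0)
  points <- 0L
  for (ch in chars) {
    if (grepl("[0-9]", ch)) {
      points[length(points)] <- as.integer(ch)
    } else {
      letters_only <- c(letters_only, ch)
      points <- c(points, 0L)
    }
  }
  list(key = paste(letters_only, collapse = ""), points = points)
}

# Hypothetical helper: returns the word with "-" inserted at the suggested breaks.
hyphenate_sketch <- function(word, patterns, min_length = 4) {
  if (nchar(word) < min_length) return(word)  # too short to consider
  parsed <- lapply(patterns, parse_pattern)
  keys <- vapply(parsed, function(p) p$key, character(1))
  # Pad with dots so patterns can anchor to the word boundaries.
  padded <- strsplit(paste0(".", tolower(word), "."), "")[[1]]
  n <- length(padded)
  scores <- integer(n + 1)  # scores[j] is the gap before padded[j]
  for (start in seq_len(n)) {
    for (end in start:n) {
      hit <- match(paste(padded[start:end], collapse = ""), keys)
      if (!is.na(hit)) {
        idx <- start:(end + 1)
        scores[idx] <- pmax(scores[idx], parsed[[hit]]$points)
      }
    }
  }
  letters_in <- strsplit(word, "")[[1]]
  len <- length(letters_in)
  out <- letters_in[1]
  for (k in seq_len(len)[-1]) {
    # Assumption: keep at least two letters on each side of a break.
    allowed <- k >= 3 && k <= len - 1
    if (allowed && scores[k + 1] %% 2 == 1) out <- c(out, "-")
    out <- c(out, letters_in[k])
  }
  paste(out, collapse = "")
}

hyphenate_sketch("hyphenation", toy_patterns)  # returns "hy-phen-ation"
```

The number of syllables suggested by such a result is simply the number of inserted breaks plus one.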
## S4 method for signature 'kRp.text'
hyphen(
  words,
  hyph.pattern = NULL,
  min.length = 4,
  rm.hyph = TRUE,
  corp.rm.class = "nonpunct",
  corp.rm.tag = c(),
  quiet = FALSE,
  cache = TRUE,
  as = "kRp.hyphen",
  as.feature = FALSE
)
## S4 method for signature 'kRp.text'
hyphen_df(
  words,
  hyph.pattern = NULL,
  min.length = 4,
  rm.hyph = TRUE,
  quiet = FALSE,
  cache = TRUE
)
## S4 method for signature 'kRp.text'
hyphen_c(
  words,
  hyph.pattern = NULL,
  min.length = 4,
  rm.hyph = TRUE,
  quiet = FALSE,
  cache = TRUE
)
words |
Either an object of class |
hyph.pattern |
Either an object of class |
min.length |
Integer, the minimum number of letters a word must have to be considered for hyphenation. |
rm.hyph |
Logical, whether hyphens already present in words should be removed before pattern matching. |
corp.rm.class |
A character vector with word classes which should be ignored. The default value
|
corp.rm.tag |
A character vector with POS tags which should be ignored. Relevant only if |
quiet |
Logical. If |
cache |
Logical. |
as |
A character string defining the class of the object to be returned. Defaults to |
as.feature |
Logical, whether the output should be just the analysis results or the input object with the results added as a feature. Use |
An object of class kRp.text, kRp.hyphen, data.frame or a numeric vector, depending on the values of the as and as.feature arguments.
Liang, F.M. (1983). Word Hy-phen-a-tion by Com-put-er. Dissertation, Stanford University, Dept. of Computer Science.
[1] http://tug.ctan.org/tex-archive/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
[2] http://www.ctan.org/tex-archive/macros/latex/base/lppl.txt
read.hyph.pat, manage.hyph.pat
# code is only run when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  # call hyphen on a given english word
  # "quiet=TRUE" suppresses the progress bar
  hyphen(
    "interference",
    hyph.pattern="en",
    quiet=TRUE
  )
  # call hyphen() on a tokenized text
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  # language definition is defined in the object
  # if you call hyphen() without arguments,
  # you will get its results directly
  hyphen(tokenized.obj)
  # alternatively, you can also store those results as a
  # feature in the object itself
  tokenized.obj <- hyphen(
    tokenized.obj,
    as.feature=TRUE
  )
  # results are now part of the object
  hasFeature(tokenized.obj)
  corpusHyphen(tokenized.obj)
} else {}