change.encoding: Change character encoding
In stylo: Stylometric Multivariate Analyses

change.encoding

R Documentation

Change character encoding

Description

This function is a wrapper around iconv() that converts the character encoding of multiple text files in a corpus folder, preferably into UTF-8.

Usage

change.encoding(corpus.dir = "corpus/", from, to = "utf-8", 
                keep.original = TRUE, output.dir = NULL)

Arguments

`corpus.dir`	path to the folder containing the corpus.
`from`	original character encoding. See the Details section (below) for some hints on how to get the original encoding.
`to`	character encoding to convert into.
`keep.original`	logical; should the original files be kept?
`output.dir`	folder for the reencoded files.

Details

Stylo works on UTF-8-encoded texts by default. This function allows you to convert your corpus if it is not yet encoded in UTF-8. To check the current encoding of the text files in your corpus folder, you can use the function check.encoding().
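
To illustrate the kind of conversion being wrapped, here is a minimal sketch that re-encodes a single file using only base R's iconv(). It is not stylo's actual implementation: the folder and file names and the Latin-1 sample text are assumptions made up for the illustration.

```r
# Hypothetical single-file illustration; not the code used inside stylo.
demo.dir <- file.path(tempdir(), "encoding-demo")
dir.create(demo.dir, showWarnings = FALSE)
in.file  <- file.path(demo.dir, "sample_latin1.txt")
out.file <- file.path(demo.dir, "sample_utf8.txt")

# Write "café" as Latin-1 bytes (0xE9 is "é" in Latin-1).
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xE9, 0x0A)), in.file)

# Read the lines as they are, then convert them from Latin-1 to UTF-8.
original  <- readLines(in.file, warn = FALSE)
converted <- iconv(original, from = "latin1", to = "UTF-8")

# useBytes = TRUE writes the UTF-8 bytes without re-encoding them
# into the native encoding of the current session.
writeLines(converted, out.file, useBytes = TRUE)
```

After conversion, the single Latin-1 byte for "é" has become the two-byte UTF-8 sequence 0xC3 0xA9. Note that iconv() returns NA for input it cannot convert, so a wrong value of `from` usually shows up as missing lines rather than as an error.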

Value

The function saves reencoded text files.
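
One way to confirm that the saved files really are UTF-8 is to scan the output folder with base R's validUTF8(). The helper below is a hypothetical sketch, not part of stylo, and the demo folder and file names are invented for the illustration.

```r
# Hypothetical helper (not part of stylo): list the files in a folder
# that contain byte sequences which are not valid UTF-8.
non.utf8.files <- function(dir) {
  files <- list.files(dir, full.names = TRUE)
  is.valid <- vapply(files, function(f) {
    all(validUTF8(readLines(f, warn = FALSE)))
  }, logical(1))
  basename(files[!is.valid])
}

# Demo folder with one valid UTF-8 file and one Latin-1 file.
check.dir <- file.path(tempdir(), "utf8-check")
dir.create(check.dir, showWarnings = FALSE)
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xC3, 0xA9, 0x0A)),
         file.path(check.dir, "good.txt"))
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xE9, 0x0A)),
         file.path(check.dir, "bad.txt"))

bad <- non.utf8.files(check.dir)
```

An empty result means every file passed the check. Pure ASCII files also pass, since ASCII is a subset of UTF-8.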

Author(s)

Steffen Pielström

Examples

## Not run: 
# To replace the old versions with the newly encoded ones, but retain the
# originals in another folder:
change.encoding(corpus.dir = "~/corpora/example/", 
                from = "ASCII", to = "utf-8")

# To place the new versions in another folder called "utf8/":
change.encoding(corpus.dir = "~/corpora/example/",
                from = "ASCII", 
                to = "utf-8", 
                output.dir = "utf8/")

# To simply replace the old versions:
change.encoding(corpus.dir = "~/corpora/example/", 
                from = "ASCII", 
                to = "utf-8",
                keep.original = FALSE)

## End(Not run)

stylo documentation built on May 29, 2024, 1:37 a.m.