change.encoding: Change character encoding

View source: R/change.encoding.R

change.encoding    R Documentation

Change character encoding

Description

This function is a wrapper around iconv() that converts the character encoding of multiple text files in a corpus folder, preferably into UTF-8.

Usage

change.encoding(corpus.dir = "corpus/", from, to = "utf-8", 
                keep.original = TRUE, output.dir = NULL)

Arguments

corpus.dir

path to the folder containing the corpus.

from

original character encoding. See the Details section (below) for some hints on how to get the original encoding.

to

character encoding to convert into.

keep.original

logical; should the original files be kept?

output.dir

folder for the reencoded files.

Details

Stylo works on UTF-8-encoded texts by default. This function allows you to convert your corpus if it is not yet encoded in UTF-8. To check the current encoding of the text files in your corpus folder, you can use the function check.encoding().
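
To illustrate what wrapping iconv() around a folder of files involves, here is a minimal sketch in base R. It is not the stylo implementation: the helper name reencode_dir(), the ".txt" file filter, and the temporary folder names are assumptions made for this example only.

```r
# Hypothetical helper (not part of stylo): re-encode every .txt file
# found in corpus.dir and write the results to output.dir.
reencode_dir <- function(corpus.dir, from, to = "UTF-8", output.dir) {
  dir.create(output.dir, showWarnings = FALSE, recursive = TRUE)
  files <- list.files(corpus.dir, pattern = "\\.txt$", full.names = TRUE)
  for (f in files) {
    # Read the lines as they are; the source encoding is passed to iconv()
    lines <- readLines(f, warn = FALSE)
    converted <- iconv(lines, from = from, to = to)
    # iconv() returns NA for any string it cannot convert
    if (anyNA(converted)) {
      warning("some lines could not be converted in ", basename(f))
    }
    # useBytes = TRUE writes the converted bytes without re-encoding them
    writeLines(converted, file.path(output.dir, basename(f)), useBytes = TRUE)
  }
  invisible(file.path(output.dir, basename(files)))
}

# Demo on a throwaway corpus: one file holding "café" encoded as Latin-1
in_dir  <- file.path(tempdir(), "corpus_latin1")
out_dir <- file.path(tempdir(), "corpus_utf8")
dir.create(in_dir, showWarnings = FALSE)
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a)),
         file.path(in_dir, "sample.txt"))

reencode_dir(in_dir, from = "latin1", to = "UTF-8", output.dir = out_dir)
```

Writing to a separate output folder leaves the original files untouched, which is the safer choice when the source encoding is only a guess.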

Value

The function saves reencoded text files.

Author(s)

Steffen Pielström

See Also

check.encoding

Examples

## Not run: 
# To replace the old versions with the newly encoded ones, but retain
# the originals in another folder:
change.encoding(corpus.dir = "~/corpora/example/",
                from = "ASCII", to = "utf-8")

# To place the new versions in another folder called "utf8/":
change.encoding(corpus.dir = "~/corpora/example/",
                from = "ASCII",
                to = "utf-8",
                output.dir = "utf8/")

# To simply replace the old versions:
change.encoding(corpus.dir = "~/corpora/example/",
                from = "ASCII",
                to = "utf-8",
                keep.original = FALSE)

## End(Not run)

computationalstylistics/stylo documentation built on April 7, 2024, 4:12 p.m.