scancn: Read a Text File by Auto-Detecting Encoding
In chinese.misc: Miscellaneous Tools for Chinese Text Mining and More


The function reads a text file and tries to detect its encoding. If you have Chinese files from different sources that do not share a single encoding, let this function detect the encoding and read them. This can save considerable time otherwise spent dealing with unrecognizable characters.

scancn(x, enc = "auto", collapse = "   ")

`x`	a length-1 character vector giving the file name.
`enc`	a length-1 character vector giving the file encoding, as specified by the user. The default is "auto", which lets the function detect the encoding.
`collapse`	passed to the `collapse` argument of `paste` to join the pieces of text together. The default is "   " (three spaces).

The function calls scan(x, what = "character", ...) and auto-detects the file encoding. Sometimes a Chinese file is encoded in "UTF-8", but its content is read in as "?". When this happens, the function reads the file a second time and uses stringi::stri_encode to convert it. The file is also read a second time if invalid input is found in the content.
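The general idea of checking what was read and falling back to another encoding can be sketched in base R. This is only an illustration under stated assumptions, not the actual implementation of scancn: the helper name `read_text_guess` and the "GB18030" fallback are hypothetical, the file is read with readBin rather than scan, and base validUTF8 and iconv stand in for the stringi-based conversion described above.

```r
# Hypothetical helper (not part of chinese.misc): read a file as raw bytes,
# keep the text if it is valid UTF-8, otherwise convert it from a fallback
# encoding. The fallback "GB18030" is an assumption made for this sketch.
read_text_guess <- function(path, fallback = "GB18030") {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  txt <- rawToChar(bytes)
  if (validUTF8(txt)) {
    Encoding(txt) <- "UTF-8"  # mark the string so R treats it as UTF-8
    return(txt)
  }
  # Second attempt: reinterpret the same bytes using the fallback encoding
  iconv(txt, from = fallback, to = "UTF-8")
}

# Two temporary files holding the same two Chinese characters (U+4E2D, U+6587),
# written as raw bytes so that the example itself stays ASCII-only
utf8_file <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0xe4, 0xb8, 0xad, 0xe6, 0x96, 0x87)), utf8_file)
gbk_file <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0xd6, 0xd0, 0xce, 0xc4)), gbk_file)

from_utf8 <- read_text_guess(utf8_file)
from_gbk <- read_text_guess(gbk_file)
```

Both calls should yield the same two-character string, even though the files hold different bytes on disk.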

The function always returns a length-1 character vector. If scan returns a vector of length greater than 1, its elements are pasted together, separated by three spaces or by another separator given in `collapse`.

It returns " " (one space) when all elements of the vector are NA. If only some elements are NA, those elements are changed to "" (an empty string) before being pasted together.
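The return rules described above can be mimicked with a few lines of base R. This is a minimal sketch of the documented behavior only; `collapse_pieces` is a hypothetical name and is not a function exported by chinese.misc.

```r
# Hypothetical helper illustrating the documented return rules:
# - all elements NA        -> a single space
# - some elements NA       -> NA replaced by "" before pasting
# - otherwise              -> elements pasted with `collapse`
collapse_pieces <- function(pieces, collapse = "   ") {
  if (all(is.na(pieces))) {
    return(" ")
  }
  pieces[is.na(pieces)] <- ""
  paste(pieces, collapse = collapse)
}

joined_default <- collapse_pieces(c("text", "mining"))
joined_custom <- collapse_pieces(c("text", "mining"), collapse = "-")
joined_with_na <- collapse_pieces(c("a", NA_character_, "b"))
all_missing <- collapse_pieces(c(NA_character_, NA_character_))
```

Note that an NA in the middle still contributes a separator on each side, because it becomes an empty string rather than being dropped.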

A length-1 character vector containing the text.


# No Chinese is allowed, so try an English file
x <- file.path(find.package("base"), "CITATION")
scancn(x)

chinese.misc documentation built on Sept. 13, 2020, 5:13 p.m.
