These functions attempt to determine if a file is binary or text. In addition, file.type attempts to determine the newline character(s) used in the file.
file | The path to the file to be examined.
bin.ints | List of integers with the ASCII values of control characters that are to be considered when looking for signs that a file is binary. The default includes most ASCII control characters, except those such as NUL, LF, CR and HT that might actually appear in an ASCII file.
nbytes | Number of bytes to read in from the beginning of the file.
nbin | An integer indicating the threshold on the number of control characters above which a file is considered binary. Defaults to 2.
A file is assessed to be binary using a heuristic based on finding more than nbin ASCII control (i.e., non-printing) characters in the first nbytes bytes of the file. This works well for standard ASCII text, but it may be less effective for complex UTF-8 text (e.g., Chinese).
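The heuristic described above can be sketched in base R as follows. This is only an illustration, not the package's implementation: the helper name looks_binary, the nbytes default of 1000, and the particular set of control codes in bin.ints are assumptions made for the example (only the nbin default of 2 comes from this page).

```r
# Hypothetical helper illustrating the heuristic; NOT the package's code.
# Assumed defaults: nbytes = 1000 and bin.ints = all ASCII control codes
# below 32 except NUL (0), HT (9), LF (10) and CR (13).
looks_binary <- function(file, nbytes = 1000L, nbin = 2L,
                         bin.ints = c(1:8, 11:12, 14:31)) {
  # Read at most nbytes raw bytes from the start of the file.
  bytes <- readBin(file, what = "raw", n = nbytes)
  # Count the bytes whose integer value is a flagged control character.
  n_ctrl <- sum(as.integer(bytes) %in% bin.ints)
  # The file is treated as binary only if the count exceeds the threshold.
  n_ctrl > nbin
}

# Small demonstration on two temporary files.
txt <- tempfile()
writeLines(c("hello", "world"), txt)       # plain text, no flagged bytes
bin <- tempfile()
writeBin(as.raw(c(1, 2, 3, 4, 5)), bin)    # five flagged control bytes
looks_binary(txt)
looks_binary(bin)
```

Because the test is "more than nbin", a text file containing one or two stray control characters would still be classified as text under these assumed settings.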
For text files, line endings are assessed by file.type by searching first for DOS line endings (\r\n) in the first nbytes bytes of the input file, and then by searching for UNIX line endings (\n). If neither is found, then NA_character_ is returned for the line ending.
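The search order for line endings can be sketched in base R along these lines. Again, this is a hedged illustration rather than the actual code of file.type: the helper name detect_newline and the nbytes default of 1000 are assumptions made for the example.

```r
# Hypothetical helper illustrating the search order; NOT the package's code.
# The nbytes default of 1000 is an assumption.
detect_newline <- function(file, nbytes = 1000L) {
  # Work on the raw bytes so that no newline translation takes place.
  ints <- as.integer(readBin(file, what = "raw", n = nbytes))
  n <- length(ints)
  # DOS first: a CR (13) immediately followed by an LF (10).
  if (n >= 2 && any(ints[-n] == 13L & ints[-1] == 10L)) return("\r\n")
  # Then UNIX: any LF at all.
  if (any(ints == 10L)) return("\n")
  # Neither ending was found in the bytes examined.
  NA_character_
}

# Small demonstration on three temporary files.
dos <- tempfile()
writeBin(charToRaw("a\r\nb\r\n"), dos)
unix <- tempfile()
writeBin(charToRaw("a\nb\n"), unix)
none <- tempfile()
writeBin(charToRaw("no newline"), none)
detect_newline(dos)
detect_newline(unix)
detect_newline(none)
```

Checking for \r\n before \n matters because every DOS line ending also contains a \n, so the reverse order would report DOS files as UNIX.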
For is.file.binary, a boolean value is returned, whereas a list is returned for file.type.
is.file.binary: A boolean that will be TRUE if a file is considered to be binary.
file.type: Returns a list with up to two elements: type and newline. type can be either "binary" or "text". newline will be NULL for binary files, "\r\n" for DOS formatted text files, "\n" for UNIX formatted text files, and NA_character_ for text files without any newline characters in the first nbytes bytes of the file.
David M. Kaplan dmkaplan2000@gmail.com
See also platform.newline.