is.file.binary: Functions to assess if files are binary, DOS text or UNIX...

View source: R/utils.R

Description

These functions attempt to determine if a file is binary or text. In addition, file.type attempts to determine the newline character(s) used in the file.

Usage

is.file.binary(file, bin.ints = c(1:8, 14:25), nbytes = 1000, nbin = 2)

file.type(file, bin.ints = c(1:8, 14:25), nbytes = 1000, nbin = 2)

Arguments

file

The path to the file to be examined

bin.ints

Vector of integers giving the ASCII codes of the control characters to look for as signs that a file is binary. The default includes most ASCII control characters, excluding those such as NUL, LF, CR and HT that might legitimately appear in an ASCII text file.

nbytes

Number of bytes to read in from the beginning of the file.

nbin

An integer indicating the threshold on the number of control characters above which a file is considered binary. Defaults to 2.
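
To make the default for bin.ints concrete, the short sketch below lists the ASCII control codes (0 to 31) that the default leaves out. It uses only base R; the variable names are chosen here for illustration and are not part of the package.

```r
# ASCII control characters occupy codes 0 through 31.
all.control <- 0:31

# Default value of bin.ints, copied from the Usage section.
default.bin.ints <- c(1:8, 14:25)

# Control codes that the default does NOT treat as evidence of a binary file.
excluded <- setdiff(all.control, default.bin.ints)
excluded
# 0 9 10 11 12 13 26 27 28 29 30 31
# i.e. NUL (0), HT (9), LF (10), VT (11), FF (12), CR (13) and codes 26 to 31
```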

Details

A file is assessed to be binary using a heuristic: it is considered binary if more than nbin ASCII control (i.e., non-printing) characters from bin.ints are found in the first nbytes of the file. This works well for standard ASCII text, but it may be less effective for complex UTF-8 text (e.g., Chinese).
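
The heuristic described above can be sketched in base R roughly as follows. This is a minimal illustration under stated assumptions, not the package's actual source: the helper name looks.binary is hypothetical, and it assumes that every occurrence of a listed control character counts toward the threshold.

```r
# Hypothetical helper illustrating the heuristic; NOT the knitrdata implementation.
looks.binary <- function(file, bin.ints = c(1:8, 14:25), nbytes = 1000, nbin = 2) {
  # Read at most nbytes raw bytes from the start of the file.
  bytes <- readBin(file, what = "raw", n = nbytes)
  # Count bytes whose integer value is one of the listed control codes
  # (assumption: repeated occurrences are each counted).
  n.control <- sum(as.integer(bytes) %in% bin.ints)
  # Binary if the count is strictly greater than the threshold.
  n.control > nbin
}

# A plain ASCII file and a file made only of control characters.
txt <- tempfile()
writeBin(charToRaw("hello\nworld\n"), txt)
bin <- tempfile()
writeBin(as.raw(c(1:8, 14:25)), bin)

looks.binary(txt)  # FALSE
looks.binary(bin)  # TRUE
```

Because only the first nbytes are inspected, a file whose control characters appear later than that would be classified as text by this sketch.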

For text files, line endings are assessed by file.type by searching first for DOS line endings (\r\n) in the first nbytes of the input file, and then by searching for UNIX line endings (\n). If neither is found, then NA_character_ is returned for the line ending.
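
The search order for line endings can be illustrated with the base R sketch below. The helper name guess.newline is hypothetical, and the code is only an approximation of the behaviour described above, not the package's implementation.

```r
# Hypothetical helper illustrating the search order; NOT the knitrdata implementation.
guess.newline <- function(file, nbytes = 1000) {
  bytes <- readBin(file, what = "raw", n = nbytes)
  cr <- as.raw(13L)  # carriage return, \r
  lf <- as.raw(10L)  # line feed, \n
  n <- length(bytes)
  # First look for a CR immediately followed by an LF (DOS line ending).
  has.crlf <- n >= 2 && any(bytes[-n] == cr & bytes[-1] == lf)
  if (has.crlf) return("\r\n")
  # Otherwise look for a bare LF (UNIX line ending).
  if (any(bytes == lf)) return("\n")
  # Neither found in the first nbytes.
  NA_character_
}

dos <- tempfile()
writeBin(charToRaw("a\r\nb\r\n"), dos)
unix <- tempfile()
writeBin(charToRaw("a\nb\n"), unix)
none <- tempfile()
writeBin(charToRaw("no newline here"), none)

guess.newline(dos)   # "\r\n"
guess.newline(unix)  # "\n"
guess.newline(none)  # NA
```

Checking for \r\n before \n matters because every DOS line ending also contains a \n, so the reverse order would report DOS files as UNIX.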

Value

For is.file.binary, a logical value (TRUE or FALSE) is returned, whereas a list is returned for file.type.

Functions

Author(s)

David M. Kaplan dmkaplan2000@gmail.com

See Also

See also platform.newline.


knitrdata documentation built on Dec. 8, 2020, 5:08 p.m.