Most modern file systems store file-path components (names of directories and files) in a character encoding of wide scope: usually UTF-8 on a Unix-alike and UCS-2/UTF-16 on Windows. However, this was not true when R was first developed and there are still exceptions amongst file systems, e.g. FAT32.
This was not something anticipated by the C and POSIX standards, which only provide means to access files via file paths encoded in the current locale, for example those specified in Latin-1 in a Latin-1 locale.
Everything here apart from the specific section on Windows is about Unix-alikes.
It is possible to mark character strings (elements of character
vectors) as being in UTF-8 or Latin-1 (see
This allows file paths not in the native encoding to be
expressed in R character vectors but there is almost no way to use
them unless they can be translated to the native encoding. That is of
course not a problem if that is UTF-8, so these details are really only
relevant to the use of a non-UTF-8 locale (including a C locale) on a
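As a minimal sketch of the marking described above (the example string is purely illustrative): Encoding() reports the declared encoding of a string, iconv() can convert between UTF-8 and Latin-1, and enc2native() attempts a translation to the native encoding. Whether that translation is lossless depends on the locale of the session, which l10n_info() reports.

```r
# A string written with a \u escape is marked as UTF-8.
x <- "fa\u00e7ile"
enc_x <- Encoding(x)                      # "UTF-8"

# Convert to Latin-1; iconv() marks the result when 'to' is "latin1".
y <- iconv(x, from = "UTF-8", to = "latin1")
enc_y <- Encoding(y)                      # "latin1"

# Both strings represent the same text once brought to a common encoding.
same_text <- identical(enc2utf8(y), x)

# Is the current session running in a UTF-8 locale?
is_utf8_locale <- isTRUE(l10n_info()[["UTF-8"]])

# Translation to the native encoding; in a non-UTF-8 locale,
# characters that cannot be represented may be replaced by escapes.
native <- enc2native(x)
```

In a UTF-8 locale, native is the same text as x; elsewhere it is worth inspecting before it is used as a file path.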
Functions to open a file such as
an error for non-native filepaths. Where functions look at existence
list.files, non-native filepaths are treated as
Many other functions use
gzfile to open their
file.path allows non-native file paths to be combined,
marking them as UTF-8 if needed.
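A short sketch of combining marked components with file.path (the directory and file names are hypothetical and need not exist on disk):

```r
# Hypothetical path components, both marked as UTF-8 via \u escapes.
dir_name  <- "donn\u00e9es"
file_name <- "r\u00e9sum\u00e9.txt"

# Combine them; no file-system access takes place here.
p <- file.path(dir_name, file_name)

# Inspect the declared encoding of the combined path.
enc_p <- Encoding(p)

# Compare against the expected text after converting both sides to UTF-8.
expected <- enc2utf8("donn\u00e9es/r\u00e9sum\u00e9.txt")
matches  <- identical(enc2utf8(p), expected)
```

Building the path this way only manipulates strings. Whether the resulting path can then be opened depends on the function used and on the locale, as described in this section.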
path.expand only handles paths in the native encoding.
Windows provides proprietary entry points to access its file systems, and these gained ‘wide’ versions in Windows NT that allowed file paths in UCS-2/UTF-16 to be accessed from any locale.
Some R functions use these entry points when file paths are marked
as Latin-1 or UTF-8 to allow access to paths not in the current
encoding. These include
For functions using
tar), it is often possible to use a
connection wrapping a
Other notable exceptions are
system and file-path inputs for
Before R 4.0.0, file paths marked as being in Latin-1 or UTF-8 were silently translated to the native encoding using escapes such as <e7> or <U+00e7>. The result was a valid file name, but possibly not the one intended.
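To illustrate what a byte escape of this kind looks like, the sketch below uses iconv() with sub = "byte" on an illustrative string. This is an assumption chosen for demonstration only; it is not necessarily the code path R itself used for file names.

```r
# Illustrative Latin-1 string containing c-cedilla (byte 0xE7).
y <- iconv("fa\u00e7ile", from = "UTF-8", to = "latin1")

# Converting to ASCII cannot represent that byte; with sub = "byte"
# each non-convertible byte is replaced by its hex code in angle brackets.
escaped <- iconv(y, from = "latin1", to = "ASCII", sub = "byte")
escaped   # "fa<e7>ile"
```

The escaped string contains only ASCII characters, so it is a valid file name, but it names a different file from the one the original string referred to.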
This document is still a work-in-progress.