Information about time zones in R.
Sys.timezone returns the name of the current time zone.
location: logical: should an attempt be made to find the location name as used in the Olson/IANA database? (See ‘Time zone names’ below.)
Time zones are a system-specific topic, but these days almost all R platforms use similar underlying code, used by Linux, macOS, Solaris, AIX, FreeBSD, Sun Java >= 1.4 and Tcl >= 8.5, and installed with R on Windows. Unfortunately there are many system-specific errors in the implementations. It is possible to use R's own version of the code on Unix-alikes as well as on Windows: this is the default for macOS and recommended for Solaris.
It should be possible to set the time zone via the environment
variable TZ: see the section on ‘Time zone names’ for suitable values.
Sys.timezone() will return the value of
TZ if set (and on some OSes it is always set), otherwise it will
try to retrieve a value which if set for TZ would give the
current time zone. This is not in general possible, and
Sys.timezone(FALSE) on Windows will retrieve the abbreviation
used for the current time.
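As a minimal sketch of the behaviour described above, the following sets TZ for the current session, formats a time in the session time zone, and then restores the previous setting. The zone "UTC" and all variable names are illustrative choices. What Sys.timezone() reports is OS-specific and may not follow a change made to TZ after R has started.

```r
# Remember the current setting so that it can be restored afterwards.
old_tz <- Sys.getenv("TZ", unset = NA)

# Set the time zone for this session via the environment variable TZ.
Sys.setenv(TZ = "UTC")

# A POSIXct value without a "tzone" attribute is formatted in the
# session time zone.
epoch <- structure(0, class = c("POSIXct", "POSIXt"))
epoch_in_session_tz <- format(epoch, "%Y-%m-%d %H:%M")

# OS-specific: may be the value of TZ, a location name, NA or "".
reported_tz <- Sys.timezone()

# Restore the previous state of TZ.
if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz)
```

Restoring TZ at the end keeps the example from affecting later date-time computations in the same session.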
If TZ is set but empty or invalid, most platforms default to UTC, the time zone colloquially known as GMT (see https://en.wikipedia.org/wiki/Coordinated_Universal_Time). (Some but not all platforms will give a warning for invalid time zones.)
Time zones did not come into use until the second half of the nineteenth century and were not widely adopted until the twentieth, and daylight saving time (DST, also known as summer time) was first introduced in the early twentieth century, most widely in 1916. Over the last 100 years places have changed their affiliation between major time zones, have opted out of (or in to) DST in various years or adopted DST rule changes late or not at all.
A quite common system implementation of
POSIXct is as signed
32-bit integers and so only goes back to the end of 1901: on such
systems R assumes that dates prior to that are in the same time zone
as they were in 1902. Most of the world had not adopted time zones by
1902 (so used local ‘mean time’ based on longitude) but for a
few places there had been time-zone changes before then. 64-bit
representations are becoming common; unfortunately on some 64-bit OSes
(notably macOS) the database information is 32-bit and so only
available for the range 1901–2038, and incompletely for the end years.
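The 1901 and 2038 limits follow from the range of a signed 32-bit count of seconds since 1970-01-01 00:00:00 UTC. A short sketch of that arithmetic, with illustrative variable names:

```r
# Limits of a signed 32-bit count of seconds since the epoch.
lower <- -2^31
upper <- 2^31 - 1

# Convert both limits to date-times in UTC.
range_32bit <- as.POSIXct(c(lower, upper), origin = "1970-01-01", tz = "UTC")
range_text <- format(range_32bit, "%Y-%m-%d %H:%M:%S")
# "1901-12-13 20:45:52" "2038-01-19 03:14:07"
```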
Sys.timezone returns an OS-specific character string, possibly
NA or an empty string (which on some OSes means UTC).
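Because the result may be NA or an empty string, code that needs a usable zone name may want to check for both. The sketch below does so; falling back to "UTC" is an assumption of this example, not something R does for you.

```r
tz <- Sys.timezone()

# Treat NA and "" alike; the "UTC" fallback is an assumption of this sketch.
effective_tz <- if (is.na(tz) || !nzchar(tz)) "UTC" else tz
```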
For the default
location = TRUE this will be a location such as
"Europe/London" if one can be ascertained. For
location = FALSE this may be an abbreviation such as
"CEST" on Windows.
OlsonNames returns a character vector.
"UTC" and its synonym
"GMT" are accepted on all platforms.
Where OSes describe their valid time zones can be obscure. The help
for the C function
tzset can be helpful, but it can also be
inaccurate. There is a cumbersome POSIX specification (listed under
environment variable TZ),
which is often at least partially supported, but there are other more
user-friendly ways to specify time zones.
Almost all R platforms make use of a time-zone database originally
compiled by Arthur David Olson and now managed by IANA, in which the
preferred way to refer to a time zone is by a location (typically of a
city), e.g. Europe/London or Pacific/Easter. Some traditional designations are also allowed,
such as EST5EDT or GB. (Beware that some of these
designations may not be what you expect: in particular
EST is a
time zone used in Canada without daylight saving time, and not
EST5EDT nor (Australian) Eastern Standard Time.) The
designation can also be an optional colon prepended to the path to a
file giving compiled zone information (and the examples above are all
files in a system-specific location). See
http://www.twinsun.com/tz/tz-link.htm for more details and
references. By convention, regions with a unique time-zone history
since 1970 have specific names in the database, but those with
different earlier histories may not. Each time zone has one or two
(the second for DST) abbreviations used when formatting times.
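To illustrate a location name with two abbreviations, the sketch below formats one winter and one summer instant in "Europe/London". It assumes a platform with the Olson/IANA database available. The dates and variable names are illustrative.

```r
# Two instants in 2020, given in UTC: one in winter, one in summer.
winter <- as.POSIXct("2020-01-15 12:00:00", tz = "UTC")
summer <- as.POSIXct("2020-07-15 12:00:00", tz = "UTC")

# The abbreviation in use: standard time in winter, DST in summer.
london_abbrev <- c(format(winter, "%Z", tz = "Europe/London"),
                   format(summer, "%Z", tz = "Europe/London"))

# The corresponding offsets from UTC, as signed hours and minutes.
london_offset <- c(format(winter, "%z", tz = "Europe/London"),
                   format(summer, "%z", tz = "Europe/London"))
```

With an up-to-date database, the abbreviations should be "GMT" and "BST", and the offsets "+0000" and "+0100".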
The abbreviations used have changed over the years: for example France used PMT (‘Paris Mean Time’) from 1891 to 1911 then WET/WEST up to 1940 and CET/CEST from 1946. (In almost all time zones the abbreviations have been stable since 1970.) The POSIX standard allows only one or two abbreviations per time zone, so you may see the current abbreviation(s) used for older times.
OlsonNames returns the time-zone names known to
the Olson/IANA database on the current system. The system-specific
location in the file system varies, e.g. ‘/usr/share/zoneinfo’
(Linux, macOS, FreeBSD), ‘/usr/share/lib/zoneinfo’ (Solaris, AIX),
.... It is likely that there is a file named something like
‘zone.tab’ under that directory listing the locations known as
time-zone names (but not for example
EST5EDT): this is read by
OlsonNames. See also
Where R was configured with option --with-internal-tzcode
(the default on macOS and Windows: recommended on Solaris), the database at
file.path(R.home("share"), "zoneinfo") is used by default: file
‘VERSION’ in that directory states the version. Environment
variable TZDIR can be used to point to a different
‘zoneinfo’ directory: this is also supported by the native
services on some OSes, e.g. Linux.
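The following sketch shows one way to inspect what is described above: the names OlsonNames reports, the version of the database installed with R (where there is one), and any TZDIR override. The helper is_known_zone is hypothetical, defined here for illustration only.

```r
# Names known to the Olson/IANA database on this system
# (an empty vector if no database is found).
zones <- OlsonNames()

# Hypothetical helper: is a name listed in the database?
is_known_zone <- function(tz) tz %in% zones

# Location of the database installed with R, where R uses its own code.
internal_db  <- file.path(R.home("share"), "zoneinfo")
version_file <- file.path(internal_db, "VERSION")

# The VERSION file is only present where R installed its own database.
db_version <- if (file.exists(version_file)) {
  readLines(version_file, n = 1L, warn = FALSE)
} else {
  NA_character_
}

# An alternative 'zoneinfo' directory, if one was requested via TZDIR.
tzdir <- Sys.getenv("TZDIR", unset = NA)
```

A call such as is_known_zone("Europe/London") can then guard code that depends on a particular location name being available.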
Most platforms support time zones of the form GMT+n and GMT-n, which assume a fixed offset from UTC (hence no DST). Contrary to some expectations (but consistent with names such as PST8PDT), negative offsets are times ahead of (east of) UTC, positive offsets are times behind (west of) UTC.
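The sign convention is easy to get backwards, so here is a small sketch of it. It assumes a platform that supports the GMT+n and GMT-n forms. The chosen offsets and variable names are illustrative.

```r
noon_utc <- as.POSIXct("2020-01-15 12:00:00", tz = "UTC")

# A positive offset is behind (west of) UTC: five hours earlier here.
behind <- format(noon_utc, "%H:%M", tz = "GMT+5")

# A negative offset is ahead of (east of) UTC: one hour later here.
ahead <- format(noon_utc, "%H:%M", tz = "GMT-1")
```

Where these forms are supported, behind should be "07:00" and ahead "13:00".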
Immediately prior to the advent of legislated time zones, most people used time based on their longitude (or that of a nearby town), known as ‘Local Mean Time’ and abbreviated as LMT in the databases: in many countries that was codified with a specific name before the switch to a standard time. For example, Paris codified its LMT as ‘Paris Mean Time’ in 1891 (to be used throughout mainland France) and switched to GMT+0 in 1911.
Some systems (notably Linux) have a
tzselect command which
allows the interactive selection of a supported time zone name.
There is a system-specific upper limit on the number of bytes in (abbreviated) time-zone names which can be as low as 6 (as required by POSIX). Some OSes allow the setting of time zones with names which exceed their limit, and that can crash the R session.
Since 2007 there has been considerable disruption over changes to the timings of the DST transitions, aimed at energy conservation. These often have short notice and time-zone databases may not be up to date. (Morocco in 2013 announced a change to the end of DST at a day's notice, and in 2015 North Korea gave imprecise information about a change a week in advance.)
On platforms with case-insensitive file systems, time zone names will be
case-insensitive. They may or may not be on other platforms and so, for example,
"gmt" is valid on some platforms and not on others.
Note that except where replaced, the operation of time zones is an OS service, and even where replaced a third-party database is used and can be updated (see the section on ‘Time zone names’). Incorrect results will never be an R issue, so please ensure that you have the courtesy not to blame R for them.