Base R provides two broad classes for representing dates, POSIXt and Date. Objects of these classes conform with international standards^[Refer to http://www.iso.org/iso/home/standards/iso8601.htm] by marking a "day" with the instant of time that begins the day. It is straightforward to calculate elapsed time in units of "days" using those objects. However, frequent business-use cases express elapsed time in units of "months" or "years" and the definition of "month" or "year" in units of "days" is ambiguous.
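The ambiguity is easy to see with base R alone. The short illustration below (not taken from the package) counts the days in three consecutive calendar months of 2015. Each span is "one month", yet the day counts differ.

```r
# First days of four consecutive months in 2015
starts <- as.Date(c("2015-01-01", "2015-02-01", "2015-03-01", "2015-04-01"))

# Days elapsed from each month start to the next
days_per_month <- as.numeric(diff(starts))
print(days_per_month)  # 31 28 31
```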

The "mondate" package adopts a different perspective. It is based on three principles:

  1. A "month" is marked with the instant in time that "closes business" that month.
  2. A "day" is marked with the instant in time that "closes business" that day and is represented by the portion of the month that has completed by the end of that day.
  3. A "year" equals twelve "months".

mondate objects constructed according to these principles make it straightforward to calculate elapsed time in units of "months" and "years."
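The three principles can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the helper name months_since_origin is hypothetical, each input is treated as the close of business on the named day, and the origin is taken to be the close of business on 1999-12-31.

```r
# Hypothetical helper: months elapsed since the close of 1999-12-31,
# treating each input as the close of business on that day.
months_since_origin <- function(d) {
  lt <- as.POSIXlt(as.Date(d))
  year  <- lt$year + 1900
  month <- lt$mon + 1
  day   <- lt$mday
  # Length of the month = first of next month minus first of this month
  first_this <- as.Date(sprintf("%04d-%02d-01", year, month))
  next_year  <- ifelse(month == 12, year + 1, year)
  next_month <- ifelse(month == 12, 1, month + 1)
  first_next <- as.Date(sprintf("%04d-%02d-01", next_year, next_month))
  days_in_month <- as.numeric(first_next - first_this)
  # Whole months before this one, plus the completed fraction of this month
  12 * (year - 2000) + (month - 1) + day / days_in_month
}

year_end_2014 <- months_since_origin("2014-12-31")   # 180
age_in_years  <- (months_since_origin("2006-02-28") -
                  months_since_origin("1996-02-29")) / 12   # 10
```

Because both February dates are month-ends, the difference is a whole number of months (120), and dividing by twelve gives exactly ten years.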

The purpose of this vignette is to highlight the usefulness of the "mondate" package in everyday and business situations that inherently express time in units of "months" or "years." Technical looks "under the mondate hood" will only be touched on for clarification.

The four major benefits of the mondate package are:

  1. Date Aging
  2. Date Formatting
  3. Date Sequencing
  4. Date Cutting

1. Date Aging

The "age" of an event plays many important roles in business use cases. By default, Date objects are measured in units of "days" and POSIXt objects in units of "seconds". But sometimes it is more convenient to measure elapsed time in units of "months" or "years". This is where mondate comes in.

Example 1

If my "birth event" took place on February 29, 1996 then my age on February 28, 2006 was 10:

require(mondate)
YearsBetween("1996-02-29", "2006-02-28")

or in the US

YearsBetween("2/29/1996", "2/28/2006")

and

MonthsBetween("2/29/1996", "2/28/2006")

which also results when subtracting two mondates

m1 <- mondate.ymd(1996, 2)
m2 <- mondate.ymd(2006, 2)
m2 - m1

Example 2

Suppose ABC Company invoices a customer in late October 2015 and has a policy of recognizing that invoice to have been sent on the 1st of November. This code calculates the ages of that invoice in months as of the ends of 2015 and 2016:

invoiceDate <- as.Date("2015-11-01")
asof <- mondate.ymd(2015:2016)
ages <- asof - invoiceDate
print(ages)

Example 3

The last example in this section is actuarial in nature. Suppose ABC Insurance Co. stores the date of insured losses in the variable (or data base field) DateOfLoss. Here are 10 random dates of loss after the end of 2010:

# generate 10 random dates after 2010
set.seed(1)
z <- rexp(10, .1)
DateOfLoss <- as.Date(mondate.ymd(2010) + z)
names(DateOfLoss) <- paste0("Claim", 1:10)
print(DateOfLoss)

Here are the four quarter-end "as-of dates" in 2013:

# Quarter-ends in 2013
asof <- mondate.ymd(2013, 3 * 1:4)
names(asof) <- paste0("Q", 1:4)
print(asof)

Comment:
"names" were assigned to DateOfLoss and asof so that R will automatically embellish the data.frame below with row and column headers.

Here are the ages of the 10 losses as of each quarter end, stored in a data.frame with each claim's date of loss:

# a matrix of ages in units of months
Ages <- round(sapply(asof, `-` , DateOfLoss), 1)
# code ages as "not available" if the evaluation date precedes
# the Date of Loss (one instance)
Ages[Ages <= 0] <- NA
data.frame(DateOfLoss, Ages)

Date Arithmetic

mondates can act arithmetically in (almost always) the same way their underlying numeric can act.^[The underlying numeric measures the number of months since the close of business 1999-12-31 (".mondate.origin").] In particular, use subtraction to measure the magnitude of the interval between two dates in units of months.

For example, the following two calculations yield the same result:

mondate("2015-12-31") - mondate("2014-12-31")
mondate("2015-12-31") - as.Date("2015-01-01")

Why are the results identical even though the subtrahends would appear to be a day apart? The answer is that the two objects, as.Date("2015-01-01") and mondate("2014-12-31"), represent the same instant in time, i.e., the moment that separates events occurring in 2014 from events occurring in 2015. This points out a new feature in mondate v1.0.

Dates can be subtracted directly from mondates.

2. Date Formatting

mondate enables dates to be read and displayed in more than one format. The built-in formats that are automatically recognized are currently

  1. USa: "%m/%d/%Y"
  2. USb: "%m-%d-%Y"
  3. EUa: "%Y-%m-%d"
  4. EUb: "%Y/%m/%d"

The order can be changed and new formats added using base::options for display ("writing") and set.mondate.displayFormats for "reading".

"writing": dynamic format display

By default, mondate looks at your value of Sys.getlocale("LC_TIME") at startup. If "United States" appears, then USa format is selected; otherwise, EUa is selected. This default can be changed globally for all mondate objects in the session or for individual objects.

Example 4

This vignette is being written in the US, so today's date, Sys.Date(), will be represented using the first format above by default:

mondate(Sys.Date())

That default can be changed to the international standard format^[ibid.] "YYYY-MM-DD" using base::options and the name mondate.default.displayFormat:

options(mondate.default.displayFormat = "%Y-%m-%d")
mondate(Sys.Date())

Example 5

French users of the format "dd/mm/YYYY" can establish that default as follows:

options(mondate.default.displayFormat = "%d/%m/%Y")
mondate(Sys.Date())

The options approach modifies the default display format for all mondates in the R session. To set the display format for just one instance of a mondate object, use the displayFormat argument during the object's creation.

Example 6

Here we create the first 6 month-ends of 2015 to be displayed in the French format above despite the fact that the default format is first changed to the ISO standard:

options(mondate.default.displayFormat = "%Y-%m-%d")
mondate(Sys.Date())
m <- mondate.ymd(2015, 1:6, displayFormat = "%d/%m/%Y")
print(m)
options(mondate.default.displayFormat = NULL)

More creative formats can be used, as for instance to display just the year and month, as was done in "Example 3" above.
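The display formats are ordinary strptime-style conversion codes, so their effect can be previewed with base R's format() on a Date, without involving mondate at all. The snippet below is a small illustration of that idea.

```r
d <- as.Date("2015-11-28")

us_style     <- format(d, "%m/%d/%Y")  # "11/28/2015"
french_style <- format(d, "%d/%m/%Y")  # "28/11/2015"
year_month   <- format(d, "%Y-%m")     # "2015-11"

print(c(us_style, french_style, year_month))
```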

options(mondate.default.displayFormat = NULL)

"reading": dynamic format detection

As previously mentioned, the mondate package has four built-in formats, returned by get.mondate.displayFormats(), for detecting whether a character string represents a "date." To inform mondate of another format for converting character to date, use set.mondate.displayFormats^[This function sets the options value of mondate.displayFormats] with the value(s) of your choice.

Example 7

To set the French format "dd/mm/yyyy" as Priority One for detecting dates, add that format to the head of the current list of detectable formats. The code below accomplishes that:^[This example is given in ?set.mondate.displayFormats]

set.mondate.displayFormats(c("%d/%m/%Y", 
                             get.mondate.displayFormats()), 
                           clear = TRUE)
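One rough way to picture why the order of the list matters, using only base R: try each candidate format in turn and keep the first one that parses every string. The helper name parse_first_match is hypothetical, and this sketch is not necessarily how mondate performs its detection internally.

```r
# Hypothetical helper: return the dates produced by the first format
# that successfully parses every element of x.
parse_first_match <- function(x, formats) {
  for (f in formats) {
    d <- as.Date(x, format = f)
    if (!anyNA(d)) return(d)   # first format that parses all inputs wins
  }
  stop("no candidate format parsed all inputs")
}

# The four built-in formats listed earlier in this section
builtin <- c("%m/%d/%Y", "%m-%d-%Y", "%Y-%m-%d", "%Y/%m/%d")

us <- parse_first_match("2/29/1996", builtin)
# With the French format placed first, it takes priority
fr <- parse_first_match(c("28/11/2015", "29/11/2015"),
                        c("%d/%m/%Y", builtin))
print(us)
print(fr)
```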

Continuing, suppose dates in a spreadsheet are saved to a csv file in France and the read.csv function results in this data.frame:

data <- data.frame(
  cbind(Invoice=c("A", "B", "C"),
        datechar = c("28/11/2015", "29/11/2015", "30/11/2015")))
print(data)

The character dates can be converted automatically to Date objects via mondate as follows

data$InvoiceDate <- as.Date(mondate(data$datechar))
print(data)

For more information on the codes to use when formatting dates, see the R help page for the strptime function. To add additional defaults according to your value of Sys.getlocale("LC_TIME"), contact the author^[chiefmurphy at gmail]. (All are welcome to visit the package's public repository at https://github.com/chiefmurph/mondate.)

3. Date Sequencing

Generating sequences of dates in units of days or weeks is easily accomplished using base R's Date class:

seq(as.Date("2015-11-01"), by = "day", length.out = 5)

Month-sequences can similarly be generated with Dates, which does work well for most dates. Results can be disappointing, however, for dates near the end of the month. Compare these two sequences starting from the first and last days of January:

seq(as.Date("2015-01-01"), by = "month", length.out = 5)
seq(as.Date("2015-01-31"), by = "month", length.out = 5)

All dates in the first sequence are the first days of the month, but some dates in the second sequence "leak" into subsequent months. This behavior is well documented in R help^[see ?seq.POSIXt]:

Using "month" first advances the month without changing the day: if this results in an invalid day of the month, it is counted forward into the next month

Perhaps the major purpose of the mondate package is to avoid this shortcoming.^["Under the hood" mondate represents dates relative to the percent of the month that has transpired by the close of business that day. Kudos to Damien Laker! See the "Thank You" section at the end.]
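For comparison, a common base-R workaround for month-end sequences is to generate the first days of the following months and step back one day. It is shown here only to illustrate the problem being solved; it is not part of mondate.

```r
# First days of February through June 2015
firsts <- seq(as.Date("2015-02-01"), by = "month", length.out = 5)

# Stepping back one day gives the last days of January through May
month_ends <- firsts - 1
print(month_ends)
# "2015-01-31" "2015-02-28" "2015-03-31" "2015-04-30" "2015-05-31"
```

The workaround gives the right dates, but the caller has to remember to shift the starting point by a month and subtract a day each time.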

Example 8

Sequences of month ends can be accomplished in various "mondate" ways. Here are two:

seq(mondate("2015-01-31"), by = "month", length.out = 5)
mondate.ymd(2015, 1:5)

The display format in the first sequence inherits from the format of the character representation of the beginning date. The display format in the second sequence is based on the author's locale (see "Date Formatting" section above). Also note that each of the objects generated above is of class "mondate".

It is often more convenient to generate month sequences from Date objects, and produce Date objects, without having to resort to a mondate object in between. For that purpose the seqmondate generic function was written.

seqmondate

seqmondate(x) generates sequences of class(x) for a variety of classes: Date, POSIXt, and mondate. For any other class(x) seqmondate(x) will produce a sequence of mondates, if possible. By default, 'by = "month"' is assumed.

Example 9

The first sequence below generates the same month-ends as in Example 8 but this time the objects generated are Dates.

(d <- seqmondate(as.Date("2015-01-01"), length = 5))
class(d)

Example 10: Year-ends

It has been said that there are always multiple ways to do things in R and this is no exception. Here are two ways to generate sequences of year-end dates.

seqmondate("2010-12-31", by = "year", length = 6)
mondate.ymd(2010:2015)

If month- and year-end dates are intended to represent "as of" dates, it is preferable to create them as mondate objects rather than, say, Date objects if those dates will be used for "date aging" in units of months/years.

4. Date Cutting

Sidebar on "cut" for numerics
A cut of a numeric 'x' is a collection of (half-open,half-closed] intervals that "cover" 'x'. By "cover" is meant that every value in 'x' is contained in some interval^[thus, not an "open cover" in the topological sense], with the exception that R excludes the minimum value of 'x' by default.^[unless you set include.lowest = TRUE] By default, the right endpoint is assumed to be closed.^[change the interval to left-closed by setting right = FALSE]

A cut in R is represented by a factor. The cut function elegantly enunciates the numeric intervals by clearly identifying the (open,closed] borders in the factor's levels.
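A small base-R illustration of these defaults on a plain numeric vector follows. The values are arbitrary and chosen so that the smallest one sits exactly on the lowest break.

```r
x <- c(1, 1.5, 2, 3)

# Default: (open, closed] intervals; a value equal to the lowest break is dropped
default_cut <- cut(x, breaks = 1:3)
print(default_cut)   # <NA> (1,2] (1,2] (2,3]

# include.lowest = TRUE closes the first interval on the left
lowest_cut <- cut(x, breaks = 1:3, include.lowest = TRUE)
print(lowest_cut)    # [1,2] [1,2] [1,2] (2,3]

# right = FALSE switches to [closed, open) intervals
left_cut <- cut(x, breaks = 1:3, right = FALSE)
print(left_cut)      # [1,2) [1,2) [2,3) <NA>
```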

R Definition: A cut of a set of dates 'x' by "months" is a collection of contiguous months such that every date in x is contained in some month.

This correspondence between a date and its neighboring members in its 'cut' can be an important factor in the statistical analysis of events occurring during similar time periods.

There is a cut method for mondates when the 'breaks' argument is numeric or character.

First we will define some cuts. Then we will see how one might use a cut.

Example 11

Because a mondate is fundamentally a numeric^[the "mondate class" is defined via setClass("mondate", contains = "numeric", etc.], the following two commands -- the first on numeric, the second on mondate -- are fundamentally the same. The only difference is how the cuts' levels display.

cut(seq(from = 180.5, to = 185.5, by = .5), breaks = 180:186)
cut(seq(from = mondate(180.5), to = mondate(185.5), by = .5), breaks = 180:186)

In the month intervals above, if one were to label the interval with one of the endpoints, it seems natural to choose the closed endpoint. That is the 'mondate' convention when 'breaks' is character. This bears repeating:

The 'mondate' convention is to label a character cut (breaks = "days", "months", ...) with the closed endpoint of the interval. As with cut.default, the closed endpoint is determined by the argument right: when TRUE the right endpoint labels the interval, when FALSE the left endpoint labels the interval.

We begin with examples of mondate cuts, with breaks being numeric and character.

Example 12

The following two commands generate the same cut. The first explicitly sets the break points with the month-ends beginning 2014-12-31 and ending six months later. The second implicitly sets the same break points. As with cut.default, the labels of the first cut clearly enunciate the (open,closed] monthly intervals. The labels of the second cut only display the closed endpoint.

cut(seq(mondate("2015-01-15"), mondate("2015-06-15"), by = .5), 
    breaks = mondate.ymd(2014) + 0:6)
cut(seq(mondate("2015-01-15"), mondate("2015-06-15"), by = .5), breaks = "month",
    include.lowest = TRUE)

In the case that breaks is character it is unfortunate to have to set include.lowest = TRUE, opposite its default value FALSE.^[Excluding the minimum value of 'x' would be somewhat random -- forgive the colloquialism -- given that other values of 'x' are likely to be in the same time interval. In the case of character breaks, include.lowest=FALSE throws an error.] Other cut arguments as well have default values that may seem counterintuitive for cutting dates.

Perhaps the most troubling default is right = TRUE for Date objects because it violates the basic principle that Date objects begin on, and can be considered synonymous with, the instant beginning the day, i.e., the left endpoint. For those and other reasons a new cutmondate generic exists in mondate v1.0.

The cutmondate methods work on Date, mondate, and other objects with arguments that are more appropriate for their class.

Additionally, three new arguments were added to cut.mondate, which we will cover in due course.

We now turn our attention to the cutmondate methods.

cutmondate

The 'cutmondate' collection of methods is most effective when 'breaks' defines a cover in terms of months or multiple months.

When the object being cut is a Date or POSIXt, the breakpoints are assumed to begin the period, right = FALSE by default, and the levels are labeled by the first date in the period. If necessary, set labels to the right endpoint with right = TRUE.

Example 13

Here we regenerate the same DateOfLoss dates from Example 3, and cut them into month intervals.

set.seed(1)
z <- rexp(10, .1)
monDOL <- mondate.ymd(2010) + z
DateOfLoss <- as.Date(monDOL)
print(DateOfLoss)
cutmondate(DateOfLoss)

The "28 Levels" says that it takes 28 contiguous months to cover 'DateOfLoss'. Note that the levels are labeled with the first day of each month because in this case right=FALSE by default. Specify right=TRUE and the levels are labeled by the last day of the month, which occurs by default in the second, mondate case below:

cutmondate(DateOfLoss, right = TRUE)
cutmondate(monDOL)

Before tackling the final examples, it is important to point out three new arguments for cut.mondate (and therefore for cutmondate) that do not appear in cut.default or cut.Date:

See the help for cut.mondate for details behind these arguments. We will show use of the first and third.

Fiscal Years

The 'startmonth' argument enables fiscal year cuts!

Example 14

Suppose ABC's fiscal year is July 1 through June 30. The dates of loss in the previous examples can be cut into fiscal years by setting startmonth = 7.

Here we show two ways to cut DateOfLoss by fiscal year, the choice depending on whether the company identifies its fiscal year with the beginning day or ending day of the period.

cutmondate(DateOfLoss, breaks = "year", startmonth = 7)
cutmondate(DateOfLoss, breaks = "year", startmonth = 7, right = TRUE)

Continuing, suppose ABC Company conventionally refers to a fiscal year not by "the beginning date" but by the first calendar year. The dates can be cut and the abbreviated labels automatically generated in the single function call:

cutmondate(mondate(DateOfLoss, displayFormat = "%Y"), 
           breaks = "year", right = FALSE, startmonth = 7)
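The labeling convention itself (a fiscal year named after the calendar year in which it begins) can be sketched in base R. The helper name fiscal_year_start is hypothetical and the dates are made up for illustration; the sketch only computes the label and does not reproduce the factor that cutmondate returns.

```r
# Hypothetical helper: calendar year in which the fiscal year containing
# each date begins, for a fiscal year starting in month `startmonth`.
fiscal_year_start <- function(d, startmonth = 7) {
  lt <- as.POSIXlt(as.Date(d))
  year  <- lt$year + 1900
  month <- lt$mon + 1
  # Dates before the start month belong to the fiscal year begun last year
  year - (month < startmonth)
}

example_dates <- as.Date(c("2012-03-15", "2012-06-30", "2012-07-01"))
fy_labels <- fiscal_year_start(example_dates)
print(fy_labels)  # 2011 2011 2012
```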

In the final example we aggregate and plot data by fiscal year.

Example 15

ABC Insurance Co. records loss amounts associated with the dates of loss at regular intervals. Suppose the amounts as of 2016-06-30 are

(LossAmount <- round(rnorm(10, 1000, 100), -1))

ABC's actuaries want to aggregate loss by age. The C-Suite wants aggregations by fiscal year. Everyone wants visuals! No problem.

First, cut the loss dates into fiscal years (FY), but this time also use attr.breaks = TRUE.

To make the break points available for subsequent date aging, set attr.breaks = TRUE in cutmondate.

FY <- cutmondate(mondate(DateOfLoss, displayFormat = "%Y"), 
           breaks = "year", right = FALSE, startmonth = 7,
           attr.breaks = TRUE)
asof <- mondate.ymd(2016, 6)
age <- asof - attr(FY, "breaks")[FY]
(data <- data.frame(DateOfLoss, LossAmount, FY, FYage = age))

Calculate loss totals -- here we use aggregate -- and plot those totals -- here we use base R graphics. The first plot is by FY, the second by FY age.

(LossByFY <- aggregate(LossAmount ~ FY, data, sum))
barplot(LossByFY$LossAmount, names.arg = LossByFY$FY,
        ylab = "$thousands",
        xlab = "Fiscal Year",
        main = paste("Total Loss\n As of", asof),
        col = "blue")
# and
(LossByFYage <- aggregate(LossAmount ~ FYage, data, sum))
barplot(LossByFYage$LossAmount, names.arg = LossByFYage$FYage,
        ylab = "$thousands",
        xlab = "Fiscal Year Age (months)",
        main = paste("Total Loss\n As of", asof),
        col = "blue")

If fiscal year coincides with calendar year (January 1 through December 31) then base R's cut method for Date works well for calculating accident year (AY), and calculating age by month is even more straightforward.

Example 16

ABC wants to calculate the age of the losses as of the end of 2016. First, cut the dates of loss into accident years using base cut. Because base cut labels each interval with the first day of the period, subtracting that date from the as-of date, mondate("2016-12-31"), yields the age in months:

AY <- cut(DateOfLoss, breaks = "years")
AYage <- mondate("2016-12-31") - as.Date(AY)
print(AYage)
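The labeling behavior of base cut that this example relies on can be checked in isolation. The dates below are made up for illustration.

```r
d <- as.Date(c("2011-03-15", "2012-07-01", "2012-12-31"))

# cut.Date labels each yearly interval with the first day of the year
ay <- cut(d, breaks = "years")
ay_levels <- levels(ay)
print(ay_levels)  # "2011-01-01" "2012-01-01"

# Converting the factor back to Date recovers the period start for each element
ay_start <- as.Date(as.character(ay))
print(ay_start)   # "2011-01-01" "2012-01-01" "2012-01-01"
```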


Summary

The mondate package represents dates in a way that enables date aging and sequencing in a mathematically "invertible" manner.^[Invertible in the sense that retracing a monthly sequence from the end should produce the same sequence but in reverse order. That is not always possible with Date sequences by "month". Consider that
seq(as.Date("2015-01-31"), by = "month", length.out = 2)
yields
"2015-01-31" "2015-03-03"
whereas
seq(as.Date("2015-03-03"), by = "-1 month", length.out = 2)
yields
"2015-03-03" "2015-02-03"]

A mondate object is not appropriate for all situations. For example, a mondate halfway through the month of February falls on the close of business of the 14th day (in most cases) but falls on the 15th day for April. If a time period other than month or year is more suited to the situation, use an R object other than mondate.
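The arithmetic behind that remark is simply half of each month's length, sketched here for a non-leap-year February and for April:

```r
# Days in each month (February in a non-leap year)
days_in_month <- c(February = 28, April = 30)

# Day whose close of business marks the halfway point of the month
half_month_day <- 0.5 * days_in_month
print(half_month_day)  # February 14, April 15
```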

Factors that associate similar events by date can be created by R's cut methods. A cut method exists for mondate objects but the preferable function to use for cutting those and other date-representing objects (Date, POSIXt) is cutmondate because the arguments default to values intuitively appropriate for the object. The "startmonth" argument allows the creation of fiscal year cuts.

Thank you

Many thanks to the R Development team for their work on Date and POSIXt objects and methods.

A special thank you goes out to Gabor Grothendieck for his suggestion of, and help with, cut.mondate.

Finally, the "mondate perspective" was motivated by Damien Laker in his somewhat obscure 2008 paper Time Calculations for Annualizing Returns: The Need for Standardization^[The Journal of Performance Measurement, Summer 2008] where he states the obvious :-)

"Annualization calculations based on whole months never wind up accidentally calculating that a year is anything other than a year long."

Mr. Laker's Recommended Method is based on two cases:^[Ibid. Although the terms "start on" and "finishes on" are not specifically defined in Mr. Laker's paper, the spirit is intuitively understood and reflected in the package.]

  1. "For any period that starts and finishes on the last day of a month, the time calculation can be done entirely in months."
  2. "In cases where the start date or end date is not the last day of a month, it will be necessary to count a partial month."

The mondate package embraces this end-of-business, month-centric, a-year-equals-twelve-months perspective.



chiefmurph/mondate documentation built on Aug. 29, 2022, 4:13 p.m.