Introduction

Actuaries and other financial analysts often work with data that is associated with a specific interval of time. Moreover, we often speak of a set of intervals, which are closed and have equivalent duration, e.g. a fiscal quarter or a fiscal year.

Though generation of a sequence of dates is straightforward, a sequence of intervals is less so. Further, a sequence of dates is not a sequence of intervals. When generating a sequence of intervals, the actuary has a choice of at least two options. One: generate a sequence of starting dates, with the length of the interval implied. Two: generate two sequences, so that the starting and ending dates are always available and the length of interval may be computed. In the former case, preparation of exhibits will require an additional step to store and/or recover the period. In the latter case, this is still so. In addition, it is possible for a user to alter the elements of one but not both sequences such that the resultant data is inconsistent. In either case, sequences may have data altered so that a rational set of intervals no longer holds.

In either case, there is no guarantee that the sequence of dates will be monotonically increasing, non-overlapping and that the interval will have a consistent period. Further, metadata describing the character of the time interval is stored separately. One is left juggling a bag of variables and ensuring that they are always consistent at any given time. The OriginPeriod class seeks to address these issues.

Data arising from intervals of time

Let's step back for a moment. Although actuaries often present a set of intervals, in general, we are only interested in a single interval which has been divided into sub-intervals. Examine, the timeline below.

BasicTimeline

In general, we want to be clear about the sub-interval to which information belongs. (Note that, in general, intervals are closed. More on this point below.) Each of these sub intervals has a start and end.

Data associated with one and only one interval of time are common in the actuarial field. Premium written or earned, claims incurred, or reported, claims settled, company expenses paid or incurred and many other variants exist. In each case, there is a numeric event which takes place at a specific instant of time. The particular instant is often irrelevant. Typically the day is all that matters and this only so that it may be assigned to a specific interval. Premium earned in a given year, claims incurred in that same year are obvious examples.

It is common to collect this information in a vector and assign it a label which may be easily understood. Incurred loss between accident years 2001 and 2010 would constitute a vector of ten elements. The start and end dates of the interval are implied. This may suffice, but it presumes that the time intervals are understood by the ultimate consumer of the information. If fiscal years begin on days other than January the first, the report is not clear. Consider the problem of assigning a claim to accident year 2001 based on the accident date. The integer 2001 must be converted to a date to determine whether a specific claim

AY <- 2001:2010
library(lubridate)
daysBetween <- as.integer(as.Date('2010-12-31') - as.Date('2001-01-01'))
randomDays <- as.Date('2001-01-01') + sample.int(daysBetween, size = 1000, replace=TRUE)
Claims <- data.frame(AccidentDate = randomDays, LossAmount = rlnorm(1000))

Let's return to our two solutions for storing time interval information. In the first case, we create a single vector which contains the starting points for every interval present, as follows:

ayStart <- seq.Date(from = as.Date("2001-01-01"), to = as.Date("2010-01-01"), by="year")

Inspection of the constructing function reveals that the dates are meant to encompass a single year. However, for this to be evident elsewhere, one must retrieve that original intent somehow. There are several ways to do this. One may code a value for the period and use that during the construction.

myPeriod <- "2 years"
ayStart <- seq.Date(from = as.Date("2001-01-01"), to = as.Date("2010-01-01"), by=myPeriod)

# Do some work

myPeriod

However, there is nothing to prevent the user from altering that value. Further, there is no consistent data type that one may use to store the information. When using seq.Date, one may use a character string or a difftime object. A character can't always be used in other date arithmetic.

One may forego storage of period information and infer the period based on the difference between successive elements of the vector. There are two problems with this approach. First, this must be done whenever one wants to use or report on the period. Second, the interval may not be consistent. Is it measured in days, months, quarters or years?

Finally, a vector has no protection from the user inadvertently modifying the original contents to generate spurious data. For example:

ayStart <- seq.Date(from = as.Date("2001-01-01"), length.out=10, by="year")
ayStart[2] <- as.Date("2005-07-01")

The second element now violates our original intent of a vector where each element is separated by one year.

The other approach alluded to above, wherein the period is implied by the difference between two vectors, suffers from many of these same problems. The start and end date are explicit and clear, however, they are not free from modification, nor is the interval guaranteed to be consistent, nor is the interval readily available without calculation.

OriginPeriod

The OriginPeriod object attempts to alleviate these concerns. An OriginPeriod is an S4 object, which enforces a robust and sensible structure for the kind of date intervals that actuaries typically work with. An OriginPeriod

Once an OriginPeriod is created, it may not be modified arbitrarily. Below, we try to alter the second element of the StartDate slot to something other than what we originally intended. We are not able to do so.

library(OriginPeriod)

op = OriginPeriod(as.Date("2001-01-01"), NumPeriods = 10)
op$StartDate[2] <- as.Date("2005-07-01")

Further, once created, we can't change the period.

op = OriginPeriod(as.Date("2001-01-01"), NumPeriods = 10)
op$Period <- as.period(2, "years")

Moniker

To ease preparation of reports, there is an additional slot for a "Moniker". This is a character label which may be used in plots or other exhibits.

grow

lubridate intervals

lubridate intervals are quite practical when one needs precise measures of time. However, they don't naturally construct as vectors and

library(lubridate)

start <- seq.Date(as.Date('2001-01-01'), by = "1 year", length.out=10)
end <- seq.Date(as.Date('2001-12-31'), by = "1 year", length.out=10)
myIntervals <- new_interval(start, end)

#View(myIntervals)

closed and open intervals

Actuarial information is commonly assigned to closed intervals. An accident period which begins January 1, 2001 and ends December 31, 2001 will include any claim which occurs on January 1, 2001, December 31, 2001 and any day between those two dates. The interval uses the dates at either end inclusively. Contrast this with an interval which is right open, which would include claims which occur before December 31, 2001 but not claims on that date.

rightOpen

The distinction between open and closed intervals is relevant when we need to think of time as a continuous measure. This is important for policies which expire at a specific instant of time (usually midnight). However, this is not often relevant for actuaries who are accustomed to thinking of time reckoned in days, or months or some other discrete quantity. For example, it's common to think of activity as happening "before the close of business" on a particular day. The precise instant of time is not material. (It's not material in this context. In others- for example risk management and mitigation- it may be highly material. Particular times of day may be more or less hazardous than others; rush hour traffic, for example. Also, as noted, it's very important to determine whether a policy will respond.)



PirateGrunt/OriginPeriod documentation built on May 8, 2019, 2:48 p.m.