STL: Seasonal and Trend Decomposition of a Time series via Loess

View source: R/STL.R

STL R Documentation

Seasonal and Trend Decomposition of a Time series via Loess

Description

Decompose a time series into seasonal, trend, and irregular components using loess. STL() is a wrapper for the Base R stl function that provides additional information and accepts more general input than the time series object that stl requires. The function accepts a Base R time series from the global environment, and also accepts data in the traditional x,y format where x is a variable of type Date. The Date variable can also be inferred from character string inputs that express dates with digits. The time unit of the input dates can be aggregated, such as changing monthly dates to quarterly dates.

Usage

STL(x, y=NULL, data=d, filter=NULL,
    ts_format=NULL, ts_unit=NULL, ts_agg=c("sum","mean"),
    show_range=FALSE, robust=FALSE, quiet=FALSE, do_plot=TRUE)

Arguments

x

Dates for the time series within a data frame, or a time series object created with the Base R function ts.

y

Numerical variable that is plotted in the time series. Not used if x is a time series object.

data

Data frame that contains the x and y variables. Default data frame is d.

filter

A logical expression that specifies a subset of rows of the data frame to analyze.

ts_format

A specified format (for the R function as.Date()) that describes the values of the date variable on the x-axis, needed if the function cannot identify the correct date format to properly decode the given date values. For example, describe a character string date such as "09/01/2024" by the format "%m/%d/%Y". See Details for more information.

ts_unit

Time unit of the time series, applicable when the x-variable is of type Date. Default value is the time unit that describes the time intervals as they occur in the data. The data are aggregated according to the specified time unit, such as a daily time series aggregated to "months". Dates are currently stored as variable type Date, which stores calendar dates without times of the day. Valid values include "days", "weeks", "months", "quarters", and "years", as well as "days7" to evaluate seasonality for daily data on a weekly instead of annual basis. Otherwise, for forecasting, the time unit for detecting seasonality will usually be "months" or "quarters".

ts_agg

Function by which to aggregate over time according to ts_unit. Default is "sum", with "mean" as the alternative.

show_range

Display the range for each component.

robust

Passed to the robust parameter of stl(). If TRUE, robust fitting is used in the loess procedure, which reduces the influence of outliers.

quiet

If TRUE, no text output to the console.

do_plot

If FALSE, no plot.
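
The aggregation described for ts_unit and ts_agg can be pictured with a small Base R sketch. It only illustrates what summing or averaging daily values into months means. It is not the code that STL() runs, and the data frame and variable names here are hypothetical.

```r
# Hypothetical daily data: one unit per day for the first quarter of 2024,
# so each monthly sum equals the number of days in that month.
daily <- data.frame(
  date = seq(as.Date("2024-01-01"), as.Date("2024-03-31"), by = "day")
)
daily$y <- 1

# Label each day with its year and month, the coarser time unit.
daily$month <- format(daily$date, "%Y-%m")

# Aggregate by sum (analogous to ts_agg="sum") and by mean (ts_agg="mean").
monthly_sum  <- aggregate(y ~ month, data = daily, FUN = sum)
monthly_mean <- aggregate(y ~ month, data = daily, FUN = mean)

print(monthly_sum)
print(monthly_mean)
```

A sum suits quantities that accumulate over the period, such as sales, while a mean suits levels, such as prices or temperatures.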

Details

PURPOSE
Obtain and plot the seasonal, trend, and irregular (remainder or residual) components of a time series using the Base R stl function. The corresponding plot has four panels, one for the data and one each for the seasonal, trend, and remainder components. Additional output compares the relative sizes of the components: the percent of variance accounted for by each component and the range of values of each component.
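
As a rough illustration of these comparisons, the following Base R sketch decomposes the built-in nottem series with stl() and computes one plausible version of the percent of variance for each component. The choice of s.window = "periodic" and the definition of the percentage as component variance divided by data variance are assumptions made for illustration, not necessarily the computation that STL() performs.

```r
# Decompose a built-in monthly series with the Base R stl() function.
# s.window = "periodic" is an assumption made for this sketch.
fit <- stl(nottem, s.window = "periodic")

# Collect the three components as plain numeric columns.
comp <- data.frame(
  seasonal  = as.numeric(fit$time.series[, "seasonal"]),
  trend     = as.numeric(fit$time.series[, "trend"]),
  remainder = as.numeric(fit$time.series[, "remainder"])
)

# The decomposition is additive: the components sum to the data.
reconstructed <- unname(rowSums(comp))
additive_ok <- isTRUE(all.equal(reconstructed, as.numeric(nottem)))
print(additive_ok)

# Assumed definition: variance of each component as a percent of the
# variance of the data. The percentages need not sum to 100 because
# the components can be correlated.
pct_var <- 100 * sapply(comp, var) / var(as.numeric(nottem))
print(round(pct_var, 1))
```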

Seasonality is detected over a year, such as four quarters in a year or 12 months in a year. The exception is for daily data, for which seasonality can be indicated by the time unit of "days7", which will evaluate seasonality over the seven days of a week.
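
The weekly case can be sketched in Base R by giving a daily series a frequency of 7 before calling stl(). The simulated data, the weekly pattern, and s.window = "periodic" are all assumptions for illustration. How STL() sets this up internally for "days7" is not shown here.

```r
set.seed(123)

# Hypothetical daily series: 20 weeks with a fixed day-of-week pattern
# plus a small amount of random noise.
n_weeks <- 20
weekly_pattern <- c(5, 3, 0, -2, -3, -1, -2)
y <- 50 + rep(weekly_pattern, times = n_weeks) + rnorm(7 * n_weeks, sd = 0.5)

# frequency = 7 tells stl() that one seasonal cycle spans seven days.
daily_ts <- ts(y, frequency = 7)
fit <- stl(daily_ts, s.window = "periodic")

# With a periodic seasonal window, the seasonal estimate repeats
# identically every seven observations.
seasonal <- as.numeric(fit$time.series[, "seasonal"])
print(round(seasonal[1:7], 2))
```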

RANGE BARS
By definition, the data shows the most variability compared to the three components. If the four panels were scaled on the same y-axis, then the relative magnitude of the variations in each of the components, as assessed by the ranges of each of their values, would be more directly observable. For example, if seasonality has no practical presence in the data, then the amplitude of the seasonal plot, the range of the seasonal component values, would be a small fraction of the amplitude of the data plot, only reflecting random noise. Plotted on the same panel, the comparison would be direct.

Instead, however, the plots of the data and each of the three components are drawn such that each component is plotted on its own panel with its own scale with the most detail possible. The purpose of the range bars is to show a relative scale for comparison across the panels. Each range bar is a magnification indicator. The larger the bar, the more expanded is the corresponding panel, which means the smaller the variation of the component relative to the range of the data. Shrinking the size of a range bar along with the corresponding panel to the same size as the range bar for the data, the smallest range bar, would show the comparison directly.
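
The idea of a range bar as a magnification indicator can be made concrete with a short Base R sketch. It computes the range of the data and of each component from stl(), then expresses each panel's magnification as the ratio of the data range to the component range. The ratio is an illustrative assumption about how to quantify the comparison, and s.window = "periodic" is likewise assumed.

```r
# Decompose a built-in monthly series (s.window is an assumption here).
fit <- stl(nottem, s.window = "periodic")

# The data and the three components as plain numeric vectors.
series <- list(
  data      = as.numeric(nottem),
  seasonal  = as.numeric(fit$time.series[, "seasonal"]),
  trend     = as.numeric(fit$time.series[, "trend"]),
  remainder = as.numeric(fit$time.series[, "remainder"])
)

# Range (max minus min) of each series.
ranges <- sapply(series, function(v) diff(range(v)))

# Magnification of each panel relative to the data panel: a component
# with a small range is stretched more to fill its own panel.
magnification <- ranges[["data"]] / ranges

print(round(ranges, 2))
print(round(magnification, 2))
```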

DATE FORMAT
STL() makes a reasonable attempt to decode character string date values for the x-axis variable, as read from a text data file such as a csv file. Some date formats are not converted by default, such as date values that include the name of the month instead of its number. In general, there is no guarantee that a date format is inferred correctly because date values can be inherently ambiguous. If the default date conversion is not working, then manually supply the date format with the parameter ts_format, following one of the format examples in the following table.

Date                   Format
"2022-09-01"           "%Y-%m-%d"
"2022/9/1"             "%Y/%m/%d"
"2022.09.01"           "%Y.%m.%d"
"09/01/2022"           "%m/%d/%Y"
"9/1/15"               "%m/%d/%y"
"September 1, 2022"    "%B %d, %Y"
"Sep 1, 2022"          "%b %d, %Y"
"20220901"             "%Y%m%d"

For emphasis, each range bar is displayed in a pale yellow color.
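
The numeric formats in the date format table above can be checked directly with the Base R function as.Date(), which is the function that ts_format is written for. The sketch below is only a way to verify a format string before passing it to ts_format. The month-name formats are left out because their parsing depends on the locale. The last two lines show why inference can fail: the same string is a valid date under two different formats.

```r
# Character dates paired with the matching format strings.
date_strings <- c("2022-09-01", "2022/9/1", "2022.09.01",
                  "09/01/2022", "9/1/15", "20220901")
formats <- c("%Y-%m-%d", "%Y/%m/%d", "%Y.%m.%d",
             "%m/%d/%Y", "%m/%d/%y", "%Y%m%d")

# Parse each string with its own format and combine into one Date vector.
parsed <- do.call(c, unname(Map(function(s, f) as.Date(s, format = f),
                                date_strings, formats)))
print(data.frame(input = date_strings, format = formats, parsed = parsed))

# An inherently ambiguous value: month first or day first?
month_first <- as.Date("09/01/2022", format = "%m/%d/%Y")
day_first   <- as.Date("09/01/2022", format = "%d/%m/%Y")
print(c(month_first, day_first))
```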

Value

An stl() object and text to the console.

Here is an example of saving the output to an R object with any valid R name, such as s: s <- STL(Month, Price). To see the names of the output objects for that specific analysis, enter names(s). To display any of the objects, precede the name with s$, such as to view the trend component with s$trend. Or, list only the name of the output object, s, to display the four output components as a single data frame. View the output at the R console or within a markdown document that displays your results.

x.name: Name of the date variable on the horizontal axis.
trend: Value of the trend component for each x-value.
season: Value of the season component for each x-value.
error: Value of the error component for each x-value.

Author(s)

David W. Gerbing (Portland State University; gerbing@pdx.edu)

See Also

stl.

Examples

# read the built-in data set dataStockPrice
d <- Read("StockPrice")
# extract just the data for Apple, the first 473 rows of data
d <- d[1:473,]

# manually request the STL for d
STL(Month, Price)

# enter a time series, here one that comes with Base R
# monthly average air temperatures in Nottingham, UK from 1920 to 1939
# get the time series into the global environment
my.ts <- nottem
STL(my.ts)

lessR documentation built on April 4, 2025, 12:31 a.m.