growth_rate | R Documentation |
Estimates the growth rate of a signal at given points along the underlying sequence. Several methodologies are available; see the growth rate vignette for examples.
growth_rate(
x = seq_along(y),
y,
x0 = x,
method = c("rel_change", "linear_reg", "smooth_spline", "trend_filter"),
h = 7,
log_scale = FALSE,
dup_rm = FALSE,
na_rm = FALSE,
...
)
x |
Design points corresponding to the signal values |
y |
Signal values. |
x0 |
Points at which we should estimate the growth rate. Must be a
subset of |
method |
Either "rel_change", "linear_reg", "smooth_spline", or "trend_filter", indicating the method to use for the growth rate calculation. The first two are local methods: they are run in a sliding fashion over the sequence (in order to estimate derivatives and hence growth rates); the latter two are global methods: they are run once over the entire sequence. See details for more explanation. |
h |
Bandwidth for the sliding window, when |
log_scale |
Should growth rates be estimated using the parametrization
on the log scale? See details for an explanation. Default is |
dup_rm |
Should we check and remove duplicates in |
na_rm |
Should missing values be removed before the computation? Default
is |
... |
Additional arguments to pass to the method used to estimate the derivative. |
The growth rate of a function f defined over a continuously-valued parameter t is defined as f'(t) / f(t), where f'(t) is the derivative of f at t. To estimate the growth rate of a signal in discrete-time (which can be thought of as evaluations or discretizations of an underlying function in continuous-time), we can therefore estimate the derivative and divide by the signal value itself (or possibly a smoothed version of the signal value).
The following methods are available for estimating the growth rate:
"rel_change": uses (B/A - 1) / h, where B is the average of y
over the second half of a sliding window of bandwidth h centered at the
reference point x0, and A the average over the first half. This can be seen as
using a first-difference approximation to the derivative.
"linear_reg": uses the slope from a linear regression of y
on x over a sliding window centered at the reference point x0, divided by the
fitted value from this linear regression at x0.
"smooth_spline": uses the estimated derivative at x0
from a smoothing spline fit to x and y, via stats::smooth.spline(), divided by
the fitted value of the spline at x0.
"trend_filter": uses the estimated derivative at x0
from polynomial trend filtering (a discrete spline) fit to x and y, via
genlasso::trendfilter(), divided by the fitted value of the discrete spline at
x0.
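As a rough illustration of the two local methods, the base-R sketch below evaluates each at a single reference point. The helper names rel_change_at() and linear_reg_at() are hypothetical and are not the package's internal functions, and the way the window is split into a first and second half is an assumption made for this example.

```r
# Hypothetical helpers for illustration only; not the package's internals.
rel_change_at <- function(x, y, x_ref, h) {
  # Assumed convention: first half is (x_ref - h, x_ref],
  # second half is (x_ref, x_ref + h].
  first_half <- x > x_ref - h & x <= x_ref
  second_half <- x > x_ref & x <= x_ref + h
  a <- mean(y[first_half])
  b <- mean(y[second_half])
  (b / a - 1) / h
}

linear_reg_at <- function(x, y, x_ref, h) {
  # Assumed convention: all points within distance h of the reference point.
  in_window <- abs(x - x_ref) <= h
  window_data <- data.frame(x = x[in_window], y = y[in_window])
  fit <- stats::lm(y ~ x, data = window_data)
  slope <- unname(stats::coef(fit)["x"])
  fitted_at_ref <- unname(stats::predict(fit, newdata = data.frame(x = x_ref)))
  slope / fitted_at_ref
}

# Linear signal: the derivative is 5 everywhere, so the growth rate at x = 15
# is 5 / 175.
x <- 1:30
y <- 100 + 5 * x

gr_rel <- rel_change_at(x, y, x_ref = 15, h = 7)
gr_lin <- linear_reg_at(x, y, x_ref = 15, h = 7)
```

On this noiseless linear signal the regression-based estimate recovers 5 / 175 up to floating-point error, while the relative change estimate differs slightly because it compares window averages rather than fitting a slope.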
An alternative view for the growth rate of a function f in general is given
by defining g(t) = log(f(t)), and then observing that g'(t) = f'(t) /
f(t). Therefore, any method that estimates the derivative can be simply
applied to the log of the signal of interest, and in this light, each
method above ("rel_change", "linear_reg", "smooth_spline", and
"trend_filter") has a log scale analog, which can be used by setting
log_scale = TRUE.
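The identity g'(t) = f'(t) / f(t) can be checked numerically. The following is a minimal sketch in base R, using a made-up exponential signal whose growth rate is known; it only illustrates the idea behind the log scale and is not how growth_rate() is implemented.

```r
# Made-up exponential signal: f(t) = 50 * exp(r * t), so f'(t) / f(t) = r.
x <- 0:20
r <- 0.1
y <- 50 * exp(r * x)

# Log scale view: estimate the derivative of log(y) with a regression slope.
fit_log <- stats::lm(log(y) ~ x)
gr_log <- unname(stats::coef(fit_log)["x"])

# Original scale view: exact derivative divided by the signal value.
gr_direct <- (r * y) / y
```

Both gr_log and every entry of gr_direct equal r up to floating-point error, since log(y) is exactly linear in x for this signal.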
For the local methods, "rel_change" and "linear_reg", we use a sliding window
of bandwidth h centered at the reference point. In other words, the sliding
window consists of all points in x whose distance to the reference point is at
most h. Note that the unit for this distance is implicitly defined by the x
variable; for example, if x is a vector of Date objects, h = 7, and the
reference point is January 7, then the sliding window contains all data
between January 1 and 14 (matching the behavior of epi_slide() with
before = h - 1 and after = h).
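The Date example above can be reproduced with a short base-R sketch. The dates and the year are made up for illustration, and the window bounds follow the before = h - 1 and after = h description; this is not the package's own windowing code.

```r
# Made-up daily design points around the reference date.
x <- seq(as.Date("2021-12-25"), as.Date("2022-01-20"), by = "day")
x_ref <- as.Date("2022-01-07")
h <- 7

# Window with h - 1 days before and h days after the reference point.
in_window <- x >= x_ref - (h - 1) & x <= x_ref + h
window_range <- range(x[in_window])
n_in_window <- sum(in_window)
```

Because x holds Date objects, h is interpreted in days, so the window spans January 1 through January 14 and contains 14 daily observations.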
For the global methods, "smooth_spline" and "trend_filter", additional
arguments can be specified via ... for the underlying estimation function.
For the smoothing spline case, these additional arguments are passed directly
to stats::smooth.spline() (and the defaults are exactly as in this function).
The trend filtering case works a bit differently: here, a custom set of
arguments is allowed (which are distributed internally to
genlasso::trendfilter() and genlasso::cv.trendfilter()):
ord: order of piecewise polynomial for the trend filtering fit. Default is 3.
maxsteps: maximum number of steps to take in the solution path before
terminating. Default is 1000.
cv: should cross-validation be used to choose an effective degrees of freedom
for the fit? Default is TRUE.
k: number of folds if cross-validation is to be used. Default is 3.
df: desired effective degrees of freedom for the trend filtering fit. If
cv = FALSE, then df must be a positive integer; if cv = TRUE, then df must be
one of "min" or "1se" indicating the selection rule to use based on the
cross-validation error curve: minimum or 1-standard-error rule, respectively.
Default is "min" (going along with the default cv = TRUE). Note that if
cv = FALSE, then we require df to be set by the user.
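For the smoothing spline case, the pass-through of additional arguments can be pictured with the base-R sketch below. The helper spline_growth() is hypothetical and only mimics the description above (derivative of the spline divided by its fitted value); the df value is an arbitrary choice for illustration.

```r
# Hypothetical helper for illustration only; not the package's internals.
spline_growth <- function(x, y, x0, ...) {
  # Extra arguments in ... go straight to stats::smooth.spline().
  fit <- stats::smooth.spline(x, y, ...)
  fitted_vals <- stats::predict(fit, x0)$y
  deriv_vals <- stats::predict(fit, x0, deriv = 1)$y
  deriv_vals / fitted_vals
}

# Made-up exponential signal with true growth rate 0.05.
x <- 1:50
y <- exp(0.05 * x)

gr_default <- spline_growth(x, y, x0 = x)      # smooth.spline() defaults
gr_df8 <- spline_growth(x, y, x0 = x, df = 8)  # smoother fit via df
```

Both calls return one estimate per point in x0; the values near the middle of the sequence should be close to 0.05, with some distortion possible near the boundaries.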
Vector of growth rate estimates at the specified points x0.
library(dplyr)
library(epiprocess)

# COVID cases growth rate by state, using the default method (relative change)
cases_deaths_subset %>%
group_by(geo_value) %>%
mutate(cases_gr = growth_rate(x = time_value, y = cases))
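# Log scale, trend filtering with a degree 4 polynomial and 6-fold cross-validation
# (ord and k are trend filtering arguments, so method = "trend_filter" is set)
cases_deaths_subset %>%
group_by(geo_value) %>%
mutate(gr_poly = growth_rate(x = time_value, y = cases, method = "trend_filter", log_scale = TRUE, ord = 4, k = 6))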