einterpolate: Interpolation

einterpolate R Documentation

Interpolation

Description

This function produces linearly interpolated and spline-interpolated values for variables in a data frame.

Usage

einterpolate(
  data,
  vars.to.interpolate,
  group = NULL,
  time.var = NULL,
  time.values = NULL,
  extrapolate.spline = FALSE
)

Arguments

data

a data frame to interpolate. It must contain at least the variables to interpolate and the time.var, which marks the intervals to fill with interpolated values.

vars.to.interpolate

a string vector with the names of the variables to interpolate

group

a string vector with the names of the grouping variables. The interpolation is conducted within each group. Note: if time.values is used to extend the data set, only the variables specified in group, vars.to.interpolate, and time.var are extended to cover the values in time.values; all other variables receive NA in the added rows. If two categorical variables describe the same grouping in different ways, pass both of them in group to avoid NA values in the extended data set that is returned (see examples).

time.var

a string with the name of the variable indicating the interval in which the measurements of vars.to.interpolate were collected or are missing. Usually it represents time (years, months) for which the variables in vars.to.interpolate have some NA values that should be replaced with interpolated values.

time.values

either NULL (default) or a vector with values in the same unit as time.var. These values are used to expand the data set and interpolate over the expanded range. If NULL, only the values already present in time.var are used to interpolate.

extrapolate.spline

boolean. If TRUE, the spline function also returns extrapolated values in addition to the interpolated ones.
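
The interaction between group, time.var, and time.values described above can be pictured with base R alone. The sketch below is not the package's implementation; it only mimics the documented behaviour by crossing the observed group combinations with the requested time values and merging the observed rows onto that grid. The data frame `obs` and the helper `expand_by_time` are hypothetical names introduced for illustration.

```r
# Hypothetical toy data: a grouping variable (cat), a second label for the
# same groups (cat2), a time variable (yr) and one measurement (value1)
obs <- data.frame(
  cat    = c("a", "a", "b"),
  cat2   = c("a1", "a1", "b1"),
  yr     = c(1980, 1982, 1981),
  value1 = c(1, 3, 10),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per (observed group combination, time value)
expand_by_time <- function(data, group, time.var, time.values) {
  groups <- unique(data[group])                          # observed group combinations
  times  <- setNames(data.frame(time.values), time.var)  # requested time values
  grid   <- merge(groups, times, by = NULL)              # Cartesian product
  # all.x = TRUE keeps grid rows without an observation; their remaining
  # columns are filled with NA
  out <- merge(grid, data, by = c(group, time.var), all.x = TRUE)
  out <- out[do.call(order, unname(as.list(out[c(group, time.var)]))), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Grouping by cat only: cat2 is NA in the rows that were added
expanded <- expand_by_time(obs, group = "cat", time.var = "yr",
                           time.values = 1980:1982)

# Grouping by cat and cat2: cat2 is carried into the added rows
expanded2 <- expand_by_time(obs, group = c("cat", "cat2"), time.var = "yr",
                            time.values = 1980:1982)
```

In this sketch, passing both cat and cat2 as grouping variables is what keeps cat2 filled in the extended rows, which matches the note in the description of group.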

Details

The function stores linearly interpolated values in a variable suffixed with .ili (interpolated, linear) and values interpolated with the default method of the spline function in a variable suffixed with .isp (interpolated, spline).
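
As a rough illustration of what the two output columns contain, the sketch below uses base R's stats::approx() for the linear interpolation and stats::spline() with its default method ("fmm") for the spline. That approx() is what the package calls internally is an assumption; the documentation only states that the spline values come from the default method of the spline function. The vectors `yr` and `value` and the data frame `result` are hypothetical.

```r
# Hypothetical series for a single group: observed in 1980, 1984, 1988, 1990
yr    <- 1978:1992
value <- rep(NA_real_, length(yr))
value[yr %in% c(1980, 1984, 1988, 1990)] <- c(2, 10, 5, 7)

obs <- !is.na(value)

# Linear interpolation: with the default rule = 1, approx() returns NA
# outside the range of the observed time points
ili <- approx(x = yr[obs], y = value[obs], xout = yr)$y

# Spline interpolation with the default method ("fmm"); spline() also
# returns values beyond the observed range when xout extends past it
isp_extrap <- spline(x = yr[obs], y = value[obs], xout = yr)$y

# To mimic the case without extrapolation (an assumption about how
# extrapolate.spline = FALSE behaves), blank out values outside the range
inside <- yr >= min(yr[obs]) & yr <= max(yr[obs])
isp <- ifelse(inside, isp_extrap, NA_real_)

# Columns named after the documented suffixes
result <- data.frame(yr, value, value.ili = ili, value.isp = isp)
```

Both interpolated columns reproduce the observed points exactly; they differ only in how the gaps between observations, and optionally the values outside the observed range, are filled.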

Examples


library(magrittr)
library(ggplot2)
## tibble::data_frame() is deprecated; tibble::tibble() is its replacement
dat = tibble::tibble(cat2 = c("a1", "a1", "b1", "a1", "b1", "a1", "b1"),
                     cat = c("a", "a", "b", "a", "b", "a", "b"),
                     yr = c(1980, 1990, 1987, 1993, 1990, 1999, 1999),
                     value1 = c(NA, 1, 10, NA, NA, 50, NA),
                     value2 = c(2, NA, 1, 10, 5, NA, 100))

einterpolate(dat, vars.to.interpolate=c('value1'), group='cat', time.var='yr')
## using pipe and interpolating multiple variables at once
dat %>% einterpolate(., vars.to.interpolate=c('value1', "value2"), group='cat',
                     time.var='yr')

## extending the data set
dat  %>%  einterpolate(., vars.to.interpolate=c('value1', "value2"),
                       group=c('cat'), time.var='yr', time.values = 1980:1999)

## cat and cat2 are two ways to describe the same group, but in the previous code
## cat2 returns NA for the extended values. To avoid this:
dat  %>%  einterpolate(., vars.to.interpolate=c('value1', "value2"),
                       group=c('cat', 'cat2'), time.var='yr', time.values = 1980:1999)

## to extrapolate (only using spline)
dat  %>%  einterpolate(., vars.to.interpolate=c('value1', "value2"),
                       group=c('cat', 'cat2'), time.var='yr', time.values = 1980:1999,
                       extrapolate.spline=TRUE)

## see it visually:

v = dat %>%
    einterpolate(., vars.to.interpolate=c('value1', "value2"),
                 group=c('cat', 'cat2'), time.var='yr',
                 time.values = 1980:1999, extrapolate.spline=TRUE)

## plot interpolated variable value2
v %>%
    dplyr::select(dplyr::contains("value2"), cat, yr)  %>%
    tidyr::gather(key = Method, value=value, -cat,  -yr) %>% 
    dplyr::mutate(labels =
            dplyr::case_when(
                   Method == "value2" ~ "Observed",
                   Method == "value2.ili" ~ "Interpolated (linear)",
                   Method == 'value2.isp' ~ 'Interpolated + Extrapolation (spline)') ) %>% 
    ggplot2::ggplot(.) +
    ggplot2::geom_point(aes(x=yr, y=value , colour=labels, size=labels), alpha=.3) +
    ggplot2::geom_line(aes(x=yr, y=value, group=labels, colour=labels)) +
    ggplot2::facet_wrap( ~ cat, scales='free', labeller=label_parsed)  +
    ggplot2::scale_size_discrete(range=c(2,10), name='')+
    ggplot2::scale_colour_manual(
                values = c("Observed"= "red",
                "Interpolated (linear)" = "black",
                "Interpolated + Extrapolation (spline)"="lightblue"), name='') +
    ggplot2::theme_bw()+
    ggplot2::theme(legend.position = "bottom") +
    ggplot2::ggtitle("Variable: Value 2")

DiogoFerrari/edar documentation built on May 8, 2022, 8:26 a.m.