fitting | R Documentation |
Useful for boiling down the multiple permutations after quake()
.
fit_lm( df, y = y, formula_rhs = x_new, by = NULL, x_pred = cutter_number(df, 30, x_new), k = k, .keep_mod = FALSE, ... ) fit_gam( df, y = y, formula_rhs = s(x_new, bs = "cs"), by = NULL, x_pred = cutter_number(df, 30, x_new), k = k, .keep_mod = FALSE, ... ) fit_loess( df, y = y, formula_rhs = x_new, by = NULL, x_pred = cutter_number(df, 30, x_new), k = k, .keep_mod = FALSE, ... )
df |
|
y |
colname of the value of interest. Default to |
formula_rhs |
the right hand-side of the |
by |
colname for grouping structure (besides |
x_pred |
the sequence of x for which to predict new values after fitting |
k |
colname for iteration . Default to |
.keep_mod |
logical whether to keep regression models and index of fit quality. See Details |
... |
additional parameters to be passed to main fitting functions. See Examples |
x_pred
can be passed directly as a vector of numeric. By default,
cutter_number(df, 30, x_new)
is used so that 30 groups of approximately
equal numbers of observations are made using the new x values, ie likely to cover
the range of possible x values.
formula_rhs
correspond to the right hand side of the formula
. It you want to
fit using, say y~x_new
, this corresponds only to x_new
. This saves typing y~
because it is already provided by the y
argument. You can pass any stats::formula()
you want.
For instance x_new^2 + x_new -1
would correspond to a quadratic model with no intercept.
See stats::formula()
to stick with this grammar.
a tibble()
fit_lm
: fitting using linear model
fit_gam
: fitting using generalised additive models with smoothness estimation
fit_loess
: fitting using local polynomial regression
These helpers wrap stats::loess()
, stats::lm()
, mgcv::gam()
and make
the assumption that you know what you are doing: ie the models are appropriate
and their conditions respected. They all :
fit a model on each permutation (and group-wise if by
is provided).
predict new y values using stats::predict on this model and for fixed x_pred
values.
The main motivation is to generalize on many permutations what you would do on original data:
find a model that fits them and try to represent trends using it. In practice, when you're
adding ggplot2::geom_smooth to a graph, that's exactly what you do, even if ggplot2
can largely
take care of the modelling aspect.
The second motivation is to boil down the randomized x values into fixed x values,
that you provide to the function with the argument x_pred
, and for which
predicted y
values will be calculated using stats::predict. The price to pay is
to fit models but the benefit is that you cant test differences between groups
more easily for each of these fixed x
values, that discretize your temporal range.
Besides testing, graphical comparisons are eased too, because desriptors such as extrema, other intermediate quantiles, average, etc. can be calculated for these fixed x-values. This is the purpose of the main spaghetti plot.
The last (and not least) motivation is to assess the quality of such fitting functions.
Using .keep_mod=TRUE
gives a glimpse to it by providing indices such as
the adjusted r^2 (for lm
and gam
), and the residual standard deviation (for loess
)
for each permutation. But it also retain each model so that you can access other components
of interest such as coefficients, p-values, etc.
Note that you do not need fitting to discretize values, you can also bin()
them
on requested x slices. That still allows testing minus the model intermediate.
Finally, if you only want a graphical approach, you can directly use spaghetti0
directly after quake()
# Show fitter on a single iteration df <- animals_q %>% dplyr::filter(k==1, taxa=="cat") plot(df$x_new, df$value, pch=20, cex=0.2) x_pred <- seq(-100, 100, 10) # use fitting functions lm_fit <- fit_lm(df, y=value, x_pred=x_pred, span=0.5) # you can specify arguments loess_fit <- fit_loess(df, y=value, x_pred=x_pred) gam_fit <- fit_gam(df, y=value, x_pred=x_pred) # now draw them lines(lm_fit$x_pred, lm_fit$y_pred, col="firebrick3") lines(loess_fit$x_pred, loess_fit$y_pred, col="orange") lines(gam_fit$x_pred, gam_fit$y_pred, col="blue") # usually, you would use them on a full object returned by quake # here we show the first 6 permutations along with their loess cat <- animals_q %>% dplyr::filter(k<=6, taxa=="cat") cat_lines <- cat %>% fit_loess(y=value, x_pred=x_pred) ggplot2::ggplot() + ggplot2::geom_point(mapping=ggplot2::aes(x=x_new, y=value), data=cat, size=0.2) + ggplot2::geom_line(mapping=ggplot2::aes(x=x_pred, y=y_pred, group=k), data=cat_lines, col="red") + ggplot2::facet_wrap(~k, ncol=3) + ggplot2::theme_minimal()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.