How to start"
In flexFitR: Flexible Non-Linear Least Square Model Fitting

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Getting started

The basic idea of this vignette is to illustrate to users how to use the flexFitR package. We'll start with a very basic example: a simple linear regression. Although this example is not the primary focus of the package, it will serve to demonstrate its use.

1. Simple linear regression

In this example, we'll work with a small dataset consisting of 6 observations, where X is the independent variable and Y is the dependent variable.

library(flexFitR)
library(dplyr)
library(ggpubr)

dt <- data.frame(X = 1:6, Y = c(12, 16, 44, 50, 95, 100))
plot(explorer(dt, X, Y), type = "xy")

First, we define an objective function. In this case, the function fn_lm will represent the linear regression, where b is the intercept and m is the slope of the regression.

fn_lm <- function(x, b, m) {
  y <- b + m * x
  return(y)
}

The plot_fn function, which is integrated into the package, allows us to plot any function with the parameters provided. This is useful for visualizing the shape of the function before fitting the model to the data.

plot_fn(fn = "fn_lm", params = c(b = 10, m = 5))

To fit the model, we use the modeler function. In this function, we pass x as the independent variable, y as the dependent variable, and then a vector of parameters where we assign initial values to our coefficient b and coefficient m.

mod <- dt |>
  modeler(
    x = X,
    y = Y,
    fn = "fn_lm",
    parameters = c(b = -5, m = 10)
  )
mod

Once the model is fitted, we can examine the output, extract the estimated parameters, make some plots, and predict new x values.

a <- plot(mod, color = "blue", title = "Raw data")
b <- plot(mod, type = 4, n_points = 200, color = "black")
ggarrange(a, b)

In order to get the coefficients with their variance-covariance matrix we make use of the coef and vcov function, which only takes the model object as an argument.

coef(mod)

vcov(mod)

Finally, we can make predictions using the predict function, which takes the fitted model as an object and X as the value for which we want to make the prediction.

predict(mod, x = 4.5)

We can compare this with the lm function in R, which will give results similar to those obtained with our package.

Comparison with `lm`

mo <- lm(Y ~ X, data = dt)

summary(mo)$coefficients

vcov(mo)
predict(mo, newdata = data.frame(X = 4.5), se.fit = TRUE)

While the previous example was fairly simple, we can consider a more complex scenario where we need to fit not just one function, but hundreds of functions for several groups. This can be achieved using the grp argument in the modeler function. Additionally, we can parallelize these processes by setting the parallel argument to TRUE and defining the number of cores to use.

It’s important to note that depending on the functions defined by the user, some parameters may need to be constrained, such as being required to be greater than or less than zero. In other cases, certain parameters might need to be fixed at known values. In these more complex situations, where we have many curves to fit and are working with complex functions—whether non linear regressions with specific parameter constraints or cases where some parameters are fixed for each group—modeler offers extensive flexibility.

2. Piece-wise regression

The following example, although still simple, represents a slightly more complex function with a greater number of parameters. In this case, we have a piece-wise regression, parameterized by t1, t2, and k, and defined by the following expression:

fun <- function(t, t1 = 45, t2 = 80, k = 0.9) {
  ifelse(
    test = t < t1,
    yes = 0,
    no = ifelse(
      test = t >= t1 & t <= t2,
      yes = k / (t2 - t1) * (t - t1),
      no = k
    )
  )
}

Before fitting the model, let's take a look at the example dataset.

dt <- data.frame(
  time = c(0, 29, 36, 42, 56, 76, 92, 100, 108),
  variable = c(0, 0, 0.67, 15.11, 77.38, 99.81, 99.81, 99.81, 99.81)
)
plot(explorer(dt, time, variable), type = "xy")

We can make a plot of the piecewise function and then fit the model using the modeler function.

plot_fn(fn = "fun", params = c(t1 = 25, t2 = 70, k = 90))

mod_1 <- dt |>
  modeler(
    x = time,
    y = variable,
    fn = "fun",
    parameters = c(t1 = 40, t2 = 70, k = 100)
  )
mod_1

After fitting the model, we can examine the results, plot the fitted curve, extract the coefficients and their associated p-values, obtain the variance-covariance matrix, and make predictions for unknown values of x.

plot(mod_1)

# Coefficients
coef(mod_1)

# Variance-Covariance Matrix
vcov(mod_1)

# Making predictions
predict(mod_1, x = 45)

Comparison with `nls`

mod_nls <- dt |>
  nls(
    formula = variable ~ fun(time, t1, t2, k),
    start = c(t1 = 40, t2 = 70, k = 100),
    algorithm = "default"
  )
summary(mod_nls)
coef(mod_nls)
vcov(mod_nls)
predict(mod_nls, newdata = data.frame(time = 45))

As we can see, we get very similar results when compared to nls.

Finally, we will illustrate how to provide different initial values to the function when dealing with multiple groups, and we will also show how to fix some parameters of the objective function.

Providing Initial values

In this example, we don't have a grouping variable. However, by default, the function assigns a unique identifier (uid) to the dataset. Because of this, we need to specify uid = 1 for the initial values and fixed parameters. If there is only one group, you only need to modify the parameters argument accordingly. This approach is primarily for illustrative purposes.

init <- data.frame(uid = 1, t1 = 20, t2 = 30, k = 0.8)

mod_2 <- dt |>
  modeler(
    x = time,
    y = variable,
    fn = "fun",
    parameters = init
  )
mod_2
coef(mod_2)

Fixing parameters

fix <- data.frame(uid = 1, k = 98)

mod_3 <- dt |>
  modeler(
    x = time,
    y = variable,
    fn = "fun",
    parameters = c(t1 = 40, t2 = 70, k = 100),
    fixed_params = fix
  )
mod_3
coef(mod_3)
plot(mod_3)

performance(mod_1, mod_2, mod_3)

This vignette provided a basic introduction to using the flexFitR package, starting with simple examples such as linear regression and piecewise regression. The goal was to demonstrate the fundamental features and flexibility of the package. However, more complex situations can arise when working with high-throughput phenotypic (HTP) data, which involve multiple groups, parameter constraints, and advanced modeling scenarios. These more complex situations are illustrated in the other vignettes, which use real HTP data to showcase the full capabilities of the flexFitR package.