OLS.reg: General purpose modelling using OLS (Ordinary Least Squares) regression

View source: R/OLS.reg.R

OLS.reg    R Documentation

General purpose modelling using OLS (Ordinary Least Squares) regression

Description

Fits a function f(x_1, x_2, \dots, x_{n-1}, \vec{v}) to a multivariate dataset \left(x_1, x_2, \dots, x_{n-1}, y\right).
The vector \vec{v} contains the parameters that are to be extracted by the fitting procedure.
Uses the OLS regression methodology.

Usage

OLS.reg(model, dat, guess, algorithm = 'Nelder-Mead', Delta = 10^-6)

Arguments

model

Has to be a function, f(x_1, x_2, \dots, x_{n-1}, \vec{v}), with a scalar numeric return type.
Needs to have a vector, \vec{v}, as its LAST argument. The elements of this vector are the individual scalar parameters that need to be extracted.
The first n - 1 arguments should be the independent variables x_k that describe the model.
See the formulation below and the sketch that follows:

f(x_1, x_2, \dots, x_{n-1}, \vec{v} )

\vec{v} = \left( v[1], v[2], \dots, v[m]\right)
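
For instance, a valid model with two independent variables and a three-parameter vector \vec{v} could look like the following sketch (expo.fun is a hypothetical name, not part of the package):

# Hypothetical model: independent variables first, parameter vector v LAST
expo.fun <- function(x1, x2, v){
  v[1]*exp(-v[2]*x1) + v[3]*x2   # scalar numeric return, vectorized over the data
}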

dat

Has to be a data frame with n columns.
The first n - 1 columns should contain the data for the x_k variables IN THE ORDER in which the arguments are declared in the model.
The last column should contain the observed y values of the data.
Please see the examples and the sketch below.
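
As a minimal illustration for a model f(x_1, x_2, \vec{v}) (the column names x1, x2, y are illustrative; only the column order matters):

x1 <- runif(100)
x2 <- runif(100)
y <- 3*x1 - 2*x2 + rnorm(100, sd = 0.1)   # observed response
dat <- data.frame(x1, x2, y)              # independent variables first, y LAST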

guess

Has to be a numeric vector.
The initial guess for the value of \vec{v}.
Its length must match the length of the vector \vec{v} accepted by the model function.

algorithm

Denotes the optimization algorithm.
A character scalar which must take one of the values in c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN", "Brent").
Defaults to "Nelder-Mead".
Uses the optim function for the optimization process.
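
These values correspond to the method argument of optim. As a standalone illustration of two of the allowed methods (not the package's internal code):

# Minimize a simple quadratic with two of the allowed methods
optim(par = c(0, 0), fn = function(v) sum((v - c(1, 2))^2), method = 'BFGS')$par
optim(par = c(0, 0), fn = function(v) sum((v - c(1, 2))^2), method = 'Nelder-Mead')$par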

Delta

Has to be a numeric scalar, \delta.
The perturbation used to calculate the gradient \nabla f_{\vec{v}} of the function f(x_1, x_2, \dots, x_{n-1}, \vec{v}).
Passed as epsilon (\epsilon) to grad.func while calculating \nabla f_{\vec{v}}.
Defaults to \delta = 10^{-6}, but should be chosen with caution; see grad.func.
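
The role of \delta can be sketched with a central-difference approximation (an illustration only, with hypothetical names; see grad.func for the scheme the package actually uses):

# Central-difference gradient of a scalar function g at the point v,
# perturbing one coordinate at a time by delta
num.grad <- function(g, v, delta = 10^-6){
  sapply(seq_along(v), function(k){
    e <- replace(numeric(length(v)), k, delta)
    (g(v + e) - g(v - e)) / (2*delta)
  })
}
num.grad(function(v) sum(v^2), c(1, 2))   # approximately c(2, 4)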

Details

Given f(x_1, x_2, \dots, x_{n-1}, \vec{v}), make sure that the dat data frame has the columns x_1, x_2, \dots, x_{n-1}, y in the SAME ORDER.
Uses the squared error loss and the mean squared error cost function:

l_i(\vec{v}) = \{ y_i - f(x_{i,1}, x_{i,2}, \dots x_{i, n-1}, \vec{v}) \}^2

C(\vec{v}) = \frac{1}{N}\sum_{i} l_i(\vec{v})
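
A minimal sketch of this cost in plain R (all names are illustrative, not the package's internals):

# Hypothetical model and data, to make the cost concrete
f <- function(x1, x2, v) v[1]*x1 + v[2]*x2
dat <- data.frame(x1 = runif(50), x2 = runif(50))
dat$y <- f(dat$x1, dat$x2, c(2, -1)) + rnorm(50, sd = 0.1)

# C(v): the mean of the squared losses l_i(v)
C <- function(v) mean((dat$y - f(dat$x1, dat$x2, v))^2)

# optim minimizes C over v, mirroring the documented use of optim
optim(par = c(0, 0), fn = C, method = 'Nelder-Mead')$par   # approximately c(2, -1)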

Value

The returned value is a 6-element list:

  • par: The optimized/extracted parameters for the vector \vec{v}

  • value: The value of the cost function C(\vec{v}) at the optimized value of \vec{v}.

  • counts: A 2-element vector. The first element gives the number of times the function was evaluated; the second, the number of times the gradient was evaluated. See optim.

  • convergence: An integer code. 0 indicates successful convergence; 1 indicates that the maximum number of iterations was reached before convergence; 10 indicates degeneracy of the Nelder-Mead simplex. See optim.

  • message: A character string that helps diagnose convergence issues. See optim.

  • err: The residuals, e_i, after the fitting process, calculated at the optimized value of \vec{v} (see the residual check at the end of the Examples):

    e_i = y_i - f(x_{i,1}, x_{i,2}, \dots, x_{i,n-1}, \vec{v} )

Author(s)

Chitran Ghosal

Examples

library(StatsChitran)

# Build the data frame for dat
X1 <- sort(rnorm(500))
X2 <- sort(rnorm(500))
Y <- 110*X1 + 120*X2
Y <- Y + rnorm(n = length(X1), mean = 0, sd = 10)
plot(X1, Y)
plot(X2, Y)
df <- data.frame(X1, X2, Y)


# Build the function for the model
lin.fun <- function(X1, X2, v){
  Y <- v[1]*X1 + v[2]*X2
  return(Y)
}


# Fit the model; the guess has the same length as v in lin.fun
lst <- OLS.reg(model = lin.fun, dat = df, guess = c(50, 50), algorithm = 'BFGS')


# Visualize the fit against each independent variable
plot(X1, Y)
lines(X1, lin.fun(X1, X2, v=lst$par), col='red', lwd=3)
plot(X2, Y)
lines(X2, lin.fun(X1, X2, v=lst$par), col='red', lwd=3)
lst$par
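
# Follow-up checks (illustrative): inspect convergence and residuals
lst$convergence   # 0 indicates successful convergence
lst$value         # cost C(v) at the optimum, i.e. mean(lst$err^2)

# Residuals should scatter around zero with no visible structure
plot(lst$err, ylab = 'residuals')
abline(h = 0, lty = 2)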
