OLS.reg: General purpose modelling using OLS (Ordinary Least Squares) regression

View source: R/OLS.reg.R

OLS.reg    R Documentation

General purpose modelling using OLS (Ordinary Least Squares) regression

Description

Fits a function f(x_1, x_2, \dots, x_{n-1}, \vec{v}) to a multivariate dataset \left(x_1, x_2, \dots, x_{n-1}, y\right).
The vector \vec{v} contains the parameters that are to be extracted by the fitting procedure.
Uses the OLS regression methodology.

Usage

OLS.reg(model, dat, guess, algorithm = 'Nelder-Mead', Delta = 10^-6)

Arguments

model

Has to be a function, f(x_1, x_2, \dots, x_{n-1}, \vec{v}), with a scalar numeric return type.
Needs to have a vector, \vec{v}, as its LAST argument. The elements of this vector are the individual scalar parameters that need to be extracted.
The first n - 1 arguments should be the independent variables x_k that describe the model.
See the formulation below and the sketch that follows:

f(x_1, x_2, \dots, x_{n-1}, \vec{v} )

\vec{v} = \left( v[1], v[2], \dots, v[m]\right)
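
For instance, a valid model with two independent variables and a three-parameter vector \vec{v} could look like the following sketch (expo.fun is a hypothetical name, not part of the package):

# Hypothetical model: independent variables first, parameter vector v LAST
expo.fun <- function(x1, x2, v){
  v[1]*exp(-v[2]*x1) + v[3]*x2   # scalar numeric return, vectorized over the data
}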

dat

Has to be a data frame with n columns.
The first n - 1 columns should contain the data for the x_k variables IN THE ORDER in which the arguments are declared in the model.
The last column should contain the observed y values of the data.
Please see the examples and the sketch below.
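
As a minimal illustration for a model f(x_1, x_2, \vec{v}) (the column names x1, x2, y are illustrative; only the column order matters):

x1 <- runif(100)
x2 <- runif(100)
y <- 3*x1 - 2*x2 + rnorm(100, sd = 0.1)   # observed response
dat <- data.frame(x1, x2, y)              # independent variables first, y LAST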

guess

Has to be a numeric vector.
The initial guess for the value of \vec{v}.
Its length must match the length of the vector \vec{v} accepted by the model function.

algorithm

Denotes the optimization algorithm.
A character scalar which must take one of the values in c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN", "Brent").
Defaults to "Nelder-Mead".
Uses the optim function for the optimization process.
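
These values correspond to the method argument of optim. As a standalone illustration of two of the allowed methods (not the package's internal code):

# Minimize a simple quadratic with two of the allowed methods
optim(par = c(0, 0), fn = function(v) sum((v - c(1, 2))^2), method = 'BFGS')$par
optim(par = c(0, 0), fn = function(v) sum((v - c(1, 2))^2), method = 'Nelder-Mead')$par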

Delta

Has to be a numeric scalar, \delta.
The perturbation used to calculate the gradient \nabla f_{\vec{v}} of the function f(x_1, x_2, \dots, x_{n-1}, \vec{v}).
Passed as epsilon (\epsilon) to grad.func while calculating \nabla f_{\vec{v}}.
Defaults to \delta = 10^{-6}, but should be chosen with caution; see grad.func.
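
The role of \delta can be sketched with a central-difference approximation (an illustration only, with hypothetical names; see grad.func for the scheme the package actually uses):

# Central-difference gradient of a scalar function g at the point v,
# perturbing one coordinate at a time by delta
num.grad <- function(g, v, delta = 10^-6){
  sapply(seq_along(v), function(k){
    e <- replace(numeric(length(v)), k, delta)
    (g(v + e) - g(v - e)) / (2*delta)
  })
}
num.grad(function(v) sum(v^2), c(1, 2))   # approximately c(2, 4)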

Details

Given f(x_1, x_2, \dots, x_{n-1}, \vec{v}), make sure that the dat data frame has the columns x_1, x_2, \dots, x_{n-1}, y in the SAME ORDER.
Uses the squared error loss and the mean squared error cost function:

l_i(\vec{v}) = \{ y_i - f(x_{i,1}, x_{i,2}, \dots x_{i, n-1}, \vec{v}) \}^2

C(\vec{v}) = \frac{1}{N}\sum_{i} l_i(\vec{v})
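
A minimal sketch of this cost in plain R (all names are illustrative, not the package's internals):

# Hypothetical model and data, to make the cost concrete
f <- function(x1, x2, v) v[1]*x1 + v[2]*x2
dat <- data.frame(x1 = runif(50), x2 = runif(50))
dat$y <- f(dat$x1, dat$x2, c(2, -1)) + rnorm(50, sd = 0.1)

# C(v): the mean of the squared losses l_i(v)
C <- function(v) mean((dat$y - f(dat$x1, dat$x2, v))^2)

# optim minimizes C over v, mirroring the documented use of optim
optim(par = c(0, 0), fn = C, method = 'Nelder-Mead')$par   # approximately c(2, -1)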

Value

The returned value is a 6-element list:

  • par: The optimized/extracted parameters for the vector \vec{v}

  • value: The value of the cost function C(\vec{v}) at the optimized value of \vec{v}.

  • counts: A 2-element vector. The first element gives the number of times the function was evaluated; the second, the number of times the gradient was evaluated. See optim.

  • convergence: An integer code. 0 indicates successful convergence; 1 indicates that the maximum number of iterations was reached before convergence; 10 indicates degeneracy of the Nelder-Mead simplex. See optim.

  • message: A character string that helps diagnose convergence issues. See optim.

  • err: The residuals, e_i, after the fitting process, calculated at the optimized value of \vec{v} (see the residual check at the end of the Examples):

    e_i = y_i - f(x_{i,1}, x_{i,2}, \dots, x_{i,n-1}, \vec{v} )

Author(s)

Chitran Ghosal

Examples

library(StatsChitran)

# Build the data frame for dat
X1 <- sort(rnorm(500))
X2 <- sort(rnorm(500))
Y <- 110*X1 + 120*X2
Y <- Y + rnorm(n = length(X1), mean = 0, sd = 10)
plot(X1, Y)
plot(X2, Y)
df <- data.frame(X1, X2, Y)


# Build the function for the model
lin.fun <- function(X1, X2, v){
  Y <- v[1]*X1 + v[2]*X2
  return(Y)
}


# Fit the model; the guess has the same length as v in lin.fun
lst <- OLS.reg(model = lin.fun, dat = df, guess = c(50, 50), algorithm = 'BFGS')


# Visualize the fit against each independent variable
plot(X1, Y)
lines(X1, lin.fun(X1, X2, v=lst$par), col='red', lwd=3)
plot(X2, Y)
lines(X2, lin.fun(X1, X2, v=lst$par), col='red', lwd=3)
lst$par
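
# Follow-up checks (illustrative): inspect convergence and residuals
lst$convergence   # 0 indicates successful convergence
lst$value         # cost C(v) at the optimum, i.e. mean(lst$err^2)

# Residuals should scatter around zero with no visible structure
plot(lst$err, ylab = 'residuals')
abline(h = 0, lty = 2)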
