The SFSI R-package solves penalized regression problems, offering tools to compute solutions for penalized selection indices. This repository maintains the latest (development) version.
Last update: Jun 24, 2024
Installation of the SFSI package requires an R version ≥ 3.6.0
From CRAN (stable version)
install.packages('SFSI',repos='https://cran.r-project.org/')
From GitHub (developing version)
install.packages('remotes',repos='https://cran.r-project.org/') # 1. install remotes
library(remotes) # 2. load the library
install_github('MarcooLopez/SFSI') # 3. install SFSI from GitHub
A selection index (SI) predicts the genetic value ($u_i$) of a selection candidate for a target trait ($y_i$) as the weighted sum of $p$ measured traits $x_{i1},\dots,x_{ip}$:
$$ \color{NavyBlue}{\hat{u}_i = \boldsymbol{x}_i'\boldsymbol{\beta}_i} $$
where $\boldsymbol{x}_i = (x_{i1},\dots,x_{ip})'$ is the vector of measured traits and $\boldsymbol{\beta}_i = (\beta_{i1},\dots,\beta_{ip})'$ is the vector of weights.
The weights are derived by minimizing the expected squared difference between the genetic value and the index:
$$ \color{NavyBlue}{\hat{\boldsymbol{\beta}}_i = \text{arg min}\left\{\frac{1}{2}\mathbb{E}\left[(u_i - \boldsymbol{x}_i'\boldsymbol{\beta}_i)^2\right]\right\}} $$
This problem is equivalent to:
$$ \color{NavyBlue}{\hat{\boldsymbol{\beta}}_i = \text{arg min}\left[\frac{1}{2}\boldsymbol{\beta}_i'\textbf{P}_x\boldsymbol{\beta}_i - \textbf{G}_{xy}'\boldsymbol{\beta}_i\right]} $$
where $\textbf{P}_x$ is the phenotypic variance-covariance matrix of the predictors and $\textbf{G}_{xy}$ is the vector of genetic covariances between predictors and response. Under standard assumptions, the solution to the above problem is
$$ \color{NavyBlue}{\hat{\boldsymbol{\beta}}_i = \textbf{P}_x^{-1}\textbf{G}_{xy}} $$
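The closed-form solution amounts to solving the linear system $\textbf{P}_x\boldsymbol{\beta}_i = \textbf{G}_{xy}$. A minimal base-R sketch with made-up toy values (hypothetical, for illustration only):

```r
# Toy 3-trait example (hypothetical values)
Px <- matrix(c(1.0, 0.3, 0.2,
               0.3, 1.0, 0.4,
               0.2, 0.4, 1.0), nrow = 3)   # phenotypic (co)variance matrix of predictors
Gxy <- c(0.5, 0.4, 0.3)                    # genetic covariances with the response

# Standard SI weights: beta = P^{-1} G, computed by solving the system
beta <- solve(Px, Gxy)

# Index value for a candidate with measured traits x
x <- c(1.2, 0.8, 1.5)
u_hat <- sum(x * beta)
```

Solving the system with `solve(Px, Gxy)` is numerically preferable to explicitly inverting $\textbf{P}_x$.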
In the sparse selection index (SSI), the weights are derived by imposing a sparsity-inducing penalty on the above optimization function:
$$ \color{NavyBlue}{\hat{\boldsymbol{\beta}}_i = \text{arg min}\left[\frac{1}{2}\boldsymbol{\beta}_i'\textbf{P}_x\boldsymbol{\beta}_i - \textbf{G}_{xy}'\boldsymbol{\beta}_i + \lambda f(\boldsymbol{\beta}_i)\right]} $$
where $\lambda$ is a penalty parameter and $f(\boldsymbol{\beta}_i)$ is a penalty function on the weights. A value of $\lambda = 0$ yields the coefficients for the standard selection index. Commonly used penalty functions are based on the L1- (i.e., LASSO) and L2-norms (i.e., Ridge Regression). Elastic-Net considers a combined penalization of both norms,
$$ \color{NavyBlue}{f(\boldsymbol{\beta}_i) = \alpha\sum^p_{j=1}|\beta_{ij}| + (1-\alpha)\frac{1}{2}\sum^p_{j=1}\beta^2_{ij}} $$
where $\alpha$ is a number between 0 and 1. The LASSO and Ridge Regression appear as special cases of the Elastic-Net when $\alpha = 1$ and $\alpha = 0$, respectively.
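The penalty function itself is straightforward to evaluate; the sketch below (plain base R, not part of the package API) shows how the two special cases fall out of the formula:

```r
# Elastic-Net penalty f(beta) as defined above (illustrative helper, not from SFSI)
f_penalty <- function(beta, alpha) {
  alpha * sum(abs(beta)) + (1 - alpha) * 0.5 * sum(beta^2)
}

beta <- c(0.5, -0.2, 0.0, 0.1)
f_penalty(beta, alpha = 1)  # LASSO (L1): sum of |beta_j| -> 0.8
f_penalty(beta, alpha = 0)  # Ridge (L2): half the sum of beta_j^2 -> 0.15
```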
The functions LARS() and solveEN() can be used to obtain solutions for $\hat{\boldsymbol{\beta}}_i$ in the above penalized optimization problem, taking $\textbf{P}_x$ and $\textbf{G}_{xy}$ as inputs. The former provides LASSO solutions for the entire $\lambda$ path using Least Angle Regression (Efron et al., 2004), and the latter finds solutions to the Elastic-Net problem for given values of $\alpha$ and $\lambda$ via the Coordinate Descent algorithm (Friedman et al., 2007).
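A minimal usage sketch, assuming the SFSI package is installed; the `Px`/`Gxy` values are made up and the exact argument names should be checked against the package documentation:

```r
library(SFSI)

# Hypothetical inputs: phenotypic covariance matrix of predictors and
# vector of genetic covariances between predictors and response
Px  <- matrix(c(1.0, 0.3, 0.2,
                0.3, 1.0, 0.4,
                0.2, 0.4, 1.0), nrow = 3)
Gxy <- c(0.5, 0.4, 0.3)

# LASSO solutions for the entire lambda path via Least Angle Regression
fm1 <- LARS(Px, Gxy)

# Elastic-Net solutions for a given alpha via Coordinate Descent
fm2 <- solveEN(Px, Gxy, alpha = 0.5)
```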
Application with high-throughput phenotypes: Lopez-Cruz et al. (2020). [Manuscript]. [Documentation].
Application to Genomic Prediction: Lopez-Cruz and de los Campos (2021). [Manuscript]. [Documentation].
The SFSI R-package contains a reduced version of the full data used in Lopez-Cruz et al. (2020) for the development of penalized selection indices. The full data can be found in this repository.