The linear phenotypic selection index (LPSI), the null restricted LPSI (RLPSI), and the predetermined proportional gains LPSI (PPG-LPSI) are the main phenotypic selection indices used to predict net genetic merit and to select parents for the next selection cycle. The LPSI is unrestricted, whereas the RLPSI and the PPG-LPSI impose restrictions on the expected genetic gains of the traits: the RLPSI fixes the gains of selected traits at zero, and the PPG-LPSI forces the gains to follow predetermined proportions.
One additional restricted index is the desired gains LPSI (DG-LPSI), which does not require economic weights and, like the PPG-LPSI, constrains the expected genetic gains of the traits to predetermined levels.
In this vignette, we demonstrate the theory and practical application of RLPSI, PPG-LPSI, and DG-LPSI using the selection.index package.
We use the synthetic maize_pheno dataset built into the package to calculate the phenotypic and genotypic variance-covariance matrices.
    library(selection.index)

    # Load the built-in maize phenotype dataset
    data("maize_pheno")

    # Extract the traits of interest
    traits <- c("Yield", "PlantHeight", "DaysToMaturity")

    # Calculate Genotypic (gmat) and Phenotypic (pmat) covariance matrices
    gmat <- gen_varcov(maize_pheno[, traits], maize_pheno$Genotype, maize_pheno$Block)
    pmat <- phen_varcov(maize_pheno[, traits], maize_pheno$Genotype, maize_pheno$Block)

    # Define economic weights for the three traits
    wmat <- weight_mat(data.frame(Trait = traits, Weight = c(1, -1, 1)))
The main objective of the RLPSI is to optimize, under some null restrictions, the selection response, to predict the net genetic merit $H = \mathbf{w}'\mathbf{g}$, and to select the individuals with the highest net genetic merit values as parents of the next generation. The RLPSI allows restrictions equal to zero to be imposed on the expected genetic gains of some traits.
Assuming we want to minimize the mean squared difference between the index $I$ and merit $H$ ($E[(H - I)^2]$) under the restriction $\mathbf{C}'\mathbf{b} = \mathbf{0}$, we minimize the following function:
$$ \Psi \left(\mathbf{b},\mathbf{v}\right)={\mathbf{b}}^{\prime}\mathbf{Pb}+{\mathbf{w}}^{\prime}\mathbf{Gw}-2{\mathbf{w}}^{\prime}\mathbf{Gb}+2{\mathbf{v}}^{\prime }{\mathbf{C}}^{\prime}\mathbf{b} $$
where $\mathbf{C}' = \mathbf{U}'\mathbf{G}$, $\mathbf{U}'$ is the matrix of null restrictions, and $\mathbf{v}$ is a vector of Lagrange multipliers. The solution yields the RLPSI vector of coefficients:
$$ {\mathbf{b}}_R=\mathbf{Kb} $$
where $\mathbf{K} = [\mathbf{I} - \mathbf{Q}]$, $\mathbf{Q} = \mathbf{P}^{-1}\mathbf{C}(\mathbf{C}'\mathbf{P}^{-1}\mathbf{C})^{-1}\mathbf{C}'$, and $\mathbf{b} = \mathbf{P}^{-1}\mathbf{Gw}$ is the LPSI vector of coefficients.
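The formula for $\mathbf{b}_R$ can be checked numerically in base R. The sketch below is not the package's implementation: the matrices G and P and the weights w are small hypothetical values chosen for illustration, and U is assumed to be a 0/1 matrix that selects trait 2 for restriction.

```r
# Hypothetical toy inputs (not taken from maize_pheno)
G <- matrix(c(4.0, 1.2, 0.8,
              1.2, 3.0, 0.5,
              0.8, 0.5, 2.0), nrow = 3, byrow = TRUE)  # genotypic covariance
P <- G + diag(c(3.0, 2.0, 2.5))                        # phenotypic = genotypic + residual
w <- c(1, -1, 1)                                       # economic weights

# U selects the restricted traits; here only trait 2
U <- matrix(c(0, 1, 0), ncol = 1)
C <- G %*% U                                           # so that C' = U'G

P_inv <- solve(P)
b     <- P_inv %*% G %*% w                             # unrestricted LPSI coefficients
Q     <- P_inv %*% C %*% solve(t(C) %*% P_inv %*% C) %*% t(C)
K     <- diag(3) - Q
b_R   <- K %*% b                                       # RLPSI coefficients

# Expected gains are proportional to G b_R; the entry for trait 2 should vanish
gain_direction <- G %*% b_R
print(round(gain_direction, 10))
```

Because $\mathbf{C}'\mathbf{K} = \mathbf{0}$ by construction, the restricted entry of G b_R is zero up to floating-point error, whatever values are used for w.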
Using the rlpsi function, we can constrain the genetic gain of certain traits to be strictly zero. For instance, we may want to maximize overall gain while ensuring that Plant Height (trait #2) does not change.
    # Restrict trait 2 (PHT) to have ZERO expected genetic gain
    rlpsi_res <- rlpsi(
      pmat = pmat, gmat = gmat, wmat = wmat,
      restricted_traits = c(2)
    )

    # View the summary and coefficients
    print(rlpsi_res$summary)

    # View the expected genetic gains (Delta_G)
    print(rlpsi_res$Delta_G)
Notice that the expected genetic gain (Delta_G) for the 2nd trait is effectively zero.
Unlike the RLPSI, the predetermined proportional gains phenotypic selection index (PPG-LPSI) allows restrictions different from zero to be imposed, ensuring that traits gain in strictly predefined proportions to each other.
Minimizing $E[(H - I)^2]$ under the proportional restriction $\mathbf{M}'\mathbf{b} = \mathbf{0}$, where $\mathbf{M}' = \mathbf{D}'\mathbf{C}'$ and $\mathbf{D}'$ is a Mallard (1972) matrix built from the predetermined proportional gains, amounts to minimizing the function:
$$ \Phi \left(\mathbf{b},\mathbf{v}\right)={\mathbf{b}}^{\prime}\mathbf{Pb}+{\mathbf{w}}^{\prime}\mathbf{Gw}-2{\mathbf{w}}^{\prime}\mathbf{Gb}+2{\mathbf{v}}^{\prime }{\mathbf{M}}^{\prime}\mathbf{b} $$
The vector that minimizes the mean squared error under this restriction is:
$$ {\mathbf{b}}_M={\mathbf{K}}_M\mathbf{b} $$
where $\mathbf{K}_M = [\mathbf{I} - \mathbf{Q}_M]$ and $\mathbf{Q}_M = \mathbf{P}^{-1}\mathbf{M}(\mathbf{M}'\mathbf{P}^{-1}\mathbf{M})^{-1}\mathbf{M}'$. Alternatively, Tallis (1985) formulated it using a proportionality constant $\theta$:
$$ {\mathbf{b}}_{T}={\mathbf{b}}_R+\theta \boldsymbol{\delta} $$
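A minimal base R sketch of the Mallard form $\mathbf{b}_M = \mathbf{K}_M\mathbf{b}$ is given below. It is an illustration under stated assumptions rather than the package's code: G, P, and w are hypothetical toy values, all three traits are restricted (so $\mathbf{C} = \mathbf{G}$), and D_t uses one common construction of $\mathbf{D}'$ whose rows satisfy $\mathbf{D}'\mathbf{d} = \mathbf{0}$.

```r
# Hypothetical toy inputs (not taken from maize_pheno)
G <- matrix(c(4.0, 1.2, 0.8,
              1.2, 3.0, 0.5,
              0.8, 0.5, 2.0), nrow = 3, byrow = TRUE)
P <- G + diag(c(3.0, 2.0, 2.5))
w <- c(1, -1, 1)
d <- c(2, 1, 1)                    # predetermined proportions
r <- length(d)

# Assumed construction of D' ((r - 1) x r): row i is d_r * e_i - d_i * e_r
D_t <- cbind(diag(d[r], r - 1), -d[1:(r - 1)])

C <- G %*% diag(3)                 # all traits restricted, so U = I and C = G
M <- C %*% t(D_t)                  # M' = D'C'

P_inv <- solve(P)
b     <- P_inv %*% G %*% w         # unrestricted LPSI coefficients
Q_M   <- P_inv %*% M %*% solve(t(M) %*% P_inv %*% M) %*% t(M)
b_M   <- (diag(3) - Q_M) %*% b     # PPG-LPSI coefficients (Mallard form)

# The gain direction G b_M should be a scalar multiple of d
gain_direction <- as.numeric(G %*% b_M)
print(gain_direction / d)          # three (near-)identical values
```

The constraint only fixes the ratios between the gains. The common scalar can be negative, in which case the gains follow the requested proportions in the opposite direction.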
The ppg_lpsi function enforces these desired proportional gains. Suppose we want the gains across the three traits to follow the proportion 2 : 1 : 1.
    # Specify the desired proportions
    k_proportions <- c(2, 1, 1)

    # Calculate the PPG-LPSI
    ppg_res <- ppg_lpsi(pmat = pmat, gmat = gmat, k = k_proportions, wmat = wmat)

    # View the expected genetic gains
    print(ppg_res$Delta_G)
If we observe the resulting Delta_G values, their relative proportions should approximate the 2 : 1 : 1 ratio dictated by k.
The desired gains linear phenotypic selection index (DG-LPSI) is unique in that it does not require economic weights $\mathbf{w}$. Instead, the breeder specifies the exact desired target genetic gains $\mathbf{d}$ directly.
Because the expected genetic gain is $\mathbf{E} = k_I \frac{\mathbf{Gb}}{\sigma_I}$, fixing $\mathbf{Gb} = \mathbf{d}$ means that $\mathbf{E}$ is maximized by minimizing the index standard deviation $\sigma_I$, or equivalently the index variance $\sigma_I^2 = \mathbf{b}'\mathbf{Pb}$, subject to $\mathbf{Gb} = \mathbf{d}$:
$$ {\Phi}_{DG}\left(\mathbf{b},\mathbf{v}\right)=0.5\left({\mathbf{b}}^{\prime}\mathbf{Pb}\right)+{\mathbf{v}}^{\prime}\left(\mathbf{Gb}-\mathbf{d}\right) $$
Solving this yields the DG-LPSI vector of coefficients:
$$ {\mathbf{b}}_{DG}={\mathbf{P}}^{-1}\mathbf{G}{\left({\mathbf{GP}}^{-1}\mathbf{G}\right)}^{-1}\mathbf{d} $$
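The closed form for $\mathbf{b}_{DG}$ can be verified directly in base R. The sketch below uses hypothetical toy matrices G and P, not the package's internals, and checks that the resulting coefficients satisfy the constraint $\mathbf{Gb} = \mathbf{d}$.

```r
# Hypothetical toy inputs (not taken from maize_pheno)
G <- matrix(c(4.0, 1.2, 0.8,
              1.2, 3.0, 0.5,
              0.8, 0.5, 2.0), nrow = 3, byrow = TRUE)
P <- G + diag(c(3.0, 2.0, 2.5))
d <- c(5, -2, 1)                   # desired gains

P_inv <- solve(P)

# b_DG = P^{-1} G (G P^{-1} G)^{-1} d
b_DG <- P_inv %*% G %*% solve(G %*% P_inv %*% G) %*% d

# The constraint G b = d should hold up to rounding error
print(cbind(target = d, achieved = as.numeric(G %*% b_DG)))
```

No economic weights appear anywhere in this computation, which is the defining feature of the index.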
We can run the dg_lpsi function solely based on desired absolute gains. Let's aim to increase Yield by 5 units, decrease Plant Height by 2 units, and increase Days to Maturity by 1 unit.
    # Explicit vector of desired absolute genetic gains
    desired_gains <- c(5, -2, 1)

    # Calculate DG-LPSI
    dg_res <- dg_lpsi(pmat = pmat, gmat = gmat, d = desired_gains)

    # Check the achieved proportional genetic gains
    print(dg_res$Delta_G)

    # The DG-LPSI also calculates implied Smith-Hazel economic weights
    print(dg_res$implied_weights_normalized)
The output gives us the selection index coefficients $\mathbf{b}$ that would achieve the target gains, as well as the implied economic weights showing the "absolute cost" or relative importance required to achieve those gains.
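One natural reading of "implied economic weights" is the vector $\mathbf{w}$ for which the unrestricted LPSI, $\mathbf{b} = \mathbf{P}^{-1}\mathbf{Gw}$, reproduces the DG-LPSI coefficients, which gives $\mathbf{w} = \mathbf{G}^{-1}\mathbf{P}\mathbf{b}_{DG}$. The base R sketch below works under that assumption with hypothetical toy matrices; the package may compute or scale its normalized weights differently.

```r
# Hypothetical toy inputs (not taken from maize_pheno)
G <- matrix(c(4.0, 1.2, 0.8,
              1.2, 3.0, 0.5,
              0.8, 0.5, 2.0), nrow = 3, byrow = TRUE)
P <- G + diag(c(3.0, 2.0, 2.5))
d <- c(5, -2, 1)

P_inv <- solve(P)
b_DG  <- P_inv %*% G %*% solve(G %*% P_inv %*% G) %*% d

# Assumed definition: invert b = P^{-1} G w for w
w_implied <- solve(G) %*% P %*% b_DG

# Feeding w_implied back into the LPSI formula recovers b_DG
b_check <- P_inv %*% G %*% w_implied
print(cbind(b_DG = as.numeric(b_DG), b_check = as.numeric(b_check)))
```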
Imposing restrictions on an index inherently limits its potential to maximize the overall selection response $R_R$. The selection.index package allows you to evaluate this "cost of restriction" by calculating both the relative selection efficiency (PRE - Percentage of Response Efficiency) and the heritability (hI2) of the constrained index versus the unconstrained equivalent.
When reviewing the summary data frame output of our indices above:
- rHI: The correlation between the constrained index and the true net genetic merit ($H$). Each added restriction can only lower this correlation or leave it unchanged, because the unrestricted LPSI already maximizes it.
- PRE: Shows the expected genetic advance relative to a base index. For the same economic weights, a constrained index cannot show a higher PRE than its unconstrained LPSI counterpart.
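The loss in rHI can be illustrated with the standard correlation between an index $I = \mathbf{b}'\mathbf{y}$ and $H = \mathbf{w}'\mathbf{g}$, namely $\mathbf{w}'\mathbf{Gb} / \sqrt{\mathbf{w}'\mathbf{Gw}\,\mathbf{b}'\mathbf{Pb}}$. The base R sketch below uses hypothetical toy matrices and a helper named cor_HI introduced only for this illustration; it does not reproduce the package's summary output.

```r
# Hypothetical toy inputs (not taken from maize_pheno)
G <- matrix(c(4.0, 1.2, 0.8,
              1.2, 3.0, 0.5,
              0.8, 0.5, 2.0), nrow = 3, byrow = TRUE)
P <- G + diag(c(3.0, 2.0, 2.5))
w <- c(1, -1, 1)

# Illustrative helper: correlation between index b'y and merit w'g
cor_HI <- function(b) {
  num <- as.numeric(t(w) %*% G %*% b)
  den <- sqrt(as.numeric(t(w) %*% G %*% w) * as.numeric(t(b) %*% P %*% b))
  num / den
}

P_inv <- solve(P)
b     <- P_inv %*% G %*% w                  # unrestricted LPSI

# RLPSI with a null restriction on trait 2
C   <- G %*% matrix(c(0, 1, 0), ncol = 1)
Q   <- P_inv %*% C %*% solve(t(C) %*% P_inv %*% C) %*% t(C)
b_R <- (diag(3) - Q) %*% b

rho_lpsi  <- cor_HI(b)
rho_rlpsi <- cor_HI(b_R)
print(c(LPSI = rho_lpsi, RLPSI = rho_rlpsi))
```

The restricted value cannot exceed the unrestricted one; the gap between the two is the cost of the restriction in terms of accuracy.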
By default, rlpsi() and ppg_lpsi() accept user-friendly arguments like restricted_traits = c(2) or k = c(2, 1, 1) to automatically construct the mathematical constraint matrices behind the scenes.
However, advanced users may wish to manually define these matrices, particularly for complex scenarios.
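For the RLPSI, the constraint matrix passed as C is a $t \times r$ matrix (where $t$ is the number of traits and $r$ is the number of restrictions). Each column represents one restricted trait, with a 1 in the row of that trait and 0s elsewhere. Note that this 0/1 matrix plays the role of $\mathbf{U}$ in the theory above, where $\mathbf{C}' = \mathbf{U}'\mathbf{G}$: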
    # Manually restrict traits #1 and #3:
    # Create a 3x2 constraint matrix (filled column by column)
    C_matrix <- matrix(
      c(
        1, 0, 0, # Restrict trait 1
        0, 0, 1  # Restrict trait 3
      ),
      nrow = 3, ncol = 2
    )

    # Pass directly to RLPSI
    rlpsi_manual <- rlpsi(pmat = pmat, gmat = gmat, wmat = wmat, C = C_matrix)
    print(rlpsi_manual$Delta_G)
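For more than a handful of traits, typing the 0/1 entries by hand is error-prone. A small base R sketch of an alternative is to take the relevant columns of an identity matrix; the variable names n_traits and restricted are illustrative and not part of the package.

```r
n_traits   <- 3
restricted <- c(1, 3)   # indices of the traits to restrict

# Columns 1 and 3 of the identity matrix; drop = FALSE keeps a matrix
# even when only one trait is restricted
C_from_index <- diag(n_traits)[, restricted, drop = FALSE]

# Same matrix as the hand-written version (filled column by column)
C_by_hand <- matrix(c(1, 0, 0,
                      0, 0, 1), nrow = 3, ncol = 2)
print(C_from_index)
```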
The ppg_lpsi() function automatically manages the complex proportional constraint matrix based on your inputs $\mathbf{k}$, guaranteeing the proportional relationship $\Delta \mathbf{G} = \phi\mathbf{k}$.
Although these indices are powerful predictive tools, their real-world applicability is subject to the limitations established by classical breeding theory (Hazel, 1943).
The selection.index package provides highly flexible options beyond the simple unrestricted index. The RLPSI is useful when strictly zero-change bounds are needed on specific traits. The PPG-LPSI caters to breeders focusing on a strict multi-trait improvement ratio. Finally, the DG-LPSI provides a powerful alternative for scenarios where economic weights are notoriously difficult to estimate, but ideal target gains are known. When using these constrained models, breeders must continuously balance their specific phenotypic goals against the intrinsic statistical costs to overall genetic efficiency.