knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
In plant and animal breeding, quantitative traits (QTs) are expressions of genes distributed across the genome interacting with the environment. The phenotypic value of QTs ($y$) can be systematically partitioned into a genotypic component ($g$) and an environmental component ($e$):
$$ y = g + e $$
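The decomposition can be illustrated with a small simulation. This is a minimal sketch in base R, assuming hypothetical variances (4 for $g$, 9 for $e$) and independent draws; none of these values come from the package datasets.

```r
# Simulated illustration of y = g + e (all values are hypothetical)
set.seed(1)
n <- 1000
g <- rnorm(n, mean = 0, sd = 2)  # genotypic values, variance 4 (assumed)
e <- rnorm(n, mean = 0, sd = 3)  # environmental deviations, variance 9 (assumed)
y <- g + e                       # phenotypic values

# The sample variance of y decomposes exactly into these three terms
var_y   <- var(y)
var_sum <- var(g) + var(e) + 2 * cov(g, e)

# Share of phenotypic variance attributable to g in this simulated sample
h2 <- var(g) / var(y)
```

When $g$ and $e$ are independent, the covariance term is close to zero and the phenotypic variance is approximately the sum of the genotypic and environmental variances.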
The primary goal in breeding is to maximize an individual's net genetic merit. The net genetic merit ($H$) is a linear combination of the unobservable true breeding values ($\mathbf{g}$) weighted by their respective economic values ($\mathbf{w}$):
$$ H = {\mathbf{w}}^{\prime}\mathbf{g} $$
Because the net genetic merit is unobservable in field trials, breeders construct a Linear Phenotypic Selection Index (LPSI) to predict it. The LPSI ($I$) is a linear combination of the observable phenotypic trait values ($\mathbf{y}$) weighted by the index coefficients ($\mathbf{b}$):
$$ I = {\mathbf{b}}^{\prime}\mathbf{y} $$
The objective of the LPSI is to predict the net genetic merit and maximize the multi-trait selection response.
To identify the best parents for the next selection cycle, the correlation between the net genetic merit ($H$) and the LPSI ($I$) must be maximized. The vector $\mathbf{b}$ that both minimizes the mean squared difference between $I$ and $H$ and maximizes this correlation is:
$$ \mathbf{b} = {\mathbf{P}}^{-1}\mathbf{Gw} $$
where:

* $\mathbf{P}$ is the phenotypic variance-covariance matrix.
* $\mathbf{G}$ is the genotypic variance-covariance matrix.
* $\mathbf{w}$ is the vector of economic weights defining relative trait importance.
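As a worked illustration of this formula, the coefficients can be computed directly in base R. This is only a sketch: the matrices `P` and `G` below are hypothetical values invented for the example, not estimates from any dataset.

```r
# Hypothetical 3-trait covariance matrices (illustrative values only)
P <- matrix(c(10.0, 2.0, 1.0,
               2.0, 8.0, 1.5,
               1.0, 1.5, 6.0), nrow = 3, byrow = TRUE)
G <- matrix(c(4.0, 1.0, 0.5,
              1.0, 3.0, 0.5,
              0.5, 0.5, 2.0), nrow = 3, byrow = TRUE)
w <- c(10, -5, -5)  # economic weights

# b = P^{-1} G w; solve(P, x) solves P b = x without forming the inverse
b <- solve(P, G %*% w)
b
```

Solving the linear system $\mathbf{Pb} = \mathbf{Gw}$ is generally preferable to computing ${\mathbf{P}}^{-1}$ explicitly, because it is numerically more stable.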
Once these optimal coefficients are derived, we can evaluate two fundamental parameters:
* The Maximized Selection Response ($R_I$): The expected mean improvement in the net genetic merit due to indirect selection on the index.
$$ {R}_I = {k}_I\sqrt{{\mathbf{b}}^{\prime}\mathbf{Pb}} $$
* The Expected Genetic Gain Per Trait ($\mathbf{E}$): The multi-trait selection response broken down per individual trait.
$$ \mathbf{E} = {k}_I\frac{\mathbf{Gb}}{\sigma_I} $$
where $k_I$ is the standardized selection intensity and $\sigma_I = \sqrt{{\mathbf{b}}^{\prime}\mathbf{Pb}}$ is the standard deviation of the index.
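Both parameters can be evaluated by hand for a small example. The sketch below uses base R, the same kind of hypothetical `P` and `G` matrices (invented values), and an assumed selection intensity of 2.063, which corresponds to selecting roughly the top 5% of candidates.

```r
# Hypothetical 3-trait covariance matrices (illustrative values only)
P <- matrix(c(10.0, 2.0, 1.0,
               2.0, 8.0, 1.5,
               1.0, 1.5, 6.0), nrow = 3, byrow = TRUE)
G <- matrix(c(4.0, 1.0, 0.5,
              1.0, 3.0, 0.5,
              0.5, 0.5, 2.0), nrow = 3, byrow = TRUE)
w <- c(10, -5, -5)          # economic weights
b <- solve(P, G %*% w)      # optimal LPSI coefficients

k_I <- 2.063                # assumed intensity (about 5% selected)

# Standard deviation of the index: sigma_I = sqrt(b' P b)
sigma_I <- sqrt(drop(t(b) %*% P %*% b))

# Maximized selection response and expected gain per trait
R_I <- k_I * sigma_I
E   <- k_I * drop(G %*% b) / sigma_I

# At the optimum, the economically weighted per-trait gains add up to R_I
weighted_gain <- sum(w * E)
```

The last line is a useful sanity check: because $\mathbf{Pb} = \mathbf{Gw}$ at the optimum, ${\mathbf{w}}^{\prime}\mathbf{E}$ equals $R_I$.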
We can put this theory into practice using the selection.index package. We will use the built-in synthetic datasets: maize_pheno (multi-environment phenotypic records for 100 genotypes) and maize_geno (500 SNP markers).
First, we estimate the genotypic ($\mathbf{G}$) and phenotypic ($\mathbf{P}$) variance-covariance matrices from our raw phenotypic dataset.
library(selection.index)

# Load the synthetic phenotypic multi-environment dataset
data("maize_pheno")

# In maize_pheno: Traits are columns 4:6.
# Genotypes are in column 1, and Block/Replication is in column 3.
gmat <- gen_varcov(data = maize_pheno[, 4:6],
                   genotypes = maize_pheno[, 1],
                   replication = maize_pheno[, 3])
pmat <- phen_varcov(data = maize_pheno[, 4:6],
                    genotypes = maize_pheno[, 1],
                    replication = maize_pheno[, 3])
Next, we establish the relative economic priority of each trait. Economic weights ($\mathbf{w}$) explicitly define our strategic breeding objectives.
# Define the economic weights for the 3 continuous traits
# (e.g., Yield, PlantHeight, DaysToMaturity)
weights <- c(10, -5, -5)
With the covariance matrices and economic weights specified, we integrate them into the primary lpsi() function, which evaluates the combinatorial multi-trait selection indices efficiently.
# Calculate the Optimal Combinatorial Linear Phenotypic Selection Index (LPSI)
index_results <- lpsi(
  ncomb = 3,
  pmat = pmat,
  gmat = gmat,
  wmat = as.matrix(weights),
  wcol = 1
)
Finally, we evaluate the theoretical gains. The lpsi() function returns a structured data frame containing the theoretical selection response ($R_I$) and other parameter estimates for all requested trait combinations.
# View the top combinatorial indices, including their selection response (R_A)
head(index_results)

# Extract the phenotypic selection scores to strategically rank the parental candidates
# using the top evaluated combinatorial index
scores <- predict_selection_score(
  index_results,
  data = maize_pheno[, 4:6],
  genotypes = maize_pheno[, 1]
)

# View the top performing candidates designated for the next breeding cycle
head(scores)
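Conceptually, ranking candidates amounts to evaluating $I = {\mathbf{b}}^{\prime}\mathbf{y}$ for each genotype and sorting. The base R sketch below shows that idea only; it is not the internal implementation of predict_selection_score(), and the trait means and coefficients are hypothetical values made up for illustration.

```r
# Hypothetical genotype means for three traits (rows are genotypes)
trait_means <- matrix(
  c(8.2, 210, 118,
    7.5, 195, 112,
    9.1, 220, 125,
    6.8, 190, 110,
    8.7, 205, 115),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("G", 1:5),
                  c("Yield", "PlantHeight", "DaysToMaturity"))
)

# Hypothetical index coefficients (not output from lpsi())
b <- c(0.9, -0.2, -0.3)

# Index score per genotype: I = b' y
scores_manual <- drop(trait_means %*% b)

# Rank candidates from highest to lowest index score
ranking <- sort(scores_manual, decreasing = TRUE)
ranking
```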
Classical linear selection index theory extends to marker-assisted selection. If you have genome-wide marker profiles for your genotypes, you can incorporate them to estimate the Linear Marker Selection Index (LMSI).
# Load the associated synthetic genomic dataset (500 SNPs for the 100 genotypes)
data("maize_geno")

# Calculate the marker-assisted index combining our matrices and raw SNP profiles
marker_index_results <- lmsi(
  pmat = pmat,
  gmat = gmat,
  marker_scores = maize_geno,
  wmat = weights
)

summary(marker_index_results)
When the phenotypic ($\mathbf{P}$) and genotypic ($\mathbf{G}$) matrices are poorly estimated (e.g., due to limited data), the estimated coefficients ($\mathbf{b}$) can be biased. The Base Index provides a robust, non-optimized alternative in which the coefficients are set equal to the fixed economic weights ($I_B = \mathbf{w}'\mathbf{y}$).
# Calculate the Base Index and automatically compare its efficiency to the LPSI
base_results <- base_index(
  pmat = pmat,
  gmat = gmat,
  wmat = weights,
  compare_to_lpsi = TRUE
)

# Observe the expected genetic gains and efficiency comparison
base_results$summary
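The comparison can also be reasoned through by hand. The sketch below, in base R with hypothetical `P` and `G` matrices (invented values), computes the response in net genetic merit for any coefficient vector as $k_I\,{\mathbf{w}}^{\prime}\mathbf{Gb} / \sqrt{{\mathbf{b}}^{\prime}\mathbf{Pb}}$ and evaluates it for the Base Index ($\mathbf{b} = \mathbf{w}$) and for the LPSI. The helper `response()` is a name introduced here for illustration, not a package function.

```r
# Hypothetical 3-trait covariance matrices (illustrative values only)
P <- matrix(c(10.0, 2.0, 1.0,
               2.0, 8.0, 1.5,
               1.0, 1.5, 6.0), nrow = 3, byrow = TRUE)
G <- matrix(c(4.0, 1.0, 0.5,
              1.0, 3.0, 0.5,
              0.5, 0.5, 2.0), nrow = 3, byrow = TRUE)
w   <- c(10, -5, -5)  # economic weights
k_I <- 2.063          # assumed intensity (about 5% selected)

# Response in net genetic merit H when selecting on the index I = b' y
# (hypothetical helper, not part of selection.index)
response <- function(b, P, G, w, k) {
  b <- as.numeric(b)
  k * sum(w * (G %*% b)) / sqrt(sum(b * (P %*% b)))
}

b_lpsi <- solve(P, G %*% w)  # optimal coefficients
b_base <- w                  # Base Index: coefficients equal the weights

R_lpsi <- response(b_lpsi, P, G, w, k_I)
R_base <- response(b_base, P, G, w, k_I)

# Relative efficiency of the Base Index with respect to the LPSI
efficiency <- R_base / R_lpsi
```

When $\mathbf{P}$ and $\mathbf{G}$ are known exactly, this ratio cannot exceed 1. The practical argument for the Base Index applies when the matrices are estimated with error, which this sketch does not simulate.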
The theory shows that the squared correlation between the net genetic merit ($H$) and the index ($I$) is not the same quantity as the heritability of the index ($h^2_I \neq \rho^2_{HI}$). The lpsi() function reports both statistics:
# Extract the top combinatorial index results
top_index <- index_results[1, ]

# h^2_I: Heritability of the optimal index
top_index$hI2

# \rho_HI: Correlation between the LPSI and the true underlying Net Genetic Merit
top_index$rHI
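To see why the two statistics differ, it may help to compute them from their standard definitions: $h^2_I = {\mathbf{b}}^{\prime}\mathbf{Gb} / {\mathbf{b}}^{\prime}\mathbf{Pb}$ and $\rho_{HI} = {\mathbf{w}}^{\prime}\mathbf{Gb} / \sqrt{{\mathbf{w}}^{\prime}\mathbf{Gw}\;{\mathbf{b}}^{\prime}\mathbf{Pb}}$. The base R sketch below uses hypothetical matrices (invented values) and is not taken from the internals of lpsi().

```r
# Hypothetical 3-trait covariance matrices (illustrative values only)
P <- matrix(c(10.0, 2.0, 1.0,
               2.0, 8.0, 1.5,
               1.0, 1.5, 6.0), nrow = 3, byrow = TRUE)
G <- matrix(c(4.0, 1.0, 0.5,
              1.0, 3.0, 0.5,
              0.5, 0.5, 2.0), nrow = 3, byrow = TRUE)
w <- c(10, -5, -5)                   # economic weights
b <- as.numeric(solve(P, G %*% w))   # optimal LPSI coefficients

bPb <- sum(b * (P %*% b))  # variance of the index I
bGb <- sum(b * (G %*% b))  # genetic variance of the index
wGw <- sum(w * (G %*% w))  # variance of the net genetic merit H
wGb <- sum(w * (G %*% b))  # covariance between H and I

# Heritability of the index
h2_I <- bGb / bPb

# Correlation between H and I, from the general definition
rho_HI <- wGb / sqrt(wGw * bPb)

# At the optimum, Cov(H, I) = b' P b, so the correlation simplifies
rho_HI_opt <- sqrt(bPb / wGw)

c(h2_I = h2_I, rho_HI_squared = rho_HI^2)
```

The heritability compares genetic and phenotypic variance of the index itself, whereas $\rho_{HI}$ relates the index to the economically weighted breeding objective, so the two generally take different values.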