knitr::opts_chunk$set( collapse = TRUE, comment = NA, warning = FALSE, message = FALSE )
library(FielDHub) library(magrittr) library(knitr) library(kableExtra)
This vignette shows how to generate un-replicated designs leveraging the sparse allocation method
by using the FielDHub Shiny App and the scripting function sparse_allocation()
from the FielDHub
R package.
Sparse allocation is a valuable strategy in plant breeding experiments, as it allows researchers to evaluate a large number of treatments over multiple locations in a single experiment. Sparse allocation can increase efficiency and reduce the number of experimental units required, making it a cost-effective option. One standard method for implementing sparse allocation in plant breeding experiments is incomplete block designs (IBD) [@EdmondsonIBDs].
The following key points summarize the advantages and disadvantages of sparse allocation [@MontesinosLopezSparse]:
Increased efficiency: By using sparse allocation, breeders can evaluate a large number of genotypes or treatments in a single experiment across multiple environments, which can accelerate the breeding program and reduce the time and resources needed for evaluation.
Selection intensity: The large number of genotypes or treatments evaluated in sparse allocation experiments can increase the genetic diversity in the breeding program and increase the chances of identifying superior genotypes or treatments.
Cost-effective: Sparse allocation experiments are generally less expensive compared to fully replicated experiments since fewer experimental units are needed.
Less accurate predictions: The limited number of experimental units means that the estimates of treatment effects are less precise compared to fully replicated designs. However, an increase in selection intensity may compensate for the loss of accuracy (Trade off problem).
FielDHub includes a function to run the sparse allocation strategy and the multi-location randomization, as well as an interface for creating a sparse allocation design on the FielDHub app.
The plant breeding project aims to test 260 entries across five environments, but due to limited seed availability, only four replications for each genotype can be created across all five locations. As a result, not all genotypes will be present in all environments. Additionally, the project includes four checks that will be replicated in all environments. To address the seed shortage, the sparse allocation strategy will be used.
The table below illustrates the allocation of the first ten genotypes across the five environments.
allocation <- data.frame( ENV1 = c(1, 0, 1, 1, 1, 1, 1, 0, 1, 1), ENV2 = c(1, 1, 1, 1, 1, 1, 0, 1, 1, 1), ENV3 = c(1, 1, 0, 1, 1, 0, 1, 1, 1, 1), ENV4 = c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0), ENV5 = c(0, 1, 1, 0, 0, 1, 1, 1, 1, 1) ) rownames(allocation) <- paste0("Gen-", 1:10)
allocation %>% kbl() %>% kable_styling()
The table illustrates the allocation of genotypes across different environments, with genotypes listed in rows and environments in columns. Specifically, it indicates that Genotype 1 (Gen-1) has been assigned to locations 1, 2, 3, and 4, but not to environment 5. The process of allocating genotypes to locations is achieved through an optimization process that employs IBD principles.
To launch the app you need to run either
FielDHub::run_app()
or
library(FielDHub) run_app()
Once the app is running, go to Unreplicated Designs > Sparse Allocation
Then, follow the following steps where we will show how to generate a sparse allocation experiment.
Import entries' list? Choose whether to import a list with entry numbers and names for genotypes
or treatments.
If the selection is No
, the app will generate synthetic data for entries and
names of the treatment/genotypes based on the user inputs.
If the selection is Yes
, the entries list must fulfill a specific format and must be a .csv
file. The file must have the columns ENTRY
and NAME
. The ENTRY
column must have a unique
integer number entry for each treatment/genotype. The column NAME
must have a unique name
that identifies each treatment/genotype. Both ENTRY
and NAME
must be unique, duplicates
are not allowed. The following table shows an example of the entries list format.
Any checks must appear in the first rows of the .csv
file.
ENTRY <- 1:9 NAME <- c(c("CHECK1","CHECK2","CHECK3"), c(paste0("Genotype", LETTERS[1:6]))) df <- data.frame(ENTRY,NAME)
df %>% kbl() %>% kable_styling()
Enter the number of entries/treatments in the Input # of Entries box, which is 260 in our case.
Select 4 from the drop-down on the Input # of Checks box.
Since we want to run this experiment over 5 locations, set Input # of Locations to 5.
Set the number of copies of each treatment in the # of Copies Per Entry dropdown box to 4.
Select serpentine
or cartesian
in the Plot Order Layout. For this example we will use
the serpentine
layout.
To ensure that randomizations are consistent across sessions, we can set a seed number in the box
labeled Random Seed. For instance, we will set it to 16
.
Enter the name for the experiment in the Input Experiment Name box. For example, SparseTest2023
.
Enter the starting plot number in the Starting Plot Number box. In this experiment we want the plot
start at 1, 1001, 2001, 3001, 4001
for each location.
Enter the name of the site/location in the Input the Location box. For this experiment we will
set the sites as FARGO, CASSELTON, MINOT, PROSPER, WILLISTON
.
Once we have entered all the information for our experiment on the left side panel, click the Run! button to run the design.
You will then be prompted to select the dimensions of the field from the list of options in the
drop-down in the middle of the screen with the box labeled Select dimensions of field.
In our case, we will select 16 x 15
. We also can see the table with the sparse allocation.
Click the Randomize! button to randomize the experiment with the set field dimensions and to see the output plots.
If you change any of the inputs on the left side panel after running an experiment initially, you have to click the Run and Randomize buttons again, to re-run with the new inputs.
After you run a sparse allocation design in FielDHub and set the dimensions of the field, there are several ways to display the information about the sparse process and the randomization.
On the first tab, Expt Design Info, you can see all the entries in the randomization displayed in a binary matrix with a column for each location, with a 1 indicating that the respective genotype is in the respective location, and a 0 indicating that it is not. This is the sparse genotype allocation characteristic of this method. There are buttons to copy, print, and save the table to an Excel file.
The Randomized Field tab displays a graphical representation of the randomization of the entries in a field of the specified dimensions. The checks are each colored uniquely, showing the number of times they are distributed throughout the field. The display includes numbered labels for the rows and columns. You can copy the field as a table or save it directly as an Excel file with the Copy and Excel buttons at the top.
In the Choose % of Checks: drop-down box, users can play with different options for the total amount of checks in the field.
On the Plot Number Field tab, there is a table display of the field with the plots numbered according to the Plot Order Layout specified, either serpentine or cartesian. You can see the corresponding entries for each plot number in the field book. Like the Randomized Field tab, you can copy the table or save it as an Excel file with the Copy and Excel buttons.
The Field Book displays all the information on the experimental design in a table format. It contains the specific plot number and the row and column address of each entry, as well as the corresponding treatment on that plot. This table is searchable, and we can filter the data in relevant columns.
FielDHub
function: sparse_allocation()
You can run the same design with a function in the FielDHub package, sparse_allocation()
.
First, you need to load the FielDHub
package typing,
library(FielDHub)
Then, you can enter the information describing the above design like this:
sparse_example <- sparse_allocation( lines = 260, l = 5, copies_per_entry = 4, checks = 4, plotNumber = c(1, 1001, 2001, 3001, 4001), locationNames = c("FARGO", "CASSELTON", "MINOT", "PROSPER", "WILLISTON"), exptName = "SparseTest2023", seed = 16 )
sparse_allocation()
above:lines = 260
is the number of genotypes.l = 5
is the number of locations.copies_per_entry = 4
is the number of copies of each entry.checks = 4
is the number of checks.plotNumber = c(1, 1001, 2001, 3001, 4001)
are optional starting plot numberslocationNames = c("FARGO", "CASSELTON", "MINOT", "PROSPER", "WILLISTON")
are optional names for each location.exptName = "SparseTest2023"
is an optional name of the experimentseed = 16
is the random seed number to replicate identical randomizations.sparse_example
objectTo print a summary of the information that is in the object sparse_example
, we can use the generic
function print()
.
The sparse_allocation()
function returns a list of objects, includes all the outputs from the function
diagonal_arrangement() and in addition list_locs
, allocation
, and size_locations
. The list_locs
object
is a list of data frames. Each data frame has two columns; ENTRY
and NAME
with the information to randomize
to each environment. The object allocation
is the binary allocation matrix of genotypes to locations,
and size_locations
is a data frame with a column for each location and a row indicating the size of
the location (number of field plots).
For example, we can display the allocation
object. Let us print the first ten genotypes allocation.
print(head(sparse_example$allocation, 10))
print(head(sparse_example$allocation, 10))
We can manipulate the sparse_allocation object as any other list in R. For example, we can print the design information as following:
print(sparse_example)
which outputs:
print(sparse_example)
sparse_example
objectThe object sparse_example
is a list consisting of all the information displayed in the output tabs in
the FielDHub app: design information, plot layout,
plot numbering, entries list, and field book, indexed by each location in the experiment. These are
accessible by the $
operator, i.e. designs$layoutRandom[[1]]
for LOC1
or designs$fieldBook
for
the whole field book.
designs$fieldBook
is a data frame containing information about every plot in the field, with
information about the location of the plot and the treatment in each plot. As seen in the output below,
the field book has columns for ID
, EXPT
, LOCATION
, YEAR
, PLOT
, ROW
, COLUMN
, CHECKS
,
ENTRY
, and TREATMENT
.
Let us see the first 10 rows of the field book for the first location in this experiment.
field_book <- sparse_example$fieldBook head(field_book, 10)
field_book <- sparse_example$fieldBook head(field_book, 10)
For plotting the layout in function of the coordinates ROW
and COLUMN
in the field book object we
can use the generic function plot()
as follows,
plot(sparse_example, l = 1)
The above plot is for LOC1
. We can plot any location in the experiment, like location 2 in this example:
plot(sparse_example, l = 2)
The figure above shows a map of an experiment randomized as an unreplicated arrangement design. The blue plots represent the unreplicated treatments, while the yellow-boxed colored check plots are replicated throughout the field in a systematic diagonal arrangement. The red plots with 0s are are fillers.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.