makeGRM | R Documentation |
Create a genomic relationship matrix (GRM) object from an RA object, performing standard filtering and computing the statistics required for constructing GRMs.
makeGRM(
RAobj,
samfile,
filter = list(MAF = NULL, MISS = NULL, BIN = 100, MAXDEPTH = 500)
)
RAobj |
Object of class RA created via the |
samfile |
Character string giving the name of the file that contains the sample information of the population. See below for details. |
filter |
Named list of thresholds for various filtering criteria. See below for details. |
This function converts an RA object into a GRM object. A GRM object is an R6 object that contains RA data, various statistics related to GRM analyses, and functions (methods) for analyzing GRMs.
The sample information specified in the samfile argument should be a CSV file. The first column gives the ID of each sample (these must match the IDs in the RA object supplied in the RAobj argument) and the second column gives the ploidy level of each individual. Additional columns can be added to give more information about the samples (e.g., cultivar, location, population). For example,
ID | Ploidy | Group |
HE109 | 2 | Wild |
PE202 | 4 | Wild |
PE243 | 4 | Domesticated |
In this example there are three individuals. The first (HE109) is a diploid and belongs to the Wild group, the second (PE202) is a tetraploid that also belongs to the Wild group, and the third (PE243) is also a tetraploid but belongs to the Domesticated group. The first two columns must be named "ID" and "Ploidy" respectively. Any names can be used for the remaining columns, although meaningful names without spaces are recommended. Only the first two columns are required; extra columns are optional but can be used later.
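As a minimal sketch, the sample file from the example above could be written and checked with base R as follows. The temporary file path and the object names (sam, chk) are illustrative assumptions only; the column layout follows the description above.

```r
# Build the three-sample example as a data frame.
# Column names "ID" and "Ploidy" are required; "Group" is an optional extra.
sam <- data.frame(
  ID     = c("HE109", "PE202", "PE243"),
  Ploidy = c(2, 4, 4),
  Group  = c("Wild", "Wild", "Domesticated")
)

# Write it as a CSV file without row names (a temporary file is used here).
samfile <- tempfile(fileext = ".csv")
write.csv(sam, samfile, row.names = FALSE, quote = FALSE)

# Read it back and check the layout described above.
chk <- read.csv(samfile, stringsAsFactors = FALSE)
stopifnot(
  identical(names(chk)[1:2], c("ID", "Ploidy")),
  is.numeric(chk$Ploidy)
)
```

The resulting path in samfile is what would be passed to the samfile argument.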
The filtering criteria currently implemented are:
Minor allele frequency (MAF): SNPs are discarded if their MAF is less than the threshold (default is NULL).
Proportion of missing data (MISS): SNPs are discarded if the proportion of individuals with no reads (i.e., a missing genotype) is greater than the threshold value (default is NULL).
Bin size for SNP selection (BIN): SNPs are binned together if the distance (in base pairs) between them is less than the threshold value (default is 100). One SNP is then randomly selected from each bin and retained for final analysis. This filtering ensures that there is only one SNP on each sequence read.
Maximum average SNP depth (MAXDEPTH): SNPs with an average read depth above the threshold value are discarded (default is 500).
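The BIN criterion can be illustrated with a rough base R sketch. This is not the package's implementation: the helper name select_one_per_bin is hypothetical, the positions are made up, and it assumes that all SNPs lie on one chromosome and that a new bin starts whenever the gap to the previous SNP (in sorted order) is at least the threshold.

```r
# Hypothetical helper (not part of the package): pick one SNP per bin.
# pos: SNP positions in base pairs on a single chromosome; bin: BIN threshold.
select_one_per_bin <- function(pos, bin = 100) {
  if (length(pos) == 0) return(integer(0))
  ord <- order(pos)
  # Assumption: a new bin starts when the gap to the previous SNP is >= bin.
  bin_id <- cumsum(c(TRUE, diff(pos[ord]) >= bin))
  # Randomly keep one SNP index from each bin.
  keep <- vapply(split(ord, bin_id), function(idx) {
    idx[sample.int(length(idx), 1)]
  }, integer(1))
  sort(unname(keep))
}

set.seed(1)
pos <- c(1000, 1040, 1090, 5000, 5020, 9000)  # made-up positions
kept <- select_one_per_bin(pos, bin = 100)
```

With these positions there are three bins (the first three SNPs, the next two, and the last one), so kept holds one index from each.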
If a filtering criterion is set to NULL, then no filtering with regard to that threshold is applied.
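The way NULL switches off a criterion can be sketched in base R for the three per-SNP thresholds. Again, this is only an illustration under stated assumptions: filter_snps is a hypothetical helper, the read-count matrices are made up, BIN is not handled here, and the allele frequency is estimated crudely from pooled read counts, which may differ from how the package estimates it.

```r
# Hypothetical helper (not part of the package): apply per-SNP thresholds.
# ref, alt: read-count matrices (rows = individuals, columns = SNPs).
filter_snps <- function(ref, alt,
                        filter = list(MAF = NULL, MISS = NULL, MAXDEPTH = 500)) {
  depth <- ref + alt
  keep <- rep(TRUE, ncol(depth))
  if (!is.null(filter$MAF)) {
    # Assumption: allele frequency taken from pooled read counts.
    p <- colSums(ref) / colSums(depth)
    maf <- pmin(p, 1 - p)
    keep <- keep & !is.na(maf) & maf >= filter$MAF
  }
  if (!is.null(filter$MISS)) {
    # Proportion of individuals with no reads at each SNP.
    keep <- keep & colMeans(depth == 0) <= filter$MISS
  }
  if (!is.null(filter$MAXDEPTH)) {
    keep <- keep & colMeans(depth) <= filter$MAXDEPTH
  }
  which(keep)
}

# Made-up counts for 4 individuals and 4 SNPs (filled column by column).
ref <- matrix(c(5, 3, 4, 2,  10, 8, 9, 7,  3, 0, 0, 0,  600, 700, 800, 500), nrow = 4)
alt <- matrix(c(4, 3, 0, 6,   0, 0, 0, 1,  2, 0, 0, 0,  500, 600, 700, 650), nrow = 4)

strict <- filter_snps(ref, alt, list(MAF = 0.05, MISS = 0.5, MAXDEPTH = 500))
none   <- filter_snps(ref, alt, list(MAF = NULL, MISS = NULL, MAXDEPTH = NULL))
```

In this toy data the second SNP fails the MAF threshold, the third fails MISS and the fourth fails MAXDEPTH, so strict retains only the first SNP, while none retains all four because every criterion is NULL.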
An R6 object of class GRM.
Timothy P. Bilton
GRM