makeGRM: Make a GRM object


makeGRM    R Documentation

Make a GRM object

Description

Create a genomic relationship matrix (GRM) object from an RA object, performing standard filtering and computing the statistics required for constructing GRMs.

Usage

makeGRM(
  RAobj,
  samfile,
  filter = list(MAF = NULL, MISS = NULL, BIN = 100, MAXDEPTH = 500)
)

Arguments

RAobj

Object of class RA created via the readRA function.

samfile

Character string giving the name of the file that contains the sample information of the population. See below for details.

filter

Named list of thresholds for various filtering criteria. See below for details.

Details

This function converts an RA object into a GRM object. A GRM object is an R6 object that contains the RA data, various statistics related to GRM analyses, and functions (methods) for analyzing GRMs.

The sample information file named in the samfile argument should be a CSV file. The first column gives the ID of each sample (these must match the IDs in the RA object supplied in the RAobj argument) and the second column gives the ploidy level of each individual. Additional columns can be added to give more information about the samples (e.g., cultivar, location, population). For example,

ID      Ploidy   Group
HE109   2        Wild
PE202   4        Wild
PE243   4        Domesticated

In this example, there are three individuals. The first (HE109) is a diploid and belongs to the Wild group, the second (PE202) is a tetraploid that also belongs to the Wild group, and the third (PE243) is also a tetraploid but belongs to the Domesticated group. The first two columns are required and must be named "ID" and "Ploidy" respectively. Any further columns are optional and can be given any name, although meaningful names without spaces are recommended. These extra columns can be used later.
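As a minimal sketch, the example sample file above could be written from R using base functions only. The temporary file path and the object names (sample_info, samfile, check) are assumptions made for illustration; in practice the IDs must match those in the RA object.

```r
# Build the three-sample example as a data frame.
# Only "ID" and "Ploidy" are required column names; "Group" is optional.
sample_info <- data.frame(
  ID     = c("HE109", "PE202", "PE243"),
  Ploidy = c(2, 4, 4),
  Group  = c("Wild", "Wild", "Domesticated"),
  stringsAsFactors = FALSE
)

# Write it as a comma-separated file (a temporary path is used here).
samfile <- tempfile(fileext = ".csv")
write.csv(sample_info, samfile, row.names = FALSE, quote = FALSE)

# Read the file back to confirm that the first two columns are named as required.
check <- read.csv(samfile, stringsAsFactors = FALSE)
stopifnot(identical(names(check)[1:2], c("ID", "Ploidy")))
```

The resulting path in samfile is the kind of value that would be passed to the samfile argument.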

The filtering criteria currently implemented are:

  • Minor allele frequency (MAF): SNPs are discarded if their MAF is less than the threshold (default is NULL).

  • Proportion of missing data (MISS): SNPs are discarded if the proportion of individuals with no reads (i.e., a missing genotype) is greater than the threshold value (default is NULL).

  • Bin size for SNP selection (BIN): SNPs are binned together if the distance (in base pairs) between them is less than the threshold value (default is 100). One SNP is then randomly selected from each bin and retained for the final analysis. This filter is intended to ensure that only one SNP is retained per sequence read.

  • Maximum average SNP depth (MAXDEPTH): SNPs with an average read depth above the threshold value are discarded (default is 500).

If a filtering criterion is set to NULL, no filtering is applied for that criterion.
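The MISS, MAXDEPTH and BIN criteria can be illustrated on toy data with base R. This is only a sketch under stated assumptions and not the package's internal implementation: the depth matrix, the positions, the order in which the filters are applied, and the rule that a new bin starts whenever the gap to the previous retained SNP is at least BIN are all assumptions. The MAF filter is omitted because it depends on how allele frequencies are estimated.

```r
set.seed(1)  # makes the random SNP choice within bins reproducible

# Toy data: total read depth for 4 individuals (rows) at 6 SNPs (columns),
# with SNP positions in base pairs on a single chromosome.
depth <- matrix(
  c(3, 2, 10, 900, 5, 0,
    4, 3, 12, 800, 6, 2,
    0, 1,  9, 700, 7, 0,
    5, 4, 11, 600, 8, 0),
  nrow = 4, byrow = TRUE
)
pos <- c(100, 150, 400, 1000, 1050, 5000)

# Illustrative thresholds (MISS is NULL by default in makeGRM).
MISS <- 0.5
BIN <- 100
MAXDEPTH <- 500

# MISS: proportion of individuals with zero reads at each SNP.
prop_missing <- colMeans(depth == 0)
# MAXDEPTH: average read depth at each SNP.
mean_depth <- colMeans(depth)

# Keep SNPs that pass both thresholds.
keep <- prop_missing <= MISS & mean_depth <= MAXDEPTH

# BIN: order the retained SNPs by position and start a new bin whenever
# the gap to the previous retained SNP is at least BIN base pairs.
kept_idx <- which(keep)
kept_idx <- kept_idx[order(pos[kept_idx])]
bin_id <- cumsum(c(TRUE, diff(pos[kept_idx]) >= BIN))

# Randomly select one SNP per bin. sample.int() is used so that a bin
# holding a single SNP is handled correctly.
pick_one <- function(idx) idx[sample.int(length(idx), 1)]
selected <- unname(vapply(split(kept_idx, bin_id), pick_one, integer(1)))
```

Here selected holds the column indices of the SNPs that survive all three filters. SNPs 1 and 2 lie 50 bp apart, so they share a bin and only one of them is retained.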

Value

An R6 object of class GRM.
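A typical call might look like the following sketch. The file paths are hypothetical placeholders, the MAF and MISS thresholds are illustrative values rather than recommendations, and readRA is assumed to be available from the GUSbase package. The call is guarded so that it only runs when both packages are installed and both files exist.

```r
# Hypothetical input paths; replace these with real files.
ra_path  <- "path/to/data.ra.tab"
sam_path <- "path/to/sample_info.csv"

# All four criteria are given explicitly; set an entry to NULL to skip it.
filter_settings <- list(MAF = 0.05, MISS = 0.5, BIN = 100, MAXDEPTH = 500)

have_pkgs <- requireNamespace("GUSbase", quietly = TRUE) &&
  requireNamespace("GUSrelate", quietly = TRUE)
have_files <- file.exists(ra_path) && file.exists(sam_path)

if (have_pkgs && have_files) {
  # Read the RA data, then convert it to a GRM object.
  ra  <- GUSbase::readRA(ra_path)
  grm <- GUSrelate::makeGRM(RAobj = ra, samfile = sam_path,
                            filter = filter_settings)
  print(class(grm))
}
```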

Author(s)

Timothy P. Bilton

See Also

GRM


tpbilton/GUSrelate documentation built on Feb. 20, 2025, 4:35 p.m.