makeIC: Make an intercross (IC) population

View source: R/makeIC.R

makeIC    R Documentation

Make an intercross (IC) population

Description

Create an IC object from an RA object, perform standard filtering, and compute statistics specific to intercross populations.

Usage

makeIC(
  RAobj,
  samID = NULL,
  filter = list(MAF = 0.05, MISS = 0.2, BIN = 100, DEPTH = 5, PVALUE = 0.01,
    MAXDEPTH = 500)
)

Arguments

RAobj

Object of class RA created via the readRA function.

samID

Character vector of the sample IDs corresponding to the individuals in the intercross population. If NULL, all individuals in the dataset are assumed to belong to the intercross population.

filter

Named list of thresholds for various filtering criteria. See below for details.

family

Vector of character strings giving the families to retain in the IC object. This allows a pedigree file with more than one family to be supplied.
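As a minimal sketch of how the filter argument might be customised, the snippet below starts from the default thresholds shown under Usage and overrides two of them with base R's modifyList. The name strict_filter and the chosen values are illustrative only. A complete list is passed because this page does not say whether makeIC fills in entries that are left out.

```r
## Default thresholds, copied from the Usage section above
default_filter <- list(MAF = 0.05, MISS = 0.2, BIN = 100, DEPTH = 5,
                       PVALUE = 0.01, MAXDEPTH = 500)

## Override two thresholds (illustrative values) and keep the rest
strict_filter <- modifyList(default_filter, list(MISS = 0.1, DEPTH = 10))
str(strict_filter)

## With GUSMap loaded and an RA object `mkdata` (as in the Examples),
## the call would then be:
## makeIC(mkdata, filter = strict_filter)
```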

Details

This function converts an RA object into an IC (intercross) object. An IC object is an R6 object that contains the RA data, various computed statistics, and functions (methods) for analyzing and performing linkage mapping in an intercross population. The statistics computed and the data filtering are specific to intercross populations and sequencing data.

The filtering criteria currently implemented are:

  • Minor allele frequency (MAF): SNPs are discarded if their MAF is less than the threshold value (default is 0.05).

  • Proportion of missing data (MISS): SNPs are discarded if the proportion of individuals with no reads (i.e., a missing genotype) is greater than the threshold value (default is 0.2).

  • Bin size for SNP selection (BIN): SNPs are binned together if the distance (in base pairs) between them is less than the threshold value (default is 100). One SNP is then randomly selected from each bin and retained for final analysis. This filtering is to ensure that there is only one SNP on each sequence read.

  • Parental read depth (DEPTH): SNPs are discarded if the read depth of either parent is less than the threshold value (default is 5). This filter is to remove SNPs where the parental information is insufficient to infer segregation type accurately.

  • Segregation test P-value (PVALUE): SNPs are discarded if the p-value from a segregation test is smaller than the threshold value (default is 0.01). This filters out SNPs where the segregation type has been inferred incorrectly.
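The MISS, MAF and BIN criteria above can be illustrated on toy data with base R. This is a sketch under stated assumptions, not the code GUSMap runs: the read-count matrices and positions are made up, the MAF is estimated naively from pooled read counts, and binning chains together SNPs whose gap to the previous SNP is below the threshold.

```r
## Toy read counts (hypothetical): rows are individuals, columns are SNPs
ref <- matrix(c(3, 0, 5, 0,
                2, 0, 4, 0,
                0, 0, 6, 1,
                4, 0, 3, 0), nrow = 4, byrow = TRUE)
alt <- matrix(c(1, 0, 0, 0,
                2, 0, 0, 0,
                3, 1, 0, 0,
                0, 0, 0, 0), nrow = 4, byrow = TRUE)
depth <- ref + alt

## MISS: proportion of individuals with no reads at each SNP;
## keep SNPs where this proportion does not exceed 0.2
miss <- colMeans(depth == 0)
keep_miss <- miss <= 0.2

## MAF: naive estimate from pooled read counts (an assumption of this sketch);
## keep SNPs whose MAF is at least 0.05
p_ref <- colSums(ref) / colSums(depth)
maf <- pmin(p_ref, 1 - p_ref)
keep_maf <- maf >= 0.05

## BIN: a new bin starts whenever the gap to the previous SNP is >= 100 bp;
## one SNP index is then drawn at random from each bin
pos <- c(100, 150, 400, 1000)   # sorted base-pair positions (hypothetical)
bin_id <- cumsum(c(TRUE, diff(pos) >= 100))
set.seed(1)
chosen <- tapply(seq_along(pos), bin_id,
                 function(i) i[sample.int(length(i), 1)])
```

Indexing with sample.int(length(i), 1) rather than calling sample(i, 1) avoids the base R behaviour where sample() on a single number n draws from 1:n.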

The segregation type of each SNP is inferred from the genotypes of the parents. A parental genotype is called homozygous for the reference allele if only reference reads are seen, heterozygous if at least one read is seen for each of the reference and alternate alleles, and homozygous for the alternate allele if only alternate reads are seen. As a result, the parental genotype may be inferred incorrectly if the read depth is too low (e.g., a heterozygous genotype is called homozygous because only one of its alleles was sampled), which is why the DEPTH filter is implemented. The segregation test performed for the PVALUE filter is described in the supplementary methods of Bilton et al. (2018) (Section 4 of File S1).
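The parental calling rule can be written as a small function, and its sensitivity to read depth can be quantified. This sketch assumes that each read from a true heterozygote is equally likely to carry either allele and that there are no sequencing errors. The names call_parent and p_miscall are hypothetical and are not part of GUSMap.

```r
## Call a parental genotype from reference and alternate read counts,
## following the rule described above ("A" = reference, "B" = alternate)
call_parent <- function(n_ref, n_alt) {
  if (n_ref + n_alt == 0) return(NA_character_)  # no reads: no call
  if (n_alt == 0) return("AA")                   # only reference reads
  if (n_ref == 0) return("BB")                   # only alternate reads
  "AB"                                           # both alleles seen
}

## Probability that a true heterozygote shows reads for only one allele
## at depth d (all d reads reference, or all d reads alternate)
p_miscall <- function(d) 2 * 0.5^d

call_parent(3, 2)
p_miscall(c(1, 2, 5, 10))
```

Under these assumptions the chance of miscalling a heterozygous parent as homozygous is 0.0625 at a depth of 5 and halves with every additional read.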

Note: Only a single intercross family can be processed at present. There are plans to extend this to multiple families.

Value

An R6 object of class IC

Author(s)

Timothy P. Bilton

References

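Bilton, T.P., Schofield, M.R., Black, M.A., Chagné, D., Wilcox, P.L., and Dodds, K.G. (2018). Accounting for errors in low coverage high-throughput sequencing data when constructing genetic maps using biparental outcrossed populations. Genetics, 209(1), 65-76.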

Examples

## extract filename for Manuka dataset in GUSMap package
vcffile <- Manuka11()

## Convert VCF to RA format
rafile <- VCFtoRA(vcffile$vcf)

## read in the RA data
mkdata <- readRA(rafile)

## Extract IDs that correspond to individuals in the IC population
## (all sample IDs except the first four)
progeny <- mkdata$extractVar("indID")$indID[-c(1:4)]

## Create the IC population
makeIC(mkdata, samID=progeny)

tpbilton/GUSMap documentation built on Feb. 22, 2025, 12:27 p.m.