makeUR: Make an unrelated (UR) population
In tpbilton/GUSbase: Genotyping Uncertainty with Sequencing Data: Base Package

makeUR

R Documentation

Make an unrelated (UR) population

Description

Create a UR object from an RA object, perform standard filtering, and compute statistics specific to unrelated populations.

Usage

makeUR(
  RAobj,
  ploid = 2,
  indsubset = NULL,
  filter = list(MAF = 0.01, MISS = 0.5, BIN = 100, HW = c(-0.05, Inf), MAXDEPTH = 500),
  mafEst = TRUE,
  nThreads = 2
)

Arguments

`RAobj`	Object of class RA created via the `readRA` function.
`ploid`	An integer number specifying the ploidy level of the population. Currently, only a ploidy level of two (diploid) is implemented.
`indsubset`	Integer vector specifying which samples of the RA dataset to retain in the UR population.
`filter`	Named list of thresholds for various criteria used to filter SNPs. See below for details.
`mafEst`	Logical value indicating whether the allele frequencies and sequencing error parameters are to be estimated for each SNP (see details).
`nThreads`	Integer value specifying the number of threads to use in the parallelization. Only used in the estimation of allele frequencies when `mafEst=TRUE`.

Details

If mafEst=TRUE, then the major allele frequency and sequencing error rate for each SNP are estimated by optimizing the likelihood

P(Y=a) = \sum_{G} P(Y=a|G)P(G)

where P(G) are the genotype probabilities under Hardy-Weinberg equilibrium (HWE) and P(Y=a|G) are the probabilities given in Equation (5) of Bilton et al. (2018). Otherwise, the allele frequencies are taken as the mean of the allele ratio (defined as the number of reference reads divided by the total number of reads) and the sequencing error rate is assumed to be zero.
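
As a rough illustration of the two estimators described above, the following R sketch works on simulated read counts for a single SNP. It assumes a binomial read model with a symmetric sequencing error rate for P(Y=a|G). That model form, the helper names `allele_ratio_freq` and `snp_negloglik`, and the toy data are assumptions made for illustration and are not taken from the GUSbase source; Equation (5) of the cited paper remains the authoritative definition.

```r
# Toy read counts for one SNP (hypothetical data): reference and alternate
# reads per individual, simulated under HWE with a small error rate.
set.seed(42)
n      <- 500
p_true <- 0.7                                   # reference allele frequency
geno   <- rbinom(n, size = 2, prob = p_true)    # copies of the reference allele
depth  <- rpois(n, lambda = 5)                  # total reads per individual
p_read <- c(0.01, 0.5, 0.99)[geno + 1]          # P(reference read | genotype)
ref    <- rbinom(n, size = depth, prob = p_read)
alt    <- depth - ref

# Estimate in the style of mafEst = FALSE: mean allele ratio over the
# individuals that have at least one read.
allele_ratio_freq <- function(ref, alt) {
  d <- ref + alt
  mean(ref[d > 0] / d[d > 0])
}

# Negative log-likelihood of sum_G P(Y = a | G) P(G), summed over individuals.
# par holds unconstrained versions of the allele frequency and error rate.
snp_negloglik <- function(par, ref, alt) {
  p     <- plogis(par[1])                       # allele frequency in (0, 1)
  ep    <- 0.5 * plogis(par[2])                 # error rate in (0, 0.5)
  pG    <- c((1 - p)^2, 2 * p * (1 - p), p^2)   # HWE, G = 0, 1, 2 ref copies
  pread <- c(ep, 0.5, 1 - ep)                   # assumed P(reference read | G)
  d     <- ref + alt
  lik <- vapply(seq_along(ref), function(i) {
    sum(dbinom(ref[i], size = d[i], prob = pread) * pG)
  }, numeric(1))
  -sum(log(lik))
}

# Maximize the likelihood numerically (Nelder-Mead via optim)
fit     <- optim(c(0, -3), snp_negloglik, ref = ref, alt = alt)
p_ml    <- plogis(fit$par[1])
ep_ml   <- 0.5 * plogis(fit$par[2])
p_ratio <- allele_ratio_freq(ref, alt)
```

Restricting the error rate to (0, 0.5) in this sketch avoids the mirrored solution in which the allele frequency and the error rate are both flipped, which would give the same likelihood.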

The filtering criteria currently implemented are

Minor allele frequency (MAF): SNPs are discarded if their MAF is less than the threshold (default is 0.01).
Proportion of missing data (MISS): SNPs are discarded if the proportion of individuals with no reads (i.e. a missing genotype) is greater than the threshold (default is 0.5).
Bin size for SNP selection (BIN): SNPs are binned together if the distance (in base pairs) between them is less than the threshold (default is 100). One SNP is then randomly selected from each bin and retained for the final analysis. This filter ensures that there is only one SNP on each sequence read.
Hardy-Weinberg distance (HW): SNPs are discarded if their Hardy-Weinberg distance is less than the first threshold (default is -0.05) or greater than the second threshold (default is Inf). This filtering criterion is taken from the KGD software (https://github.com/AgResearch/KGD).
Maximum average SNP read depth (MAXDEPTH): SNPs are discarded if the average read depth for the SNP is larger than the threshold (default is 500).
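
The threshold logic of the MAF, MISS, MAXDEPTH and BIN criteria listed above can be sketched on toy read counts as follows. This is a minimal illustration, not the package's implementation: the matrices, positions and variable names are hypothetical, the MAF is computed from the mean allele ratio, and SNPs are assumed to lie on one chromosome, sorted by position, with a new bin starting whenever the gap to the previous SNP reaches the BIN threshold. The HW criterion is left out because its formula is not given here.

```r
# Toy read counts (hypothetical): rows are individuals, columns are SNPs
ref <- cbind(c(3, 2, 0, 4), c(0, 0, 0, 2), c(5, 4, 6, 3), c(600, 700, 500, 400))
alt <- cbind(c(1, 2, 5, 0), c(0, 0, 0, 2), c(0, 0, 0, 0), c(500, 600, 700, 300))
depth <- ref + alt

# Thresholds mirroring the documented defaults
filter <- list(MAF = 0.01, MISS = 0.5, BIN = 100, MAXDEPTH = 500)

# Per-SNP summaries
p_hat    <- colMeans(ref / depth, na.rm = TRUE)  # mean allele ratio, 0/0 dropped
maf      <- pmin(p_hat, 1 - p_hat)               # minor allele frequency
miss     <- colMeans(depth == 0)                 # proportion with no reads
avgdepth <- colMeans(depth)                      # average read depth

# A SNP is retained only if it passes all three threshold criteria
keep <- maf >= filter$MAF & miss <= filter$MISS & avgdepth <= filter$MAXDEPTH

# BIN criterion on hypothetical sorted positions (base pairs):
# a new bin starts when the gap to the previous SNP is >= BIN
pos <- c(100, 150, 400, 450, 1000)
bin <- cumsum(c(TRUE, diff(pos) >= filter$BIN))

# Randomly retain one SNP index per bin
set.seed(1)
chosen <- vapply(split(seq_along(pos), bin),
                 function(i) i[sample.int(length(i), 1)], integer(1))
```

In this toy data only the first SNP passes the threshold criteria: the second has too many individuals without reads, the third is monomorphic, and the fourth exceeds the average depth limit.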

If filter = NULL, then no filtering is performed.
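
A custom set of thresholds can be built by starting from the documented defaults and overriding individual entries, as in the sketch below. The names `default_filter` and `my_filter` and the chosen threshold values are illustrative only. The full list is supplied because it is not stated here whether a partial list is accepted, and the calls to GUSbase are guarded so the snippet still runs when the package is not installed.

```r
# Documented default thresholds
default_filter <- list(MAF = 0.01, MISS = 0.5, BIN = 100,
                       HW = c(-0.05, Inf), MAXDEPTH = 500)

# Stricter MAF and missingness thresholds, other entries unchanged
my_filter <- utils::modifyList(default_filter, list(MAF = 0.05, MISS = 0.2))

if (requireNamespace("GUSbase", quietly = TRUE)) {
  file    <- GUSbase::simDS()
  RAfile  <- GUSbase::VCFtoRA(file$vcf)
  simdata <- GUSbase::readRA(RAfile)

  # UR population with the stricter thresholds
  urpop_strict <- GUSbase::makeUR(simdata, filter = my_filter)

  # UR population with no filtering at all
  urpop_raw <- GUSbase::makeUR(simdata, filter = NULL)
}
```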

Estimation of the allele frequencies when mafEst=TRUE is parallelized using OpenMP in compiled C code, where the number of threads used in the parallelization is specified by the argument nThreads.
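
One possible way to choose a value for nThreads is to leave one core free on the current machine, as sketched below. The variable names are hypothetical, and whether more threads actually help will depend on the data and on OpenMP support in the installed build.

```r
# Number of cores reported by the system (may be NA on some platforms)
cores <- parallel::detectCores()
if (is.na(cores)) cores <- 1L

# Leave one core free, but always use at least one thread
n_threads <- max(1L, cores - 1L)

# The value would then be passed on as, for example:
# urpop <- makeUR(simdata, mafEst = TRUE, nThreads = n_threads)
```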

Value

An R6 object of class UR.

Author(s)

Timothy P. Bilton and Ken G. Dodds

References

Bilton et al. (2018), Genetics (BibTeX key bilton2018genetics2 in the GUSbase package).

Examples

file <- simDS()
RAfile <- VCFtoRA(file$vcf)
simdata <- readRA(RAfile)

## make unrelated population
urpop <- makeUR(simdata)

tpbilton/GUSbase documentation built on March 8, 2024, 1:35 p.m.
