estGeno: Genotype estimation using a hiden Morkov model
In tomoyukif/GBScleanR: Error correction tool for noisy genotyping by sequencing (GBS) data

estGeno

R Documentation

Genotype estimation using a hiden Morkov model

Description

Clean up genotype data by error correction based on genotype estimation using a hidden Markov model.

Usage

estGeno(
  object,
  node = "raw",
  recomb_rate = 0.04,
  error_rate = 0.0025,
  call_threshold = 0.9,
  het_parent = FALSE,
  optim = TRUE,
  iter = 4,
  n_threads = 1,
  dummy_reads = 5,
  ...
)

## S4 method for signature 'GbsrGenotypeData'
estGeno(
  object,
  node,
  recomb_rate,
  error_rate,
  call_threshold,
  het_parent,
  optim,
  iter,
  n_threads,
  dummy_reads
)

Arguments

`object`	A GbsrGenotypeData object.
`node`	Either "raw" or "filt" to indicate whether raw or filtered read counts are used for genotype estimation. See `setCallFilter()` for the details of filtered read counts.
`recomb_rate`	A numeric value to indicate the expected recombination frequency per chromosome per megabase pairs.
`error_rate`	A numeric value of the expected sequence error rate.
`call_threshold`	A numeric value of the probability threshold to accept estimated genotype calls.
`het_parent`	A logical value to indicate whether parental samples are outbred or inbred. If FALSE, this function assume all true genotype of markers in parents are homozygotes.
`optim`	A logical value to specify whether to conduct parameter optimization for error correction.
`iter`	An integer value to specify the number of iterative parameter updates.
`n_threads`	An integer value to specify the number of threads used for the calculation. The default is 1 and if `n_threads = NULL`, automatically set half the number of available threads on the computer.
`dummy_reads`	An integer to specify the number of dummy reads to assign to dummy parental samples for genotype estimation. See details.
`...`	Unused.

Details

If you have not set parental samples by setParents() and initialized the scheme object using initScheme(), you have the scheme object without explicit parental information that is assumed to be a bi-parental population. In this case, estGeno() will run in the parentless mode. In the parentless mode, the algorithm assumes that the given population is a bi-parental population. The number of reference allele reads and the number of alternative allele reads of the dummy parents are set based on dummy_reads, respectively. Dummy parent 1 has dummy_reads of the reference allele reads and 0 alternative allele reads at all markers, while dummy parent 2 has 0 and dummy_reads of reference and alternative allele reads at all markers. If the parents of your population were outbred lines or you cannot assume one of the parents has completely reference homozygotes and another has laternative homozygotes at all markers, Set dummy_reads = 0 to leave uncertainty to estimate parental genotypes based on the offspring genotypes. Nevertheless, the parentless mode is less accurate and has more chance to get a genotype estimate randomly selected from the equally likely genotype estimates.

Value

A GbsrGenotypeData object in which the "estimated.haplotype", "corrected.genotype" and "parents.genotype" nodes were added.

Examples

# Load data in the GDS file and instantiate a [GbsrGenotypeData] object.
gds_fn <- system.file("extdata", "sample.gds", package = "GBScleanR")
gds <- loadGDS(gds_fn)

# Find the IDs of parental samples.
parents <- grep("Founder", getSamID(gds), value = TRUE)

# Set the parents and flip allele information
# if the reference sample (Founder1 in our case) has homozygous
# alternative genotype at some markers of which alleles will
# be swapped to make the reference sample have homozygous
# reference genotype.
gds <- setParents(gds, parents = parents)

# Initialize a scheme object stored in the slot of the GbsrGenotypeData.
# We chose `crosstype = "pair"` because two inbred founders were mated
# in our breeding scheme.
# We also need to specify the mating matrix which has two rows and
# one column with integers 1 and 2 indicating a sample (founder)
# with the memberID 1 and a sample (founder) with the memberID 2
# were mated.
gds <- initScheme(gds, mating = cbind(c(1:2)))

# Add information of the next cross conducted in our scheme.
# We chose 'crosstype = "selfing"', which do not require a
# mating matrix.
gds <- addScheme(gds, crosstype = "selfing")

# Execute error correction by estimating genotype and haplotype of
# founders and offspring.
gds <- estGeno(gds)

# Close the connection to the GDS file.
closeGDS(gds)

tomoyukif/GBScleanR documentation built on June 1, 2025, 10:13 p.m.

tomoyukif/GBScleanR index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

tomoyukif/GBScleanR
Error correction tool for noisy genotyping by sequencing (GBS) data

estGeno: Genotype estimation using a hiden Morkov model
In tomoyukif/GBScleanR: Error correction tool for noisy genotyping by sequencing (GBS) data

Genotype estimation using a hiden Morkov model

Description

Usage

Arguments

Details

Value

Examples

Related to estGeno in tomoyukif/GBScleanR...

R Package Documentation

Browse R Packages

We want your feedback!

tomoyukif/GBScleanR Error correction tool for noisy genotyping by sequencing (GBS) data

estGeno: Genotype estimation using a hiden Morkov model In tomoyukif/GBScleanR: Error correction tool for noisy genotyping by sequencing (GBS) data

Genotype estimation using a hiden Morkov model

Description

Usage

Arguments

Details

Value

Examples

Related to estGeno in tomoyukif/GBScleanR...

R Package Documentation

Browse R Packages

We want your feedback!

tomoyukif/GBScleanR
Error correction tool for noisy genotyping by sequencing (GBS) data

estGeno: Genotype estimation using a hiden Morkov model
In tomoyukif/GBScleanR: Error correction tool for noisy genotyping by sequencing (GBS) data