genoprob_to_snpprob: Convert genotype probabilities to SNP probabilities

genoprob_to_snpprob    R Documentation

Convert genotype probabilities to SNP probabilities

Description

For multi-parent populations, use founder genotypes at a set of SNPs to convert founder-based genotype probabilities to SNP genotype probabilities.

Usage

genoprob_to_snpprob(genoprobs, snpinfo)

Arguments

genoprobs

Genotype probabilities as calculated by calc_genoprob().

snpinfo

Data frame with SNP information with the following columns (the last three are generally derived with index_snps()):

  • chr - Character string or factor with chromosome

  • pos - Position (in the same units as the "map" attribute in genoprobs).

  • sdp - Strain distribution pattern: an integer between 1 and 2^n - 2, where n is the number of strains, whose binary encoding indicates the founder genotypes.

  • snp - Character string with SNP identifier (if missing, the rownames are used).

  • index - Indices that indicate equivalent groups of SNPs, calculated by index_snps().

  • interval - Indices that indicate the marker intervals in which the SNPs reside.

  • on_map - Indicates whether the SNP coincides with a marker in genoprobs.
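
To make the sdp encoding concrete, here is a minimal base-R sketch that unpacks a strain distribution pattern into one 0/1 indicator per founder. The helper name sdp_to_alleles is hypothetical (it is not part of qtl2), and the sketch assumes that the least-significant bit corresponds to the first founder and that a set bit marks the founders carrying the second SNP allele.

```r
# Hypothetical helper (not a qtl2 function): unpack an SDP integer into
# one 0/1 indicator per founder.
# Assumption: bit i (counting from the least-significant bit) belongs to
# founder i, and a set bit means the founder carries the second allele.
sdp_to_alleles <- function(sdp, n_founders) {
    powers <- 2^(seq_len(n_founders) - 1)   # 1, 2, 4, ..., 2^(n-1)
    as.integer((sdp %/% powers) %% 2)
}

# With 8 founders, an SDP of 9 = 1 + 8 flags founders 1 and 4
sdp_to_alleles(9, 8)
# 1 0 0 1 0 0 0 0 (under the bit-order assumption above)
```

An sdp of 0 or 2^n - 1 would mean that all founders share the same allele, so the SNP would be uninformative; this is why the valid range is 1 to 2^n - 2.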

Alternatively, snpinfo can be an object of class "cross2", as output by read_cross2(), containing the data for a multi-parent population with founder genotypes, in which case the SNP information for all markers with complete founder genotype data is calculated and then used. In this case, however, the genotype probabilities must be at the markers in the cross.

Details

We first split the SNPs by chromosome and use snpinfo$index to subset to non-equivalent SNPs. snpinfo$interval indicates the intervals in the genotype probabilities that contain each SNP. For SNPs contained within an interval, we use the average of the probabilities for the two endpoints. We then collapse the probabilities according to the strain distribution pattern.
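
The averaging and collapsing steps can be sketched in base R for a single individual and a single SNP. This is an illustration of the idea only, not the package's implementation: the probability values are made up, the labels "A" and "B" are illustrative, and the sketch assumes 8 founders and that the least-significant bit of the sdp corresponds to the first founder.

```r
# Made-up founder allele probabilities (8 founders) for one individual
# at the two markers flanking a SNP
left  <- c(0.50, 0.50, 0.00, 0, 0, 0, 0, 0)
right <- c(0.50, 0.25, 0.25, 0, 0, 0, 0, 0)

# SNP lies inside the interval: average the two endpoints
avg <- (left + right) / 2

# Assumed SDP of 9: founders 1 and 4 carry the second allele
sdp <- 9
has_b <- (sdp %/% 2^(0:7)) %% 2 == 1

# Collapse the 8 founder probabilities to 2 SNP allele probabilities
snp_prob <- c(A = sum(avg[!has_b]), B = sum(avg[has_b]))
snp_prob
```

Because the founder probabilities sum to 1 at each position, the collapsed SNP probabilities also sum to 1.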

Value

An object of class "calc_genoprob", like the input genoprobs, but with imputed genotype probabilities at the selected SNPs indicated in snpinfo$index. See calc_genoprob().

If the input genoprobs is for allele probabilities, the probs output has just two probability columns (for the two SNP alleles). If the input has a full set of n(n+1)/2 probabilities for n strains, the probs output has 3 probabilities (for the three SNP genotypes). If the input has full genotype probabilities for the X chromosome (n(n+1)/2 genotypes for the females followed by n hemizygous genotypes for the males), the output has 5 probabilities: the 3 female SNP genotypes followed by the two male hemizygous SNP genotypes.
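
As a rough illustration of how n(n+1)/2 genotype probabilities reduce to 3 SNP genotype probabilities, the base-R sketch below enumerates the unordered founder pairs for n = 8 and groups them by the number of copies of the second SNP allele. The pair ordering, the uniform probabilities, and the labels "AA", "AB", "BB" are assumptions made for the example and need not match the column order or names used by qtl2.

```r
n <- 8
sdp <- 9                              # assumed: founders 1 and 4 carry allele B
b <- (sdp %/% 2^(0:(n - 1))) %% 2     # 0/1 indicator per founder

# All unordered founder pairs (i <= j): n(n+1)/2 = 36 genotypes
pairs <- which(upper.tri(diag(n), diag = TRUE), arr.ind = TRUE)

# Number of B alleles carried by each founder-pair genotype (0, 1, or 2)
n_b <- b[pairs[, 1]] + b[pairs[, 2]]

# Made-up uniform genotype probabilities for one individual at one position
geno_prob <- rep(1 / nrow(pairs), nrow(pairs))

# Collapse the 36 genotype probabilities to the 3 SNP genotypes
snp_geno <- tapply(geno_prob,
                   factor(n_b, levels = 0:2, labels = c("AA", "AB", "BB")),
                   sum)
snp_geno
```

For the X chromosome in males, the same grouping would apply to the n hemizygous genotypes, giving two additional probabilities.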

See Also

index_snps(), calc_genoprob(), scan1snps()

Examples

## Not run: 
# load example data and calculate genotype probabilities
file <- paste0("https://raw.githubusercontent.com/rqtl/",
               "qtl2data/main/DO_Recla/recla.zip")
recla <- read_cross2(file)
recla <- recla[c(1:2,53:54), c("19","X")] # subset to 4 mice and 2 chromosomes
probs <- calc_genoprob(recla, error_prob=0.002)

# founder genotypes for a set of SNPs
snpgeno <- rbind(m1=c(3,1,1,3,1,1,1,1),
                 m2=c(1,3,1,3,1,3,1,3),
                 m3=c(1,1,1,1,3,3,3,3),
                 m4=c(1,3,1,3,1,3,1,3))
sdp <- calc_sdp(snpgeno)
snpinfo <- data.frame(chr=c("19", "19", "X", "X"),
                      pos=c(40.36, 40.53, 110.91, 111.21),
                      sdp=sdp,
                      snp=c("m1", "m2", "m3", "m4"), stringsAsFactors=FALSE)

# identify groups of equivalent SNPs
snpinfo <- index_snps(recla$pmap, snpinfo)

# collapse to SNP genotype probabilities
snpprobs <- genoprob_to_snpprob(probs, snpinfo)

# could also first convert to allele probs
aprobs <- genoprob_to_alleleprob(probs)
snpaprobs <- genoprob_to_snpprob(aprobs, snpinfo)

## End(Not run)


rqtl/qtl2 documentation built on Nov. 28, 2024, 4:57 a.m.