locateParalogues: Match paralogs with chromosomes


View source: R/beadarrayMSV.R

Description

Matches patterns of parentally inherited alleles among half-siblings between MSV-5 paralogs and genetic map SNPs. The matches for each MSV-5 marker are summed so that the paralogs can be mapped to individual chromosomes.

Usage

locateParalogues(BSSnp, paraCalls, inheritP, offspringLim = 7,
    ratioLim = 0.9)

plotCountsChrom(chromHits, markers = 1:16, ...)

Arguments

BSSnp

"AlleleSetIllumina" (or "MultiSet") object containing only SNP markers, with an assayData entry “call” (see callGenotypes) and a phenoData column “PedigreeID”. The latter contains strings <p><mmm><fff><oo>, where “p”, “mmm”, “fff”, and “oo” are unique identifiers for population, mother, father, and individual within full-sib group, respectively. “000” means founding parent, whereas “999” means unknown parent. The object must also contain a featureData column “Chr.Name”.
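As an illustration of the “PedigreeID” layout, the sketch below splits such strings into their four fields. The helper parsePedigreeID is hypothetical and not part of beadarrayMSV. The field widths (1, 3, 3, 2) are an assumption read off the <p><mmm><fff><oo> pattern, and the IDs are made up.

```r
# Hypothetical helper (not part of beadarrayMSV): split PedigreeID strings
# of the form <p><mmm><fff><oo> into fields, assuming fixed widths 1/3/3/2.
parsePedigreeID <- function(id) {
  stopifnot(is.character(id), all(nchar(id) == 9))
  data.frame(
    population = substr(id, 1, 1),
    mother     = substr(id, 2, 4),
    father     = substr(id, 5, 7),
    individual = substr(id, 8, 9),
    stringsAsFactors = FALSE
  )
}

# Made-up IDs: a founding parent, and an offspring with an unknown father
ped <- parsePedigreeID(c("100000001", "100199901"))

# "000" marks a founding parent, "999" an unknown parent
ped$founder       <- ped$mother == "000" & ped$father == "000"
ped$unknownParent <- ped$mother == "999" | ped$father == "999"
```

Treating an individual as a founder only when both parent fields are “000” is also an assumption made for this sketch.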

paraCalls

List with two matrix elements “father” and “mother” containing paralogue calls representing paternal and maternal inherited alleles in offspring, respectively (see unmixParalogues)

inheritP

List with two matrix elements “father” and “mother” containing genetic map marker calls representing paternal and maternal inherited alleles in offspring, respectively (see resolveInheritanceSNP)

offspringLim

In order for a match between a paralogue and a chromosome to be detected, the number of (informative) half-siblings must equal or exceed this numeric value (see setMergeOptions)

ratioLim

The patterns of paternal and maternal inherited alleles among half-sib family offspring are compared between MSV-5 paralogs and genetic map SNPs. The ratio of matching allele patterns between the two must equal or exceed this numeric value in order for a chromosome match to be detected (see setMergeOptions)

chromHits

A numeric array of size (markers x chromosomes x 2) with the average number of matches per chromosome for mothers and fathers separately. Part of the output from locateParalogues.

markers

Index to subset of MSV-5 markers to plot

...

Additional arguments to axis, to be used on the x-axes

Details

The individual paralogs in paraCalls are associated with the genetic map markers in inheritP. A matching offspring is registered each time an informative allele in the paralogue corresponds with an informative allele in the mapped marker, and the degree of association between the two is determined by counting the number of matches. It is not known whether an “A”-allele in the paralogue matches an “A”- or a “B”-allele in the tested marker, so the combination that produces the highest number of matches is assumed. This means that, for two unlinked loci, a given number of random mismatches is as probable as the same number of random matches. The chance of linkage being falsely declared between two loci, however, decreases as the number and ratio of matches increase. Associations supported by too few informative meioses are therefore filtered out.
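The counting rule described above can be sketched for a single paralogue and a single mapped marker. This is a minimal illustration, not the package's implementation: the helper countMatches is hypothetical, and coding alleles as "A"/"B" with NA for uninformative meioses is an assumption.

```r
# Hypothetical helper (not the beadarrayMSV implementation): compare one
# paralogue with one mapped marker across half-sib offspring.
# Assumed coding: "A"/"B" for informative alleles, NA for uninformative ones.
countMatches <- function(para, marker, offspringLim = 7, ratioLim = 0.9) {
  informative <- !is.na(para) & !is.na(marker)
  n <- sum(informative)
  same <- sum(para[informative] == marker[informative])
  # Phase is unknown, so take the better of "A pairs with A" and "A pairs with B"
  best <- max(same, n - same)
  # Keep only associations with enough informative meioses and a high match ratio
  detected <- n >= offspringLim && best / n >= ratioLim
  list(nInformative = n, nMatches = best, detected = detected)
}

# Made-up calls for eight offspring; the sixth is uninformative
para   <- c("A", "B", "A", "A", "B", NA,  "B", "A")
marker <- c("B", "A", "B", "B", "A", "A", "A", "B")
res <- countMatches(para, marker)
```

In this made-up example every informative offspring shows the opposite allele label, which counts as a full set of matches because the phase is unknown.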

There is a 50% chance of inheriting either allele (“A” or “B”) at any segregating locus, which means that a single match is produced by chance 50% of the time. This gives, for instance, a 6/2^5 (19%) probability that the alleles of two unlinked loci will match for at least four out of five offspring. Also, as we cannot tell mismatches from matches, the probability of a false detection is doubled. As such a filter would yield far too many false positives, we need to reduce the probability of random associations further. The default filter counts only markers with at least offspringLim=7 informative meioses and at least ratioLim=90% matches/mismatches to the paralogue. With ten informative meioses, for example, this threshold implies that a random false positive match will occur in 2*11/2^10 (2.1%) of the tests. The total number of matches across markers within each chromosome is divided by the number of tested markers, such that the chromosomes with the highest average number of matches can be found.
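The two probabilities quoted above can be checked with the binomial distribution in base R. The helper name pAtLeast is introduced here for illustration only. The second figure corresponds to at least nine matches among ten informative meioses, doubled because the phase is unknown.

```r
# Probability that at least k of n offspring match by chance (p = 0.5 each)
pAtLeast <- function(k, n) sum(dbinom(k:n, size = n, prob = 0.5))

p4of5  <- pAtLeast(4, 5)        # equals 6/2^5
p9of10 <- 2 * pAtLeast(9, 10)   # equals 2*11/2^10, doubled for unknown phase
```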

The plots produced by plotCountsChrom visualize the average scores produced by locateParalogues. A red line (fathers) and a black line (mothers) are plotted for each MSV-5 marker, with one or two peaks indicating the chromosome(s) to which the paralogs map.

Value

The function locateParalogues returns a list with elements

cPerMarker

A numeric array of size (markers x chromosomes x 2) with the average number of matches per chromosome for mothers and fathers separately

nCountsTot

Matrix of size (markers x 2) with the total sum of matches per marker for mothers and fathers


plotCountsChrom is used for its side effects
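As a complement to the plots, the chromosome with the highest average score can be read directly from an array shaped like cPerMarker. The sketch below uses a small made-up array. Its dimnames, including the order of “mother” and “father” along the third dimension, are assumptions for illustration and may differ from what locateParalogues returns.

```r
# Made-up stand-in for chromHits$cPerMarker (markers x chromosomes x 2);
# the dimnames, including the mother/father order, are assumptions.
cPerMarker <- array(
  0, dim = c(2, 3, 2),
  dimnames = list(c("msv1", "msv2"),
                  c("chr1", "chr2", "chr3"),
                  c("mother", "father"))
)
cPerMarker["msv1", "chr2", "mother"] <- 1.5
cPerMarker["msv1", "chr2", "father"] <- 1.2
cPerMarker["msv2", "chr3", "mother"] <- 0.8
cPerMarker["msv2", "chr1", "father"] <- 0.9

# For each marker and parent, name the chromosome with the highest average
chromNames <- dimnames(cPerMarker)[[2]]
bestChrom <- apply(cPerMarker, c(1, 3), function(x) chromNames[which.max(x)])
```

Here bestChrom is a markers x 2 character matrix. A marker whose maternal and paternal columns name different chromosomes would correspond to two peaks in the plot.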

Author(s)

Lars Gidskehaug

See Also

plotCountsChrom, setMergeOptions, unmixParalogues, resolveInheritanceSNP, MultiSet, AlleleSetIllumina, assignParalogues

Examples

## Not run: 
#Read markers into an AlleleSetIllumina object
rPath <- system.file("extdata", package="beadarrayMSV")
normOpts <- setNormOptions()
dataFiles <- makeFilenames('testdata',normOpts,rPath)
beadFile <- paste(rPath,'beadData_testdata.txt',sep='/')
beadInfo <- read.table(beadFile,sep='\t',header=TRUE,as.is=TRUE)
BSRed <- createAlleleSetFromFiles(dataFiles[1:4],beadInfo=beadInfo)

#Genotype calling and splitting of MSV-5 paralogs
BSRed <- callGenotypes(BSRed)
BSRed <- validateCallsPedigree(BSRed)
iMSV5 <- fData(BSRed)$Classification %in% 'MSV-5' &
    fData(BSRed)$Ped.Errors %in% 0
paraCalls <- unmixParalogues(BSRed[iMSV5,])

#Genetic map SNPs and inherited parental alleles
iSNP <- fData(BSRed)$Classification %in% 'SNP' &
    !is.na(fData(BSRed)$Chromosome)
inheritP <- resolveInheritanceSNP(BSRed[iSNP,])

#Match paralogs with map
chromHits <- locateParalogues(BSRed[iSNP,],paraCalls,inheritP)

#The example data and map are too small to detect most homeologies
plotCountsChrom(chromHits$cPerMarker,1:sum(iMSV5),at=1:15,
    labels=dimnames(chromHits$cPerMarker)[[2]],las=2)

## End(Not run)

beadarrayMSV documentation built on May 1, 2019, 6:33 p.m.