assignParalogues: Assign MSV-5 paralogs to chromosomes


View source: R/beadarrayMSV.R

Description

Based on linkage information and a set of MSV-5 markers which have been split into individual paralogs within half-sib families, this function attempts to map the paralogs to their respective chromosomes and name them accordingly.

Usage

setMergeOptions(minC = NULL, noiseQuantile = 0.75,
    offspringLim = 7, ratioLim = 0.9, rngLD = 5)

assignParalogues(BSSnp, BSRed,
    paraCalls = unmixParalogues(BSRed, singleCalls),
    inheritP = resolveInheritanceSNP(BSSnp),
    singleCalls = getSingleCalls(BSRed),
    cHits = locateParalogues(BSSnp, paraCalls, inheritP,
        mO$offspringLim, mO$ratioLim)$cPerMarker,
    mO = setMergeOptions())

Arguments

minC

A numeric threshold on the elements of cHits. Chromosomes scoring below this value are not detected. If NULL, it is estimated from the data (see noiseQuantile)

noiseQuantile

The quantile of the third-highest chromosome scores across markers, from which minC is estimated when it is not supplied

offspringLim

In order for a match between a paralogue and a chromosome to be detected, the number of (informative) half-siblings must equal or exceed this numeric value

ratioLim

The patterns of paternal and maternal inherited alleles among half-sib family offspring are compared between MSV-5 paralogs (see unmixParalogues) and genetic map SNPs (see resolveInheritanceSNP). The ratio of matching allele patterns between the two must equal or exceed this numeric value in order for a chromosome match to be detected

rngLD

Numeric indicating how many map-units (e.g. cM) to include on each side of the genetic map marker to increase the number of informative meioses and the power of the associations with the paralogs.

BSSnp

"AlleleSetIllumina" (or "MultiSet") object containing SNPs of known location on the chromosomes, including an assayData entry “call”. A phenoData column “PedigreeID” must be included, of the form <p><mmm><fff><oo>, identifying the population, mother, father and individual offspring, respectively. The featureData variables “Chromosome”, “Female”, and “Male” give the numbered chromosome and the genetic distances on the female and male map, respectively

BSRed

"AlleleSetIllumina" (or "MultiSet") object containing the MSV-5s to be mapped, with a required assayData-list entry “call”. Must contain the same samples as BSSnp, and also a phenoData column “PedigreeID”

paraCalls

List containing two matrices, “mother” and “father”, with the parental inherited alleles of individual paralogs assuming unknown alternate parent (see unmixParalogues)

inheritP

List containing two matrices, “mother” and “father”, with the parental inherited alleles for the markers in BSSnp (see resolveInheritanceSNP)

singleCalls

Matrix containing MSV-5s for which both paralogs are either monomorphic or polymorphic (see getSingleCalls)

cHits

A three-dimensional array of size (markers x chromosomes x 2) containing the average number of matches of a paralogue to a chromosome, for both the mothers and the fathers (averaged across the number of markers in the map for that chromosome; see locateParalogues)

mO

List with options used in the mapping of paralogs (see setMergeOptions)
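
The PedigreeID convention required for BSSnp and BSRed can be illustrated with a small base-R sketch. It assumes that the placeholder <p><mmm><fff><oo> is to be read literally as fixed widths of 1, 3, 3 and 2 characters. The helper parsePedigreeID and the example IDs are hypothetical and not part of beadarrayMSV.

```r
# Hypothetical helper (not part of beadarrayMSV): split PedigreeID strings,
# assuming fixed widths 1-3-3-2 as suggested by <p><mmm><fff><oo>
parsePedigreeID <- function(id) {
  id <- as.character(id)
  stopifnot(all(nchar(id) == 9))
  data.frame(
    population = substr(id, 1, 1),
    mother     = substr(id, 2, 4),
    father     = substr(id, 5, 7),
    offspring  = substr(id, 8, 9),
    stringsAsFactors = FALSE
  )
}

# Made-up IDs: two offspring of mother 001, one of mother 003, all by father 002
ped <- parsePedigreeID(c("100100201", "100100202", "100300201"))

# Half-sib families share one parent: group offspring by mother or by father
maternalFamilies <- split(ped$offspring, ped$mother)
paternalFamilies <- split(ped$offspring, ped$father)
```

Grouping by the mother or the father field in this way gives the maternal and paternal half-sib families within which the inherited alleles are compared.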

Details

While the function locateParalogues allows for matching of paralogs to any chromosome, assignParalogues uses the output of locateParalogues and limits the allowed choices to one or two chromosomes. The paralogs are given names reflecting these chromosomes, which allows the linkage information in paraCalls to be merged into a single, much more informative data table.

Initially, the larger of the “mother” and “father” values in cHits is chosen, and the resulting scores are sorted in decreasing order among chromosomes, one marker at a time. Up to two of the highest scoring chromosomes are selected if their values exceed mO$minC. If this element is NULL, it is estimated as the mO$noiseQuantile'th quantile of the third-highest ranking chromosomes across markers. In addition, the second ranking chromosome is not selected unless it scores twice as high as the third ranking chromosome. Using the maternal and paternal half-sib families in turn, each paralogue is mapped to either of the selected chromosomes if sufficient association is detected.
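
The candidate selection described above can be sketched in base R on a toy cHits array. This is only an illustration under stated assumptions, not the package's code: the helper selectCandidates is hypothetical, the scores are made up, "exceed" is read as a strict inequality, and "twice as high" is read as at least twice the third-ranking score.

```r
# Toy cHits: 3 markers x 4 chromosomes x 2 parents (values are made up)
cHits <- array(0, dim = c(3, 4, 2),
               dimnames = list(paste0("m", 1:3), paste0("chr", 1:4),
                               c("mother", "father")))
cHits["m1", , "mother"] <- c(8, 1, 6, 1)
cHits["m1", , "father"] <- c(2, 1, 7, 2)
cHits["m2", , "mother"] <- c(1, 9, 2, 3)
cHits["m2", , "father"] <- c(1, 5, 1, 1)
cHits["m3", , "mother"] <- c(1, 1, 1, 2)
cHits["m3", , "father"] <- c(1, 2, 1, 1)

# Hypothetical helper (not part of beadarrayMSV)
selectCandidates <- function(cHits, minC = NULL, noiseQuantile = 0.75) {
  # Larger of the maternal and paternal score per marker and chromosome
  score <- apply(cHits, c(1, 2), max)
  # Scores sorted in decreasing order within each marker (markers in rows)
  ranked <- t(apply(score, 1, sort, decreasing = TRUE))
  # Estimate the noise threshold from the third-ranking scores if not given
  if (is.null(minC)) {
    minC <- quantile(ranked[, 3], noiseQuantile, names = FALSE)
  }
  res <- lapply(seq_len(nrow(score)), function(i) {
    s <- score[i, order(score[i, ], decreasing = TRUE)]
    keep <- unname(c(s[1] > minC,
                     s[2] > minC && s[2] >= 2 * s[3]))
    names(s)[1:2][keep]
  })
  setNames(res, rownames(score))
}

candidates <- selectCandidates(cHits)
candidates
```

With these toy values, marker m1 keeps two candidate chromosomes, m2 keeps one because its second-ranking score is less than twice the third, and m3 keeps none because no score exceeds the estimated threshold.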

For each half-sib family and each (informative) paralogue, only genetic map markers for which the parent in question is heterozygous are useful. This reduces the number of genetic map markers to which the paralogs can be associated. Similarly, the parental inherited alleles in each paralogue are known for only a subset of the half-siblings. This tends to reduce the number of informative offspring in each family drastically. Missing parental alleles among the genetic map markers further reduce the number of informative offspring. However, these may sometimes be imputed using neighbouring markers assumed to be in linkage disequilibrium (LD) with the marker in question. The option rngLD indirectly controls the number of helping markers to use.
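
As a rough illustration of how rngLD bounds the set of helping markers, the sketch below picks the markers lying within rngLD map units of a focal marker on the same chromosome. The helper helperMarkers and the toy map are hypothetical, the window is assumed to be inclusive at its boundaries, and the way beadarrayMSV actually imputes alleles from these markers is not shown.

```r
# Toy genetic map (values are made up); the column names follow the
# featureData variables described for BSSnp
map <- data.frame(
  Chromosome = c(1, 1, 1, 1, 2),
  Female     = c(0, 3, 7, 14, 4),
  row.names  = paste0("snp", 1:5)
)

# Hypothetical helper (not part of beadarrayMSV): markers on the same
# chromosome within rngLD map units of the focal marker, on the chosen map
helperMarkers <- function(map, focal, rngLD = 5, pos = "Female") {
  sameChrom <- map$Chromosome == map[focal, "Chromosome"]
  near <- abs(map[[pos]] - map[focal, pos]) <= rngLD
  setdiff(rownames(map)[sameChrom & near], focal)
}

wide   <- helperMarkers(map, "snp2")             # default window of 5 units
narrow <- helperMarkers(map, "snp2", rngLD = 2)  # narrower window
```

A larger rngLD admits more neighbouring markers and hence more chances to fill in a missing parental allele, at the cost of a weaker LD assumption for the more distant markers.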

The mapping itself proceeds by applying the filter defined in mO to the genetic map markers on a specific chromosome (see locateParalogues for specifics about the filter). A set of statistics is then calculated to find the marker that matches the chromosome most closely. If there are two candidate chromosomes, the one with the highest ranked marker is selected. If there is only one candidate, it is selected if it outranks all the other chromosomes in terms of the calculated statistics. If a successful match is found, the parental inherited alleles for that family are assigned to the paralogue whose name reflects the chromosome match.
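
The roles of mO$offspringLim and mO$ratioLim in this filter can be illustrated with a minimal sketch that compares the inherited-allele pattern of one paralogue with that of one genetic map SNP within a single half-sib family. The helper matchRatio and the allele vectors are hypothetical. The sketch ignores details such as linkage phase and the statistics computed by locateParalogues.

```r
# Hypothetical helper (not part of beadarrayMSV). NA marks offspring for
# which the parental inherited allele is unknown.
matchRatio <- function(paralogue, snp, offspringLim = 7, ratioLim = 0.9) {
  # Only offspring with a known allele in both patterns are informative
  informative <- !is.na(paralogue) & !is.na(snp)
  n <- sum(informative)
  if (n < offspringLim) {
    return(list(n = n, ratio = NA_real_, hit = FALSE))
  }
  # Proportion of informative offspring with matching inherited alleles
  ratio <- mean(paralogue[informative] == snp[informative])
  list(n = n, ratio = ratio, hit = ratio >= ratioLim)
}

# Made-up inherited alleles for ten half-siblings
para <- c("A", "B", "A", "A", "B", "A", "B", "A", NA, "B")
snp  <- c("A", "B", "A", "A", "B", "A", "B", "B", "A", NA)

strict  <- matchRatio(para, snp)                   # default limits
relaxed <- matchRatio(para, snp, ratioLim = 0.85)  # lower ratio limit
```

Here eight offspring are informative and seven of them agree, so the pair passes the offspring limit but fails the default ratio limit of 0.9, while it passes with the lower limit of 0.85.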

Value

A list containing

x

a matrix holding the calls for those paralogs that are successfully mapped to a chromosome. The rownames reflect the chromosome as well as the marker-name

chromPairs

a matrix with 0, 1, or 2 chromosomes to which the MSV-5s have been successfully mapped

positionFemale

a matrix holding the mapped paralogue positions as estimated by the female parent half-sib families

positionMale

a matrix holding the mapped paralogue positions as estimated by the male parent half-sib families

Note

This function may be time consuming, and even more so if many of the input parameters need to be recalculated each time. If some of them are already available in the workspace, save time by including them in the function call.

Author(s)

Lars Gidskehaug

See Also

plotCountsChrom, setMergeOptions, unmixParalogues, resolveInheritanceSNP, MultiSet, AlleleSetIllumina, locateParalogues, getSingleCalls

Examples

## Not run: 
#Read markers into an AlleleSetIllumina object
rPath <- system.file("extdata", package="beadarrayMSV")
normOpts <- setNormOptions()
dataFiles <- makeFilenames('testdata',normOpts,rPath)
beadFile <- paste(rPath,'beadData_testdata.txt',sep='/')
beadInfo <- read.table(beadFile,sep='\t',header=TRUE,as.is=TRUE)
BSRed <- createAlleleSetFromFiles(dataFiles[1:4],beadInfo=beadInfo)

#Genotype calling and splitting of MSV-5 paralogs
BSRed <- callGenotypes(BSRed)
BSRed <- validateCallsPedigree(BSRed)
iMSV5 <- fData(BSRed)$Classification %in% 'MSV-5' &
    fData(BSRed)$Ped.Errors %in% 0
singleCalls <- getSingleCalls(BSRed[iMSV5,])
paraCalls <- unmixParalogues(BSRed[iMSV5,],singleCalls)

#Genetic map SNPs and inherited parental alleles
iSNP <- fData(BSRed)$Classification %in% 'SNP' &
    !is.na(fData(BSRed)$Chromosome)
inheritP <- resolveInheritanceSNP(BSRed[iSNP,])

#Match paralogs with map
mO <- setMergeOptions(minC=1)
chromHits <- locateParalogues(BSRed[iSNP,],paraCalls,
   inheritP,mO$offspringLim,mO$ratioLim)

#The example data and map are too small to detect most homeologies
plotCountsChrom(chromHits$cPerMarker,1:sum(iMSV5),at=1:15,
   labels=dimnames(chromHits$c)[[2]],las=2)

#Only a few single paralogs are successfully assigned to chromosomes
mergedCalls <- assignParalogues(BSRed[iSNP,],BSRed[iMSV5,],paraCalls,
   inheritP,singleCalls,cHits=chromHits$cPerMarker,mO=mO)
print(mergedCalls$chromPairs)
print(mergedCalls$x[,1:4])

## End(Not run)

beadarrayMSV documentation built on May 1, 2019, 6:33 p.m.