Based on linkage information and a set of MSV-5 markers which have been split into individual paralogs within half-sib families, this function attempts to map the paralogs to their respective chromosomes and name them accordingly.
setMergeOptions(minC = NULL, noiseQuantile = 0.75,
                offspringLim = 7, ratioLim = 0.9, rngLD = 5)

assignParalogues(BSSnp, BSRed,
                 paraCalls = unmixParalogues(BSRed, singleCalls),
                 inheritP = resolveInheritanceSNP(BSSnp),
                 singleCalls = getSingleCalls(BSRed),
                 cHits = locateParalogues(BSSnp, paraCalls, inheritP,
                     mO$offspringLim, mO$ratioLim)$cPerMarker,
                 mO = setMergeOptions())
minC |
A numeric value corresponding to the elements of |
noiseQuantile |
The quantile of the third largest chromosomes across markers from
which |
offspringLim |
In order for a match between a paralogue and a chromosome to be detected, the number of (informative) half-siblings must equal or exceed this numeric value |
ratioLim |
The patterns of paternal and maternal inherited alleles among
half-sib family offspring are compared between MSV-5 paralogs (see
|
rngLD |
Numeric indicating how many map-units (e.g. cM) to include on each side of the genetic map marker to increase the number of informative meioses and the power of the associations with the paralogs. |
BSSnp |
|
BSRed |
|
paraCalls |
List containing two matrices, “mother” and “father”, with the
parental inherited alleles of individual paralogs assuming unknown
alternate parent (see |
inheritP |
List containing two matrices, “mother” and “father”, with the
parental inherited alleles for the markers in |
singleCalls |
Matrix containing MSV-5s for which both paralogs are either
monomorphic or polymorphic (see |
cHits |
A three-dimensional array of size (markers x chromosomes x 2)
containing the average number of matches of a paralogue to a
chromosome for both the mothers and fathers (averaged across the
number of markers in the map for that chromosome; see
|
mO |
List with options used in the mapping of paralogs (see
|
While the function locateParalogues allows for matching of paralogs
to any chromosome, assignParalogues uses the output of the former
and limits the allowed choices to one or two chromosomes. The
paralogs are given names reflecting these chromosomes, which allows
the linkage information in paraCalls to be merged into a single,
much more informative data table.
Initially, the larger of the “mother” and “father” values in cHits
is chosen, and the resulting scores are sorted in decreasing order
among chromosomes, one marker at a time. Up to two of the
highest-scoring chromosomes are selected if their values exceed
mO$minC. If this element is NULL, it is estimated as the
mO$noiseQuantile quantile of the third-highest-ranking chromosome
scores across markers. Also, the second-ranking chromosome will not
be selected unless it scores twice as high as the third-ranking
chromosome. Using the maternal and paternal half-sib families in
turn, each paralogue is mapped to either of the selected chromosomes
if sufficient association is detected.
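The candidate selection described above can be illustrated with a short base R sketch. This is not the package's implementation: the function name selectChromosomes, the toy cHits values, and the exact comparison operators (strictly greater than minC, at least twice the third score) are assumptions made for illustration only.

```r
# Toy cHits array (hypothetical values): 3 markers x 5 chromosomes x 2 parents
cHits <- array(0, dim = c(3, 5, 2),
               dimnames = list(paste0("m", 1:3), paste0("chr", 1:5),
                               c("mother", "father")))
cHits["m1", , "mother"] <- c(9, 1, 6, 2, 0)
cHits["m1", , "father"] <- c(4, 1, 8, 1, 1)
cHits["m2", , "mother"] <- c(1, 7, 2, 3, 1)
cHits["m2", , "father"] <- c(0, 5, 1, 3, 2)
cHits["m3", , "mother"] <- c(1, 1, 2, 1, 1)
cHits["m3", , "father"] <- c(2, 1, 1, 1, 0)

# Hypothetical helper, not part of beadarrayMSV
selectChromosomes <- function(cHits, minC = NULL, noiseQuantile = 0.75) {
  # Larger of the mother and father scores per marker and chromosome
  best <- apply(cHits, c(1, 2), max)
  # Scores and chromosome names sorted decreasingly, one marker per row
  ranked <- t(apply(best, 1, function(x) unname(sort(x, decreasing = TRUE))))
  rankedChr <- t(apply(best, 1, function(x) {
    colnames(best)[order(x, decreasing = TRUE)]
  }))
  # Estimate the noise threshold from the third-ranking scores if not given
  if (is.null(minC)) {
    minC <- quantile(ranked[, 3], noiseQuantile, names = FALSE)
  }
  # Assumed comparisons: strictly above minC; second at least 2 x third
  keep1 <- ranked[, 1] > minC
  keep2 <- keep1 & ranked[, 2] > minC & ranked[, 2] >= 2 * ranked[, 3]
  pairs <- rankedChr[, 1:2, drop = FALSE]
  pairs[!keep1, 1] <- NA
  pairs[!keep2, 2] <- NA
  pairs
}

chromPairs <- selectChromosomes(cHits)
print(chromPairs)
```

With these toy values, marker m1 keeps two candidate chromosomes, m2 keeps one (its second-ranking score is less than twice the third), and m3 keeps none.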
For each half-sib family and each (informative) paralogue, only
genetic map markers for which the parent in question is heterozygous
are useful. This reduces the number of genetic map markers with which
the paralogs can be associated. Similarly, the parental inherited
alleles in each paralogue are known for only a subset of the
half-siblings. This tends to reduce the number of informative
offspring in each family drastically. Missing parental alleles among
the genetic map markers further reduce the number of informative
offspring; however, these may sometimes be imputed using neighbouring
markers assumed to be in linkage disequilibrium (LD) with the marker
in question. The option rngLD indirectly controls the number of
helping markers to use.
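To make the role of rngLD concrete, the sketch below picks the helping markers within rngLD map units of a focal marker and counts how many offspring become informative once those neighbours are considered. The marker names, map positions, allele coding, and the helper function helpers are hypothetical, and the package's actual imputation may well work differently.

```r
# Hypothetical map positions (cM) for markers on one chromosome
pos <- c(snpA = 0, snpB = 3, snpC = 6, snpD = 12, snpE = 20)

# Hypothetical helper: markers within rngLD map units on each side of focal
helpers <- function(pos, focal, rngLD = 5) {
  names(pos)[abs(pos - pos[focal]) <= rngLD & names(pos) != focal]
}

# Toy matrix of parental inherited alleles (offspring x markers), NA = unknown
inherited <- matrix(
  c( 1, NA,  1,  2,  2,
     2,  2,  2, NA,  1,
    NA, NA, NA,  1,  1,
    NA, NA,  2,  2, NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("off", 1:4), names(pos)))

focal <- "snpB"
window <- c(focal, helpers(pos, focal, rngLD = 5))

# Offspring informative at the focal marker alone
nDirect <- sum(!is.na(inherited[, focal]))
# Offspring informative at the focal marker or at any helping marker
nWithHelpers <- sum(rowSums(!is.na(inherited[, window, drop = FALSE])) > 0)

print(window)
print(c(direct = nDirect, withHelpers = nWithHelpers))
```

A larger rngLD widens the window and so tends to raise the count of informative offspring, at the cost of relying on markers that are less tightly linked to the focal one.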
The mapping itself proceeds by applying the filter defined in mO to
the genetic map markers on a specific chromosome (see
locateParalogues for specifics about the filter). A set of
statistics is then calculated to find the marker that matches the
chromosome most closely. If there are two candidate chromosomes, the
one with the highest-ranked marker is selected. If there is only one
candidate, it is selected if it outranks all the other chromosomes in
terms of the calculated statistics. If a successful match is found,
the parental inherited alleles for that family are assigned to the
paralogue whose name reflects the chromosome match.
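As a rough illustration of how a filter based on offspringLim and ratioLim might behave, the sketch below compares the inherited-allele pattern of one paralogue with that of one map marker within a half-sib family. The function matchParalogue, the allele coding, and in particular the definition of the ratio (the larger of the agreeing and disagreeing fractions, since linkage phase is taken as unknown) are assumptions; the filter actually used is documented with locateParalogues.

```r
# Hypothetical helper, not part of beadarrayMSV
matchParalogue <- function(paraAllele, mapAllele,
                           offspringLim = 7, ratioLim = 0.9) {
  # Only offspring with known alleles in both markers are informative
  informative <- !is.na(paraAllele) & !is.na(mapAllele)
  n <- sum(informative)
  if (n < offspringLim) {
    return(list(n = n, ratio = NA_real_, match = FALSE))
  }
  agree <- sum(paraAllele[informative] == mapAllele[informative])
  # Assumed statistic: phase-agnostic fraction of concordant offspring
  ratio <- max(agree, n - agree) / n
  list(n = n, ratio = ratio, match = ratio >= ratioLim)
}

# Toy inherited alleles for ten half-siblings (coded 1/2, NA = unknown)
para <- c(1, 1, 2, 2, 1, 2, 1, NA, 2, 1)
map1 <- c(1, 1, 2, 2, 1, 2, 1, 1, 2, NA)   # follows the paralogue closely
map2 <- c(1, 2, 2, 1, 1, 2, 2, 1, NA, NA)  # largely unrelated pattern

r1 <- matchParalogue(para, map1)
r2 <- matchParalogue(para, map2)
r3 <- matchParalogue(para[1:5], map1[1:5])  # too few informative offspring
str(list(linked = r1, unlinked = r2, tooFew = r3))
```

Under these assumptions only the first marker passes both limits; the second has enough informative offspring but too low a ratio, and the third comparison is rejected on offspring count alone.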
A list containing
x |
a matrix holding the calls for those paralogs that are successfully mapped to a chromosome. The rownames reflect the chromosome as well as the marker-name |
chromPairs |
a matrix with 0, 1, or 2 chromosomes to which the MSV-5s have been successfully mapped |
positionFemale |
a matrix holding the mapped paralogue positions as estimated by the female parent half-sib families |
positionMale |
a matrix holding the mapped paralogue positions as estimated by the male parent half-sib families |
This function may be time-consuming, and even more so if many of the input parameters need to be recalculated each time. If some of them are already available in the workspace, save time by including them in the function call.
Lars Gidskehaug
plotCountsChrom, setMergeOptions, unmixParalogues, resolveInheritanceSNP, MultiSet, AlleleSetIllumina, locateParalogues, getSingleCalls
## Not run:
#Read markers into an AlleleSetIllumina object
rPath <- system.file("extdata", package="beadarrayMSV")
normOpts <- setNormOptions()
dataFiles <- makeFilenames('testdata',normOpts,rPath)
beadFile <- paste(rPath,'beadData_testdata.txt',sep='/')
beadInfo <- read.table(beadFile,sep='\t',header=TRUE,as.is=TRUE)
BSRed <- createAlleleSetFromFiles(dataFiles[1:4],beadInfo=beadInfo)
#Genotype calling and splitting of MSV-5 paralogs
BSRed <- callGenotypes(BSRed)
BSRed <- validateCallsPedigree(BSRed)
iMSV5 <- fData(BSRed)$Classification %in% 'MSV-5' &
fData(BSRed)$Ped.Errors %in% 0
singleCalls <- getSingleCalls(BSRed[iMSV5,])
paraCalls <- unmixParalogues(BSRed[iMSV5,],singleCalls)
#Genetic map SNPs and inherited parental alleles
iSNP <- fData(BSRed)$Classification %in% 'SNP' &
    !is.na(fData(BSRed)$Chromosome)
inheritP <- resolveInheritanceSNP(BSRed[iSNP,])
#Match paralogs with map
mO <- setMergeOptions(minC=1)
chromHits <- locateParalogues(BSRed[iSNP,],paraCalls,
inheritP,mO$offspringLim,mO$ratioLim)
#The example data and map are too small to detect most homeologies
plotCountsChrom(chromHits$cPerMarker,1:sum(iMSV5),at=1:15,
labels=dimnames(chromHits$c)[[2]],las=2)
#Only a few, single paralogs are successfully assigned to chromosomes
mergedCalls <- assignParalogues(BSRed[iSNP,],BSRed[iMSV5,],paraCalls,
inheritP,singleCalls,cHits=chromHits$cPerMarker,mO=mO)
print(mergedCalls$chromPairs)
print(mergedCalls$x[,1:4])
## End(Not run)