callFromMap: Call markers based on an existing map

View source: R/callFromMap.R

Description

This function uses an existing genetic map to call genetic markers, including markers polymorphic on multiple chromosomes.

Usage

callFromMap(
  rawData,
  thresholdChromosomes = 100,
  thresholdAlleleClusters = c(1e-10, 1e-20, 1e-30, 1e-40),
  maxChromosomes = 2,
  existingImputations,
  tDistributionPValue = 0.6,
  useOnlyExtraImputationPoints = TRUE,
  ...
)

Arguments

rawData

Raw data for a genetic marker.

thresholdChromosomes

The test-statistic threshold for declaring a marker to be polymorphic on a chromosome.

thresholdAlleleClusters

The p-value threshold for declaring two underlying founder alleles to have different marker alleles. Multiple candidate values should be supplied.

maxChromosomes

The maximum number of chromosomes on which a marker can be polymorphic.

existingImputations

An object of class mpcrossMapped from the mpMap2 package, containing data about imputed underlying genotypes.

tDistributionPValue

Parameter controlling the size of each detected cluster, ranging from 0 to 1. Small values result in small clusters, and large values result in large clusters.

useOnlyExtraImputationPoints

Whether to use only the non-marker positions when identifying the correct locations.

...

Extra arguments. Only existingLocalisationStatistics is supported, mostly so the example can run quickly.

Details

This function uses an existing genetic map to call a genetic marker. This approach has a number of advantages:

1. It can correctly call markers which are polymorphic on multiple chromosomes, therefore converting one marker into two.

2. It avoids incorrectly calling markers polymorphic on multiple chromosomes. Incorrect calling can lead to spurious genetic interactions.

3. It can call markers that initially appear to be monomorphic in the population.

4. It can call additional marker alleles for markers that would otherwise be ignored.

Once a genetic map has been constructed, it should be used to impute underlying founder genotypes at an equally spaced grid of points using function imputeFounders. The steps in the algorithm are as follows:

1. Determine which chromosomes the marker is associated with, and where on those chromosomes. This is determined using function addExtraMarkerFromRawCall, which is itself based on a manova model. The marker is assumed to be associated with every chromosome for which the test statistic is greater than thresholdChromosomes. An appropriate value for thresholdChromosomes can be determined by looking at the results of addExtraMarkerFromRawCall for a number of different markers.

2. Determine the distribution of marker alleles at all the associated genetic locations. This is done by taking the founders to be the vertices of a graph, and connecting founders which seem to be part of the same marker allele. The resulting graph should be a union of disjoint complete graphs (cliques).

3. We now have a preliminary assignment of marker alleles to lines, where the assignment may be of 1, 2, 3 or more different marker alleles, depending on how many chromosomes the marker is associated with. For example, if the marker is associated with two chromosomes, then there will be two marker alleles for each line. For each unique combination of marker alleles, we take the lines which have that assignment of marker alleles, and fit a skew-t distribution.

4. For each fitted distribution, determine a confidence region using p-value tDistributionPValue.

5. Use these confidence regions to construct marker calls at each associated location.
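
The graph construction in step 2 can be illustrated with a small base R sketch. This is not the package's implementation: the founder names, the p-values and the single threshold below are made up for illustration, with the threshold standing in for one candidate value of thresholdAlleleClusters. Founders are connected unless their pairwise p-value falls below the threshold, and the graph is then checked for being a union of disjoint cliques.

```r
# Hypothetical pairwise p-values for four founders (A-D); values are invented.
# A small p-value suggests that two founders carry different marker alleles.
founders <- c("A", "B", "C", "D")
pValues <- matrix(1, 4, 4, dimnames = list(founders, founders))
pValues["A", "C"] <- pValues["C", "A"] <- 1e-25
pValues["A", "D"] <- pValues["D", "A"] <- 1e-32
pValues["B", "C"] <- pValues["C", "B"] <- 1e-28
pValues["B", "D"] <- pValues["D", "B"] <- 1e-41

# Assumed rule: connect two founders when the p-value is NOT below the threshold.
threshold <- 1e-20
adjacency <- pValues >= threshold
diag(adjacency) <- TRUE

# Compute reachability by repeatedly extending paths until nothing changes.
reach <- adjacency
repeat {
  expanded <- ((reach * 1) %*% (reach * 1)) > 0
  if (identical(expanded, reach)) break
  reach <- expanded
}

# The graph is a union of disjoint cliques exactly when every pair of
# founders that is reachable is also directly connected.
isCliqueUnion <- identical(reach, adjacency)

# Founders with the same reachability pattern share a marker allele.
patterns <- apply(reach, 1, paste, collapse = "")
alleles <- match(patterns, unique(patterns))
names(alleles) <- founders

print(isCliqueUnion)
print(alleles)
```

In this toy input, founders A and B end up in one clique and C and D in another, so the marker would have two alleles at this location. If isCliqueUnion were FALSE, the chosen threshold would not give a consistent grouping, which is one plausible reason for trying several threshold values.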

Value

At a minimum, a list containing an entry indicating whether the marker could be successfully called. If it could, the following entries are also returned:

overallAssignment

Defines clusters within the data.

classificationsPerPosition

Defines genotype calls per genetic location to which the marker was mapped.

clusterBoundaries

Contours giving the boundaries of each cluster in overallAssignment.

preliminaryGroups

The preliminary groups based on IBD imputations, which the final genotype calls are built from.

pValuesMatrices

The matrices of p-values used to form a graph, and therefore identify founder alleles.

Examples

data(eightParentSubsetMap)
data(wsnp_Ku_rep_c103074_89904851)
data(callFromMapExampleLocalisationStatistics)
library(ggplot2)
library(gridExtra)
# We use an existing set of localisation statistics to make the example faster
called <- callFromMap(
    rawData = as.matrix(wsnp_Ku_rep_c103074_89904851),
    existingImputations = eightParentSubsetMap,
    useOnlyExtraImputationPoints = TRUE,
    tDistributionPValue = 0.8,
    thresholdChromosomes = 80,
    existingLocalisationStatistics = existingLocalisationStatistics
)
plotData <- wsnp_Ku_rep_c103074_89904851
plotData$genotype1B <- factor(called$classificationsPerPosition$Chr1BLoc31$finals)
plotData$imputed1B <- factor(imputationData(eightParentSubsetMap)[, "Chr1BLoc31"])
plotData$genotype1D <- factor(called$classificationsPerPosition$Chr1DLoc16$finals)
plotData$imputed1D <- factor(imputationData(eightParentSubsetMap)[, "Chr1DLoc16"])

plotImputations1B <- ggplot(plotData, mapping = aes(x = theta, y = r, color = imputed1B)) + 
    geom_point() + theme_bw() + ggtitle("Imputed genotype, 1B") + 
    guides(color=guide_legend(title="IBD genotype"))

called1B <- ggplot(plotData, mapping = aes(x = theta, y = r, color = genotype1B)) + 
    geom_point() + theme_bw() + ggtitle("Called genotype, 1B") + 
    guides(color=guide_legend(title="Called cluster")) + scale_color_manual(values = 
    c("black", RColorBrewer::brewer.pal(n = 4, name = "Set1")))

plotImputations1D <- ggplot(plotData, mapping = aes(x = theta, y = r, color = imputed1D)) + 
    geom_point() + theme_bw() + ggtitle("Imputed genotype, 1D") + 
    guides(color=guide_legend(title="IBD genotype"))

called1D <- ggplot(plotData, mapping = aes(x = theta, y = r, color = genotype1D)) + 
    geom_point() + theme_bw() + ggtitle("Called genotype, 1D") + 
    guides(color=guide_legend(title="Called cluster")) + 
    scale_color_manual(values = c("black",RColorBrewer::brewer.pal(n=3,name = "Set1")[1:2]))

grid.arrange(plotImputations1B, plotImputations1D, called1B, called1D)

rohan-shah/mpMap2 documentation built on July 21, 2020, 8:58 p.m.