translateTheta: Convert genotype calls to allele information

Description

Genotype calls represented as numeric values (allele ratios) within [0, 1] are converted to character strings containing the allele information (“A”, “T”, “C”, and “G”).

Usage

translateTheta(calls, resInfo, type = "regular")

translateThetaCombined(BSRed, mergedCalls = NULL)

translateThetaFromFiles(dataFiles, mergedCalls = NULL,
    markerStep = 1000, sep = "\t", quote = "")

Arguments

calls

Numeric matrix with calls {0, 1/2, 1} representing allele ratios for each sample. Each row is a unique marker or paralogue (specified with type)

resInfo

Data table containing featureData, including the columns “Classification”, “SNP”, and “ILMN.Strand”. These hold the genotype categories from callGenotypes and the SNP and TOP/BOT-category of the BeadArray markers (see createAlleleSet)

type

One of “regular”, “single”, or “merged” (see details below)

BSRed

"AlleleSetIllumina" object containing an assayData entry “call” and a featureData column “Classification” (see callGenotypes)

mergedCalls

Matrix with calls from resolved MSV-5 paralogs (see assignParalogues)

dataFiles

Character vector containing file names (see makeFilenames)

markerStep

The maximum number of markers loaded into the workspace at a time

sep

Field delimiter in text-files (see read.table)

quote

Quote-marks used for character strings (see read.table)

Details

The main difference between translateTheta and translateThetaCombined is that the former can only handle call-values {0, 1/2, 1} whereas the latter handles values {0, 1/4, 1/2, 3/4, 1}. In effect this means that markers from duplicated genome regions have to be handled in a special way if analysed with translateTheta.

If type == "regular", the markers are treated as if they were all from a diploid region. This implies that all non-segregating paralogs of “MSV-a” and “MSV-b” markers are ignored, effectively turning these markers into SNPs. Markers classified as “MSV-5” or “PSV” are set to missing (see makeDiploidCalls). If type == "single", calls is expected to contain resolved “MSV-5” paralogs named with “-Para1” or “-Para2” (see unmixParalogues). If type == "merged", resolved “MSV-5” paralogs named according to their respective chromosomes, “-ChromX”, are expected (see assignParalogues).

The main use of translateTheta is to prepare genotype calls for mapping software which requires diploid markers.

With translateThetaCombined, there is always one element per marker, as required by the "AlleleSetIllumina". If mergedCalls is given, the “MSV-5” paralogs will be resolved, otherwise only the ratio of the alleles across paralogs will be returned. The function translateThetaFromFiles performs the same operations on data sequentially loaded into the workspace, and the genotypes are written to file dataFiles$genoFile as they are found.
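The sequential loading controlled by markerStep can be pictured with the following sketch. It is an illustration only and not the package's implementation: the file layout, the temporary files, and the placeholder translation step are all assumptions made for the example. It reads a tab-separated file of calls in blocks of markerStep rows and appends each processed block to an output file.

```r
# Hypothetical illustration, not the beadarrayMSV implementation.
# Assumed input: a tab-separated text file with one marker per row.
tmpIn <- tempfile(fileext = ".txt")
tmpOut <- tempfile(fileext = ".txt")
calls <- data.frame(marker = paste0("m", 1:5),
                    s1 = c(0, 0.5, 1, 0.5, 0),
                    s2 = c(1, 1, 0, 0.5, 0.5))
write.table(calls, tmpIn, sep = "\t", quote = FALSE, row.names = FALSE)

markerStep <- 2   # at most two markers are held in memory at a time
nBlocks <- 0
con <- file(tmpIn, open = "r")
header <- readLines(con, n = 1)
writeLines(header, tmpOut)
repeat {
  lines <- readLines(con, n = markerStep)
  if (length(lines) == 0) break
  block <- read.table(text = lines, sep = "\t", quote = "", as.is = TRUE)
  # Placeholder: a real run would translate the calls in `block` here
  write.table(block, tmpOut, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE, append = TRUE)
  nBlocks <- nBlocks + 1
}
close(con)
```

With five markers and markerStep set to 2, the loop runs over three blocks, and the output file ends up with one row per marker.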

Value

Output from translateTheta is a matrix whose dimensions depend on the input data. If calls has one row per marker (i.e. type == "regular"), the number of rows in the output matrix also equals the number of markers. If calls has one row per paralogue (i.e. type != "regular"), the number of rows in the output matrix also equals the number of paralogs. Each element is a character string “x y” denoting the two alleles (“A”, “T”, “C”, “G”, or “-” for missing).
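As a rough illustration of the “x y” output format, the sketch below maps diploid calls to allele strings. The helper sketchTranslate is hypothetical and not part of beadarrayMSV. It assumes that a call is the fraction of the second allele listed in a SNP string such as “[A/G]”, and it ignores the TOP/BOT strand information that the real function reads from “ILMN.Strand”.

```r
# Hypothetical helper, not part of beadarrayMSV.
# Assumption: a call is the fraction of the second allele in the SNP string,
# so for "[A/G]": 0 -> "A A", 1/2 -> "A G", 1 -> "G G". Strand is ignored.
sketchTranslate <- function(calls, snp) {
  # Extract the two alleles from strings such as "[A/G]"
  alleles <- strsplit(gsub("\\[|\\]", "", snp), "/", fixed = TRUE)
  out <- matrix("- -", nrow = nrow(calls), ncol = ncol(calls),
                dimnames = dimnames(calls))
  for (i in seq_len(nrow(calls))) {
    a <- alleles[[i]][1]
    b <- alleles[[i]][2]
    # Number of copies of the second allele: 0, 1, or 2 (NA stays missing)
    nB <- round(2 * calls[i, ])
    geno <- c(paste(a, a), paste(a, b), paste(b, b))[nB + 1]
    ok <- !is.na(geno)
    out[i, ok] <- geno[ok]
  }
  out
}

calls <- matrix(c(0, 0.5, 1,
                  1, NA, 0.5),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("m1", "m2"), c("s1", "s2", "s3")))
genotypes <- sketchTranslate(calls, snp = c("[A/G]", "[T/C]"))
print(genotypes)
```

The output matrix has the same dimensions as calls, with missing calls shown as “- -”.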

In contrast, the output from translateThetaCombined is an AlleleSetIllumina object with an added assayData entry “genotype”. The elements of this matrix representing diploid markers are given as “xy”, un-resolved tetraploid markers are given as “xyzw”, and resolved tetraploid markers are given as “xy,zw” (paralogs separated by comma). The letters correspond to any of the 4 bases or “-” for missing.
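The three string formats can be sketched as follows. The helper sketchGenotype is hypothetical and not part of beadarrayMSV. It assumes that the ratio is the fraction of the second allele among all allele copies, and the order of letters within each string is likewise an assumption.

```r
# Hypothetical helper, not part of beadarrayMSV.
# Assumption: `ratio` is the fraction of allele `b` among `ploidy` copies.
sketchGenotype <- function(ratio, a, b, ploidy) {
  if (is.na(ratio)) return(strrep("-", ploidy))
  nB <- round(ratio * ploidy)
  paste0(strrep(a, ploidy - nB), strrep(b, nB))
}

# Diploid marker, format "xy"
diploid <- sketchGenotype(0.5, "A", "G", ploidy = 2)
# Unresolved tetraploid marker, format "xyzw"
unresolved <- sketchGenotype(0.25, "A", "G", ploidy = 4)
# Resolved tetraploid marker, format "xy,zw": one diploid genotype per paralogue
resolved <- paste(sketchGenotype(0, "A", "G", ploidy = 2),
                  sketchGenotype(0.5, "A", "G", ploidy = 2),
                  sep = ",")
print(c(diploid, unresolved, resolved))
```

The resolved form carries more information than the unresolved one: “AA,AG” states which paralogue holds the “G” allele, whereas “AAAG” only gives the allele ratio across both paralogues.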

The function translateThetaFromFiles is used for its side effects.

Author(s)

Lars Gidskehaug

See Also

makeDiploidCalls, unmixParalogues, assignParalogues, makeFilenames, callGenotypes

Examples

## Not run: 
#Read 25 markers into an AlleleSetIllumina object
rPath <- system.file("extdata", package="beadarrayMSV")
normOpts <- setNormOptions()
dataFiles <- makeFilenames('testdata',normOpts,rPath)
beadFile <- paste(rPath,'beadData_testdata.txt',sep='/')
beadInfo <- read.table(beadFile,sep='\t',header=TRUE,as.is=TRUE)
BSRed <- createAlleleSetFromFiles(dataFiles[1:4],markers=1:25,beadInfo=beadInfo)

#Genotype calling
BSRed <- callGenotypes(BSRed)
genotypes <- translateTheta(assayData(BSRed)$call,fData(BSRed),type='regular')
print(cbind(fData(BSRed)$Classification,genotypes[,1:3])[1:10,])

#Alternative output
BSRed <- translateThetaCombined(BSRed)
print(cbind(fData(BSRed)$Classification,assayData(BSRed)$genotype[,1:3])[1:10,])

## End(Not run)

beadarrayMSV documentation built on May 29, 2017, 9:07 a.m.