genotypeInf: Genotyping of Illumina Infinium II arrays.


View source: R/crlmm-illumina.R

Description

Genotyping of Illumina Infinium II arrays. This function provides CRLMM/KRLMM genotypes and confidence scores for the polymorphic markers and is a required step prior to copy number estimation.

Usage

genotypeInf(cnSet, mixtureParams, probs = rep(1/3, 3), SNRMin = 5,
             recallMin = 10, recallRegMin = 1000, verbose = TRUE, returnParams = TRUE, 
             badSNP = 0.7, gender = NULL, DF = 6, cdfName, nopackage.norm="quantile",
             call.method="crlmm", trueCalls = NULL)

Arguments

cnSet

An object of class CNSet

mixtureParams

data.frame containing mixture model parameters needed for genotyping. The mixture model parameters are estimated from the preprocessInf function.

probs

'numeric' vector with priors for AA, AB and BB.

SNRMin

'numeric' scalar defining the minimum SNR used to filter out samples.

recallMin

Minimum number of samples for recalibration.

recallRegMin

Minimum number of SNPs for regression.

verbose

'logical'. Whether to print descriptive messages during processing.

returnParams

'logical'. Return recalibrated parameters from crlmm.

badSNP

'numeric'. Threshold used to flag a SNP as bad (affects batchQC).

gender

integer vector (male = 1, female = 2) or missing, with the same length as filenames. If missing, the gender is predicted.

DF

'integer' with the number of degrees of freedom to use with the t-distribution.

cdfName

character string indicating which annotation package to load.

nopackage.norm

character string specifying normalization to be used when cdfName='nopackage'. Options are 'none', 'quantile' (within channel, between array) and 'quantileloess'.

call.method

character string specifying the genotype calling algorithm to use ('crlmm' or 'krlmm').

trueCalls

matrix specifying known genotype calls for a subset of samples and features (1 - AA, 2 - AB, 3 - BB).

Details

The genotype calls and confidence scores are written to file using ff protocols for I/O. For the most part, the calls and confidence scores can be accessed as though the data is in memory through the methods snpCall and snpCallProbability, respectively.

The genotype calls are stored using an integer representation: 1 - AA, 2 - AB, 3 - BB. Similarly, the call probabilities are stored using an integer representation to reduce file size using the transformation 'round(-1000*log2(1-p))', where p is the probability. The function i2P can be used to convert the integers back to the scale of probabilities.
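The integer encodings described above can be illustrated in base R. This is a minimal sketch: prob_to_int and int_to_prob are hypothetical helper names written for this illustration, not functions from the package, and the inverse is simply derived from the stated transformation.

```r
## Hypothetical helpers (not the package's own functions) that mirror the
## transformation 'round(-1000*log2(1-p))' and its algebraic inverse.
prob_to_int <- function(p) as.integer(round(-1000 * log2(1 - p)))
int_to_prob <- function(i) 1 - 2^(-i / 1000)

## Encode a few confidence scores and convert them back.
p <- c(0.5, 0.75, 0.99, 0.999)
i <- prob_to_int(p)
p_back <- int_to_prob(i)

## Rounding to an integer loses a small amount of precision.
max_err <- max(abs(p - p_back))

## Decode integer genotype calls (1 - AA, 2 - AB, 3 - BB) into labels.
calls <- c(1L, 3L, 2L, NA, 1L)
labels <- c("AA", "AB", "BB")[calls]
```

Note that p = 1 is not representable under this transformation, since log2(0) is -Inf; the sketch does not handle that case.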

An optional trueCalls argument can be provided to the KRLMM method. It contains known genotype calls (possibly with some NAs) for some samples and SNPs, and is used to compute KRLMM parameters by calling the vglm function from the VGAM package.
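A small matrix of known calls might be built as follows. This is only a sketch: the SNP and sample identifiers are hypothetical, and the layout (SNPs in rows, samples in columns) is an assumption that should be checked against the CNSet being genotyped.

```r
## Hypothetical SNP and sample identifiers; real ones would need to match
## the feature and sample names in the CNSet.
snp_ids <- c("rs0001", "rs0002", "rs0003")
sample_ids <- c("sampleA", "sampleB")

## Assumed layout: SNPs in rows, samples in columns.
## Codes: 1 - AA, 2 - AB, 3 - BB; NA marks an unknown call.
trueCalls <- matrix(c(1L, 2L, NA,
                      3L, 3L, 1L),
                    nrow = 3, ncol = 2,
                    dimnames = list(snp_ids, sample_ids))

## Sanity check: only the three genotype codes or NA are allowed.
stopifnot(all(trueCalls %in% c(1L, 2L, 3L, NA)))
```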

The KRLMM method uses functions from the parallel package to speed up processing. By default it initialises up to 8 clusters. This can be changed by setting the option "krlmm.cores", e.g. options("krlmm.cores" = 16).
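One way to choose a value for this option is to cap it at the number of cores the machine reports. The sketch below uses only base R and the parallel package; the cap of 4 is an arbitrary illustrative choice, and how the package reads the option internally is not shown here.

```r
library(parallel)

## Number of cores reported by the machine; detectCores() can return NA.
available <- detectCores()
if (is.na(available)) available <- 1L

## Request at most 4 workers (arbitrary cap for illustration).
n_cores <- min(4L, available)
options(krlmm.cores = n_cores)

## Read the setting back, falling back to the documented default of 8.
getOption("krlmm.cores", default = 8)
```

The option should be set before calling genotypeInf with call.method = "krlmm".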

Value

Logical. If the genotyping is completed, the value 'TRUE' is returned. Note that assayData elements 'call' and 'callProbability' are updated on disk. Therefore, the genotypes and confidence scores can be retrieved using accessors for the CNSet class.

Author(s)

R. Scharpf

See Also

crlmm, snpCall, snpCallProbability, annotationPackages

Examples

	## See the 'illumina_copynumber' vignette in inst/scripts of
	## the source package

crlmm documentation built on Nov. 8, 2020, 4:55 p.m.