preprocessInf: Preprocessing of Illumina Infinium II arrays.


View source: R/crlmm-illumina.R

Description

This function normalizes the 'A' and 'B' allele intensities of a CNSet object and estimates the mixture parameters used for genotyping. See Details for how the normalized intensities are written to file. This step must be run before genotyping and copy number estimation.

Usage

preprocessInf(cnSet, sampleSheet = NULL, arrayNames = NULL, ids = NULL,
              path = ".",
              arrayInfoColNames = list(barcode = "SentrixBarcode_A",
                                       position = "SentrixPosition_A"),
              highDensity = TRUE, sep = "_",
              fileExt = list(green = "Grn.idat", red = "Red.idat"),
              XY, anno, saveDate = TRUE, stripNorm = TRUE, useTarget = TRUE,
              mixtureSampleSize = 10^5, fitMixture = TRUE,
              quantile.method = "between", eps = 0.1, verbose = TRUE,
              seed = 1, cdfName)

Arguments

cnSet

object of class CNSet

sampleSheet

data.frame containing Illumina sample sheet information (for required columns, refer to BeadStudio Genotyping guide - Appendix A).

arrayNames

character vector containing names of arrays to be read in. If NULL, all arrays that can be found in the specified working directory will be read in.

ids

vector containing ids of probes to be read in. If NULL, all probes found on the first array are read in.

path

character string specifying the location of the files to be read by the function.

arrayInfoColNames

(used when sampleSheet is specified) list containing the elements 'barcode', which names the sampleSheet column holding the array number/barcode, and 'position', which names the column holding the strip number. In older style sample sheets, this information is combined (usually in a column named 'SentrixPosition'), and this should be specified as list(barcode=NULL, position="SentrixPosition").

highDensity

logical (used when sampleSheet is specified). If TRUE, array extensions '_A', '_B' in sampleSheet are replaced with 'R01C01', 'R01C02' etc.

sep

character string specifying separator used in .idat file names.

fileExt

list containing elements 'green' and 'red' which specify the .idat file extension for the Cy3 and Cy5 channels.

XY

an NChannelSet object containing X and Y intensities.

anno

data.frame containing SNP annotation information from manifest and additional columns 'isSnp', 'position', 'chromosome' and 'featureNames'. For use when cdfName='nopackage'

saveDate

'logical'. Should the dates from each .idat be saved with sample information?

stripNorm

'logical'. Should the data be strip-level normalized?

useTarget

'logical' (only used when stripNorm=TRUE). Should the reference HapMap intensities be used in strip-level normalization?

mixtureSampleSize

Sample size to be used when fitting the mixture model.

fitMixture

'logical'. Whether to fit a per-array mixture model.

quantile.method

character string specifying the quantile normalization method to use ('within' or 'between' channels).

eps

Stopping criterion for the iterative mixture-model fit.

verbose

'logical'. Whether to print descriptive messages during processing.

seed

Seed to be used when sampling. Useful for reproducibility.

cdfName

character string indicating which annotation package to load.
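As an illustration of how the sampleSheet, arrayInfoColNames, highDensity, sep and fileExt arguments relate to the .idat files on disk, the following base R sketch builds the file names one would expect for two arrays. The barcode value is made up, and the rule of joining barcode, position and extension with sep is an assumption inferred from the argument descriptions above rather than a statement of the package internals.

```r
# Hypothetical two-row sample sheet; only the two default columns are shown.
sample_sheet <- data.frame(
  SentrixBarcode_A  = c("4019585367", "4019585367"),  # made-up barcode
  SentrixPosition_A = c("A", "B"),
  stringsAsFactors  = FALSE
)

array_info <- list(barcode = "SentrixBarcode_A", position = "SentrixPosition_A")
file_ext   <- list(green = "Grn.idat", red = "Red.idat")
sep        <- "_"

# Assumed rule: join barcode and position into one array name per sample.
array_names <- paste(sample_sheet[[array_info$barcode]],
                     sample_sheet[[array_info$position]], sep = sep)

# highDensity = TRUE: strip suffixes '_A', '_B' become 'R01C01', 'R01C02'.
hd_ext <- c(A = "R01C01", B = "R01C02")
for (i in names(hd_ext)) {
  array_names <- sub(paste0(sep, i, "$"), paste0(sep, hd_ext[[i]]), array_names)
}

# Assumed rule: append the per-channel extension with the same separator.
green_files <- paste(array_names, file_ext$green, sep = sep)
red_files   <- paste(array_names, file_ext$red, sep = sep)
print(green_files)
print(red_files)
```

Comparing such names against list.files(path) is a quick way to check that a sample sheet and a directory of .idat files agree before any arrays are read.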

Details

The normalized intensities are written to disk using the ff package's protocols for reading from and writing to disk. Note that the CNSet object containing the ff objects in the assayData slot is updated by this function.

Value

An ff_matrix object containing parameters for fitting the mixture model. Note that while the CNSet object is not returned by this function, it is updated as the normalized intensities are written to disk. In particular, after applying this function, the normalized intensities are available in the alleleA and alleleB elements of assayData.
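A small helper along the following lines could be used to look at the normalized intensities once they are available. It is a sketch only: peek_normalized is a hypothetical name, and it assumes that the A and B accessors listed under See Also are the ones exported by the oligoClasses package.

```r
# Hypothetical helper: return the first n rows of the normalized A and B
# allele intensities from a CNSet that has been through preprocessInf().
peek_normalized <- function(cnSet, n = 5L) {
  stopifnot(requireNamespace("oligoClasses", quietly = TRUE))
  rows <- seq_len(min(n, nrow(cnSet)))
  list(
    A = oligoClasses::A(cnSet)[rows, , drop = FALSE],  # alleleA element
    B = oligoClasses::B(cnSet)[rows, , drop = FALSE]   # alleleB element
  )
}
```

Subsetting a few rows at a time keeps the amount of data pulled from disk into memory small.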

Author(s)

R. Scharpf

See Also

CNSet-class, A, B, constructInf, genotypeInf, annotationPackages

Examples

	## See the 'illumina_copynumber' vignette in inst/scripts of
	## the source package
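For orientation, a minimal sketch of where preprocessInf sits between constructInf and genotypeInf is given below. It is not taken from the vignette. The function name preprocess_sketch and its arguments are hypothetical, the calls are wrapped in a function so that nothing runs until real .idat files and a sample sheet are supplied, and attaching ff and setting the output directory with ldPath beforehand is an assumption about how on-disk storage is enabled.

```r
# Sketch only: defining this function runs nothing; call it with real data.
preprocess_sketch <- function(sample_sheet, idat_dir, out_dir, cdf_name, batch) {
  stopifnot(requireNamespace("crlmm", quietly = TRUE),
            requireNamespace("ff", quietly = TRUE))
  library(ff)                    # assumed: ff must be attached for on-disk storage
  oligoClasses::ldPath(out_dir)  # assumed: directory that receives the ff files

  # Build the CNSet container for the arrays listed in the sample sheet.
  cn_set <- crlmm::constructInf(sampleSheet = sample_sheet, path = idat_dir,
                                cdfName = cdf_name, batch = batch)

  # Normalize intensities (written to disk) and estimate mixture parameters.
  mixture_params <- crlmm::preprocessInf(cnSet = cn_set,
                                         sampleSheet = sample_sheet,
                                         path = idat_dir, cdfName = cdf_name)

  # Genotype using the mixture parameters returned by preprocessInf().
  crlmm::genotypeInf(cnSet = cn_set, mixtureParams = mixture_params,
                     cdfName = cdf_name)
  cn_set
}
```

The value returned by preprocessInf is passed straight on to genotypeInf, while the CNSet itself is reused because its assay data are updated on disk rather than returned.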

crlmm documentation built on Nov. 8, 2020, 4:55 p.m.