Functions to correct for global correlations between color channels or between successive sequencing cycles

```
## S4 method for signature 'SolexaIntensity'
DeCorrelateChannels(int,cycles=seq(1,dim(int)[3],by=1),theta=matrix(rep(c(0.8806742,1.3727418,0.883865,1.545728),length(cycles)),ncol=4,byrow=TRUE))
## S4 method for signature 'array'
DeCorrelateChannels(int,cycles=seq(1,dim(int)[3],by=1),theta=matrix(rep(c(0.8806742,1.3727418,0.883865,1.545728),length(cycles)),ncol=4,byrow=TRUE))
DeCorrelateChannels(int,...)
## S4 method for signature 'SolexaIntensity'
OptimizeAngle(int,cycles=seq(1,dim(int)[3],by=1),...)
OptimizeAngle(int,...)
## S4 method for signature 'SolexaIntensity'
DeCorrelateCycles(int,ncycles=dim(int)[3],rate=1.8e-2)
## S4 method for signature 'array'
DeCorrelateCycles(int,ncycles=dim(int)[3],rate=1.8e-2)
DeCorrelateCycles(int,...)
## S4 method for signature 'SolexaIntensity'
OptimizeRate(int,ncycles=dim(int)[3],...)
OptimizeRate(int,...)
## S4 method for signature 'RolexaRun'
TileNormalize(run=Rolexa.env,int,cycles=seq(1,dim(int)[3],by=1))
TileNormalize(run,...)
```

`run` |
a `RolexaRun` object |

`int` |
a `SolexaIntensity` object or an array of intensities |

`cycles, ncycles` |
the cycles (`cycles`), or the number of cycles counted from cycle 1 (`ncycles`), to which the correction is applied |

`theta` |
a matrix of angles with one row per cycle and 4 columns |

`rate` |
the rate of nucleotide mis-incorporation at each cycle |

`...` |
additional arguments passed to |

`DeCorrelateChannels` applies two coordinate transforms: one maps axes 1 and 2 to the axes at angles `theta[,1:2]` relative to axis 1, and the other does the same for axes 3 and 4 with angles `theta[,3:4]`. These angles can be calculated with `OptimizeAngle`, which minimizes the correlations between channels 1 and 2, and between channels 3 and 4, for each cycle.

`DeCorrelateCycles` assumes that at each cycle a fraction `rate` of sequences fails to incorporate a nucleotide, so that the sequence lengths at each colony follow a binomial distribution. It corrects for this by taking into account the intensities measured at previous cycles. `OptimizeRate` calculates a rate that minimizes correlations between consecutive cycles.
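The two corrections described above can be pictured with a small base-R sketch. It is not the Rolexa implementation: the helper names `rotate_pair` and `unmix_cycles` are hypothetical, the channel transform is assumed to be a change to an oblique basis whose axes lie at the given angles (in radians), and the cycle correction is assumed to invert a binomial lag matrix in which strands with no incorporated base contribute no signal.

```r
# Hypothetical helper, not part of Rolexa: express a pair of channels in an
# oblique basis whose axes lie at angles t1 and t2 (radians) from channel 1.
rotate_pair <- function(xy, t1, t2) {
  # Columns of B are the unit vectors of the new axes in the old coordinates.
  B <- cbind(c(cos(t1), sin(t1)), c(cos(t2), sin(t2)))
  # Solving B %*% new = old for every row of xy gives the new coordinates.
  t(solve(B, t(xy)))
}

# Hypothetical helper: undo binomial lag across cycles for one colony
# and one channel, given the per-cycle failure rate.
unmix_cycles <- function(obs, rate) {
  n <- length(obs)
  # M[i, k]: probability that a strand has incorporated k bases after i
  # cycles when each cycle succeeds with probability 1 - rate.
  M <- outer(seq_len(n), seq_len(n), function(i, k) dbinom(k, i, 1 - rate))
  # M is lower triangular, so forward substitution recovers the signal.
  forwardsolve(M, obs)
}
```

Under these assumptions, a point lying along one of the new axes maps onto a single coordinate, and with `rate = 0` the lag matrix is the identity, so `unmix_cycles` returns its input unchanged.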

`TileNormalize` estimates the local trend by `loess` fitting of the model `int ~ x+y` and subtracts it from the intensity matrix.
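As an illustration of the trend removal described above, the following sketch fits `int ~ x + y` with `stats::loess` on simulated data and subtracts the fitted surface. The data frame `tile`, its simulated trend and the column name `corrected` are assumptions made for the example; how `TileNormalize` handles channels, cycles and `loess` settings internally is not shown here.

```r
set.seed(1)

# Simulated colony positions on a tile (assumed data, for illustration only).
tile <- data.frame(x = runif(300), y = runif(300))

# One channel's intensity with a smooth spatial trend plus noise.
tile$int <- 100 + 40 * tile$x - 25 * tile$y + rnorm(300, sd = 5)

# Estimate the local trend as a smooth surface over the tile coordinates.
fit <- loess(int ~ x + y, data = tile)

# Subtract the fitted trend from the measured intensities.
tile$corrected <- tile$int - fitted(fit)
```

After the subtraction, the corrected intensities should no longer vary systematically with position on the tile.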

`TileNormalize`, `DeCorrelateChannels` and `DeCorrelateCycles` return an object of the same type as `int`, corrected for spurious correlations. `OptimizeAngle` returns a `length(cycles)` by 4 matrix and `OptimizeRate` returns a single positive real number.
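To make the single-number result of a rate optimization concrete, here is a hedged sketch that searches for the rate minimizing correlations between consecutive cycles on simulated data. It is not the `OptimizeRate` code: the binomial lag model, the objective (sum of squared correlations), the search interval and the names `lag_matrix`, `objective`, `true_rate` and `estimate` are all assumptions chosen for the example.

```r
set.seed(42)
n_cycles <- 6
true_rate <- 0.05  # assumed value, used only to simulate the data

# Lag matrix: entry [i, k] is the probability of k incorporations after
# i cycles when each cycle succeeds with probability 1 - rate.
lag_matrix <- function(rate, n) {
  outer(seq_len(n), seq_len(n), function(i, k) dbinom(k, i, 1 - rate))
}

# Simulated colonies (rows) with independent signal at each cycle (columns).
signal <- matrix(rexp(2000 * n_cycles), ncol = n_cycles)
observed <- signal %*% t(lag_matrix(true_rate, n_cycles))

# Sum of squared correlations between consecutive cycles after correction.
objective <- function(rate) {
  corrected <- t(forwardsolve(lag_matrix(rate, n_cycles), t(observed)))
  r <- sapply(seq_len(n_cycles - 1),
              function(j) cor(corrected[, j], corrected[, j + 1]))
  sum(r^2)
}

# One-dimensional search for the rate that best decorrelates the cycles.
estimate <- optimize(objective, interval = c(0, 0.3))$minimum
```

Because the simulated signal is independent across cycles, the estimate should land close to the rate used to generate the data.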

Jacques Rougemont, Arnaud Amzallag, Christian Iseli, Laurent Farinelli, Ioannis Xenarios, Felix Naef

Probabilistic base calling of Solexa sequencing data, BMC Bioinformatics 2008, 9:431

```
path = SolexaPath(system.file("extdata", package="ShortRead"))
rolenv = SetModel(idsep="_")
int = readIntensities(path,pattern="s_1_0001",withVariability=FALSE)
int1 = DeCorrelateChannels(int=int,cycles=1:5,theta=OptimizeAngle(int=int,cycles=1:5))
int2 = DeCorrelateCycles(int=int1,ncycles=5,rate=OptimizeRate(int=int1))
int3 = TileNormalize(run=rolenv,int=int,cycles=1)
seq = CombineReads(run=rolenv,path=path,pattern="s_1_0001_seq*")
PlotCycles(run=rolenv,int=int3,seq=seq,cycles=1:4)
```

