| anomDetectLOH | R Documentation | 
anomDetectLOH breaks a chromosome up into segments of homozygous runs
of SNP markers determined by change points in Log R Ratio and 
selects segments which are likely to be anomalous.
anomDetectLOH(intenData, genoData, scan.ids, chrom.ids, snp.ids,
  known.anoms, smooth = 50, min.width = 5, nperm = 10000, alpha = 0.001,
  run.size = 50, inter.size = 4, homodel.min.num = 10, homodel.thresh = 10,
  small.num = 20, small.thresh = 2.25, medium.num = 50, medium.thresh = 2,
  long.num = 100, long.thresh = 1.5, small.na.thresh = 2.5,
  length.factor = 5, merge.fac = 0.85, min.lrr.num = 20, verbose = TRUE)
| intenData | An  | 
| genoData | A  | 
| scan.ids | vector of scan ids (sample numbers) to process | 
| chrom.ids | vector of (unique) chromosomes to process.  Should correspond to
integer chromosome codes in  | 
| snp.ids | vector of eligible SNP ids.  Usually excludes failed and intensity-only SNPs.
It is also recommended to exclude an HLA region on chromosome 6 and
XTR region on X chromosome.  See  | 
| known.anoms | data.frame of known anomalies (usually from  | 
| smooth | number of markers for smoothing region.  See  | 
| min.width | minimum number of markers for segmenting.  See  | 
| nperm | number of permutations.
See  | 
| alpha | significance level. See  | 
| run.size | number of markers to declare a 'homozygous' run (here 'homozygous' includes homozygous and missing) | 
| inter.size | number of consecutive heterozygous markers allowed to interrupt a 'homozygous' run | 
| homodel.min.num | minimum number of markers to detect extreme difference in LRR (for homozygous deletion) | 
| homodel.thresh | threshold for measure of deviation from non-anomalous needed to declare segment a homozygous deletion. | 
| small.num | minimum number of SNP markers to declare segment as an anomaly (other than homozygous deletion) | 
| small.thresh | threshold for measure of deviation from non-anomalous to declare segment anomalous if
number of SNP markers is between  | 
| medium.num | threshold for number of SNP markers to identify 'medium' size segment | 
| medium.thresh | threshold for measure of deviation from non-anomalous needed to declare segment anomalous if
number of SNP markers is between  | 
| long.num | threshold for number of SNP markers to identify 'long' size segment | 
| long.thresh | threshold for measure of deviation from non-anomalous when number of markers is bigger than  | 
| small.na.thresh | threshold for measure of deviation from non-anomalous when number of markers is between  | 
| length.factor | window around anomaly defined as  | 
| merge.fac | threshold for 'sd.fac'= number of baseline standard deviations of segment mean from baseline mean; consecutive segments with 'sd.fac' above threshold are merged | 
| min.lrr.num | if any 'non-anomalous' interval has fewer markers than  | 
| verbose | logical indicator whether to print the scan id currently being processed | 
Detects anomalies with loss of heterozygosity accompanied by a change in Log R Ratio. The X chromosome is not processed for male samples.
Circular binary segmentation (CBS) (using the R-package DNAcopy)
is applied to LRR values and, in parallel, runs of homozygous or missing genotypes 
of a certain minimal size (run.size), allowing interruptions 
by no more than inter.size heterozygous SNPs, are identified.  Intervals from
known.anoms are excluded from the identification of runs.
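The run identification described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name find_homo_runs is hypothetical, genotypes are assumed to be coded so that 1 means heterozygous and 0, 2 or NA count as 'homozygous', and interior heterozygous stretches of at most inter.size markers are simply absorbed into the neighbouring runs. Exclusion of known.anoms intervals is not shown.

```r
# Hypothetical helper, not part of GWASTools: find runs of 'homozygous'
# (homozygous or missing) markers, tolerating short heterozygous interruptions.
find_homo_runs <- function(geno, run.size = 50, inter.size = 4) {
  # Assumed coding: 1 = heterozygous; 0, 2 or NA = 'homozygous'
  het <- !is.na(geno) & geno == 1
  r <- rle(het)
  n <- length(r$values)
  # Interior heterozygous stretches of at most inter.size markers are absorbed
  short.het <- r$values & r$lengths <= inter.size
  interior <- seq_len(n) > 1 & seq_len(n) < n
  r$values[short.het & interior] <- FALSE
  # Re-encode so absorbed stretches fuse with the runs on either side
  r <- rle(inverse.rle(r))
  end <- cumsum(r$lengths)
  start <- end - r$lengths + 1
  keep <- !r$values & r$lengths >= run.size
  data.frame(left.index = start[keep], right.index = end[keep],
             num.mark = r$lengths[keep])
}

# Toy genotype vector: two 30-marker homozygous stretches split by 2 hets
geno <- c(rep(0, 30), 1, 1, rep(2, 30), rep(1, 10), rep(0, 5))
find_homo_runs(geno, run.size = 50, inter.size = 4)
```

With inter.size = 4 the two heterozygous markers are tolerated and the first 62 markers form a single run; with inter.size = 1 neither 30-marker stretch reaches run.size, so no run is reported.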
After some possible merging of consecutive CBS segments 
(based on satisfying a threshold  merge.fac for deviation 
from non-anomalous baseline), the homozygous runs are intersected 
with the segments from CBS. 
Determination of anomalous segments is based on 
a combination of number-of-marker thresholds and deviation from a non-anomalous 
baseline.  Segments are declared anomalous if deviation from non-anomalous is above
corresponding thresholds. (See small.num, small.thresh, medium.num, medium.thresh,
long.num, long.thresh, and small.na.thresh.) 
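As a rough illustration of how a size-dependent threshold might be looked up, the sketch below maps a segment's marker count to one of the three deviation thresholds. The exact size ranges are not spelled out in this text, so the boundaries used here (small.num up to medium.num, medium.num up to long.num, long.num and above) are an assumption, the helper name lookup_thresh is hypothetical, and small.na.thresh and the homozygous-deletion rule are left out.

```r
# Hypothetical helper, not part of GWASTools: choose the deviation threshold
# that applies to a segment with num.mark SNP markers.
lookup_thresh <- function(num.mark, small.num = 20, small.thresh = 2.25,
                          medium.num = 50, medium.thresh = 2,
                          long.num = 100, long.thresh = 1.5) {
  # Assumed size classes:
  # [small.num, medium.num), [medium.num, long.num), [long.num, Inf)
  cut.pts <- c(small.num, medium.num, long.num, Inf)
  thresh <- c(small.thresh, medium.thresh, long.thresh)
  idx <- findInterval(num.mark, cut.pts)  # 0 when below small.num
  # Segments shorter than small.num get NA (too short to be declared)
  ifelse(idx == 0, NA_real_, thresh[pmax(idx, 1)])
}

# A segment would then be flagged when its deviation exceeds the threshold
num.mark <- c(10, 30, 75, 400)
deviation <- c(5, 2.5, 1.8, 1.6)   # made-up deviation values
flagged <- !is.na(lookup_thresh(num.mark)) &
  deviation > lookup_thresh(num.mark)
```

Under these assumptions longer segments need a smaller deviation to be flagged, which matches the decreasing default values of small.thresh, medium.thresh and long.thresh.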
Non-anomalous median and MAD are defined for each sample-chromosome combination.
Intervals from known.anoms and the homozygous runs 
identified are excluded; remaining regions are the non-anomalous baseline. 
Deviation from non-anomalous is measured by  
a combination of a chromosome-wide 'mad.fac' and a 'local mad.fac' (both the average
and the minimum of these two measures are used). 
Here 'mad.fac' is (segment median-non-anomalous median)/(non-anomalous MAD) and
'local mad.fac' is the same definition except the non-anomalous median and MAD
are computed over a window including the segment (see length.factor).
Median and MAD are found for eligible LRR values.
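The two deviation measures can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the helper name deviation_measures is hypothetical, the window argument is a plain number of markers on either side of the segment rather than the length.factor rule, R's default scaled mad() stands in for the MAD, and the average and minimum are taken over absolute values.

```r
# Hypothetical helper, not part of GWASTools: chromosome-wide and local
# 'mad.fac' for one segment of one sample-chromosome combination.
deviation_measures <- function(lrr, seg, baseline, window) {
  # (segment median - baseline median) / (baseline MAD)
  fac <- function(base) {
    (median(lrr[seg], na.rm = TRUE) - median(lrr[base], na.rm = TRUE)) /
      mad(lrr[base], na.rm = TRUE)
  }
  # Assumed local window: baseline markers within `window` of the segment
  near <- baseline[baseline >= min(seg) - window &
                   baseline <= max(seg) + window]
  chrom.fac <- fac(baseline)
  local.fac <- fac(near)
  list(mad.fac = chrom.fac, local.mad.fac = local.fac,
       avg = mean(abs(c(chrom.fac, local.fac))),
       min = min(abs(c(chrom.fac, local.fac))))
}

# Toy LRR values: flat baseline around 0 with a 20-marker drop to -1
lrr <- c(rep(c(-0.1, 0.1), 50), rep(-1, 20), rep(c(-0.1, 0.1), 50))
seg <- 101:120
baseline <- setdiff(seq_along(lrr), seg)
res <- deviation_measures(lrr, seg, baseline, window = 40)
```

In this toy example the baseline median is 0 and the segment median is -1, so both measures are negative, indicating a drop in LRR relative to the baseline.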
A list with the following elements:
| raw | raw homozygous run data, not including any regions present in  
 | 
| raw.adjusted | data.frame of runs after merging and intersecting with CBS segments, with the following columns: Left and right refer to start and end of anomaly, respectively, in position order. 
 | 
| filtered |  data.frame of the segments identified as anomalies.  Columns are the
same as in  | 
| base.info | data.frame with columns: 
 | 
| segments | data.frame of the segmentation found by CBS with columns: 
 | 
| merge | data.frame of scan id/chromosome pairs for which merging occurred. 
 | 
Cecelia Laurie
See references in segment in the package DNAcopy.
segment and smooth.CNA in the package DNAcopy, 
also findBAFvariance, anomDetectLOH
library(GWASdata)
data(illuminaScanADF, illuminaSnpADF)
blfile <- system.file("extdata", "illumina_bl.gds", package="GWASdata")
bl <- GdsIntensityReader(blfile)
blData <-  IntensityData(bl, scanAnnot=illuminaScanADF, snpAnnot=illuminaSnpADF)
genofile <- system.file("extdata", "illumina_geno.gds", package="GWASdata")
geno <- GdsGenotypeReader(genofile)
genoData <-  GenotypeData(geno, scanAnnot=illuminaScanADF, snpAnnot=illuminaSnpADF)
scan.ids <- illuminaScanADF$scanID[1:2]
chrom.ids <- unique(illuminaSnpADF$chromosome)
snp.ids <- illuminaSnpADF$snpID[illuminaSnpADF$missing.n1 < 1]
# example for known.anoms, would get this from anomDetectBAF
known.anoms <- data.frame("scanID"=scan.ids[1],"chromosome"=21,
  "left.index"=100,"right.index"=200)
LOH.anom <- anomDetectLOH(blData, genoData, scan.ids=scan.ids,
  chrom.ids=chrom.ids, snp.ids=snp.ids, known.anoms=known.anoms)
close(blData)
close(genoData)