anomDetectLOH breaks a chromosome up into segments of homozygous runs of SNP markers determined by change points in Log R Ratio and selects segments which are likely to be anomalous.
anomDetectLOH(intenData, genoData, scan.ids, chrom.ids, snp.ids,
  known.anoms, smooth = 50, min.width = 5, nperm = 10000, alpha = 0.001,
  run.size = 50, inter.size = 4, homodel.min.num = 10, homodel.thresh = 10,
  small.num = 20, small.thresh = 2.25, medium.num = 50, medium.thresh = 2,
  long.num = 100, long.thresh = 1.5, small.na.thresh = 2.5,
  length.factor = 5, merge.fac = 0.85, min.lrr.num = 20, verbose = TRUE)
intenData |
An |
genoData |
A |
scan.ids |
vector of scan ids (sample numbers) to process |
chrom.ids |
vector of (unique) chromosomes to process. Should correspond to
integer chromosome codes in |
snp.ids |
vector of eligible SNP ids. Failed and intensity-only SNPs are usually excluded.
Excluding the HLA region on chromosome 6 and the XTR region on the
XTR region on X chromosome. See |
known.anoms |
data.frame of known anomalies (usually from |
smooth |
number of markers for smoothing region. See |
min.width |
minimum number of markers for segmenting. See |
nperm |
number of permutations.
See |
alpha |
significance level. See |
run.size |
number of markers to declare a 'homozygous' run (here 'homozygous' includes homozygous and missing) |
inter.size |
number of consecutive heterozygous markers allowed to interrupt a 'homozygous' run |
homodel.min.num |
minimum number of markers to detect extreme difference in LRR (for homozygous deletion) |
homodel.thresh |
threshold for measure of deviation from non-anomalous needed to declare segment a homozygous deletion. |
small.num |
minimum number of SNP markers to declare segment as an anomaly (other than homozygous deletion) |
small.thresh |
threshold for measure of deviation from non-anomalous to declare segment anomalous if
number of SNP markers is between |
medium.num |
threshold for number of SNP markers to identify 'medium' size segment |
medium.thresh |
threshold for measure of deviation from non-anomalous needed to declare segment anomalous if
number of SNP markers is between |
long.num |
threshold for number of SNP markers to identify 'long' size segment |
long.thresh |
threshold for measure of deviation from non-anomalous when number of markers is bigger than |
small.na.thresh |
threshold for measure of deviation from non-anomalous when number of markers is between |
length.factor |
window around anomaly defined as |
merge.fac |
threshold for 'sd.fac', the number of baseline standard deviations separating the segment mean from the baseline mean; consecutive segments with 'sd.fac' above the threshold are merged |
min.lrr.num |
if any 'non-anomalous' interval has fewer markers than |
verbose |
logical indicator whether to print the scan id currently being processed |
Detection of anomalies with loss of heterozygosity accompanied by change in Log R Ratio. Male samples for X chromosome are not processed.
Circular binary segmentation (CBS) (using the R-package DNAcopy) is applied to LRR values and, in parallel, runs of homozygous or missing genotypes of a certain minimal size (run.size), allowing for interruptions by no more than inter.size heterozygous SNPs, are identified. Intervals from known.anoms are excluded from the identification of runs.
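The run identification described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the function name find_homozygous_runs, the genotype coding (0/1/2 copies of an allele, with 1 heterozygous and NA missing), and the rule that only interior heterozygous stretches are bridged are all assumptions made for illustration.

```r
# Hypothetical helper (not part of GWASTools): find 'homozygous' runs,
# where 'homozygous' includes homozygous and missing genotypes.
# Assumed coding: 0/2 = homozygous, 1 = heterozygous, NA = missing.
find_homozygous_runs <- function(geno, run.size = 50, inter.size = 4) {
  is.het <- !is.na(geno) & geno == 1
  r <- rle(is.het)
  n <- length(r$values)
  # Heterozygous stretches of at most inter.size markers are treated as
  # interruptions that do not break a run (assumption: interior stretches only).
  bridge <- r$values & r$lengths <= inter.size
  if (n > 0) bridge[c(1, n)] <- FALSE
  r$values[bridge] <- FALSE
  # Re-encode after bridging so adjacent homozygous stretches are joined.
  r2 <- rle(inverse.rle(r))
  end <- cumsum(r2$lengths)
  start <- end - r2$lengths + 1
  keep <- !r2$values & r2$lengths >= run.size
  data.frame(left.index = start[keep],
             right.index = end[keep],
             num.mark = r2$lengths[keep])
}

# Toy data: a single heterozygous marker at index 6 is bridged
# (inter.size = 1), while the three heterozygous markers at 11-13 end the run.
geno <- c(rep(0, 5), 1, rep(2, 4), 1, 1, 1, rep(0, 3))
find_homozygous_runs(geno, run.size = 6, inter.size = 1)
```

On the toy vector this returns one run spanning indices 1 to 10; the final three homozygous markers are too short to qualify under run.size = 6.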
After some possible merging of consecutive CBS segments (based on satisfying a threshold merge.fac for deviation from non-anomalous baseline), the homozygous runs are intersected with the segments from CBS.
Determination of anomalous segments is based on a combination of number-of-marker thresholds and deviation from a non-anomalous baseline. Segments are declared anomalous if deviation from non-anomalous is above corresponding thresholds. (See small.num, small.thresh, medium.num, medium.thresh, long.num, long.thresh, and small.na.thresh.)
Non-anomalous median and MAD are defined for each sample-chromosome combination. Intervals from known.anoms and the homozygous runs identified are excluded; remaining regions are the non-anomalous baseline.
Deviation from non-anomalous is measured by a combination of a chromosome-wide 'mad.fac' and a 'local mad.fac' (both the average and the minimum of these two measures are used). Here 'mad.fac' is (segment median - non-anomalous median)/(non-anomalous MAD) and 'local mad.fac' is the same definition except the non-anomalous median and MAD are computed over a window including the segment (see length.factor).
Median and MAD are found for eligible LRR values.
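The 'mad.fac' definition above can be written out as a short base-R sketch. The function names mad_fac and deviation_measures are hypothetical, the use of stats::mad with its default scaling constant is an assumption (the scaling used by the package is not stated here), and combining the two measures on their absolute values is likewise an assumption. The caller supplies the local baseline, since the exact window definition is not reproduced in this sketch.

```r
# Hypothetical helper: (segment median - baseline median) / baseline MAD.
# Assumption: MAD is computed with stats::mad and its default constant.
mad_fac <- function(seg.lrr, base.lrr) {
  (median(seg.lrr, na.rm = TRUE) - median(base.lrr, na.rm = TRUE)) /
    mad(base.lrr, na.rm = TRUE)
}

# Hypothetical helper: combine the chromosome-wide and local measures.
# Assumption: average and minimum are taken over absolute values.
deviation_measures <- function(seg.lrr, chrom.base.lrr, local.base.lrr) {
  chrom.fac <- mad_fac(seg.lrr, chrom.base.lrr)
  local.fac <- mad_fac(seg.lrr, local.base.lrr)
  c(mad.fac = chrom.fac,
    local.mad.fac = local.fac,
    avg = mean(abs(c(chrom.fac, local.fac))),
    min = min(abs(c(chrom.fac, local.fac))))
}

# Toy LRR values: a segment shifted downward relative to both baselines.
seg <- c(-0.6, -0.5, -0.4)
chrom.base <- c(-0.2, -0.1, 0, 0.1, 0.2)
local.base <- c(-0.4, -0.2, 0, 0.2, 0.4)
deviation_measures(seg, chrom.base, local.base)
```

A noisier local baseline gives a smaller local measure, which is why taking the minimum of the two is the more conservative choice when comparing against a threshold.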
A list with the following elements:
raw |
raw homozygous run data, not including any regions present in
|
raw.adjusted |
data.frame of runs after merging and intersecting with CBS segments, with the following columns: Left and right refer to start and end of anomaly, respectively, in position order.
|
filtered |
data.frame of the segments identified as anomalies. Columns are the
same as in |
base.info |
data.frame with columns:
|
segments |
data.frame of the segmentation found by CBS with columns:
|
merge |
data.frame of scan id/chromosome pairs for which merging occurred.
|
Cecelia Laurie
See references in segment in the package DNAcopy.
segment and smooth.CNA in the package DNAcopy, also findBAFvariance, anomDetectLOH
library(GWASTools)  # provides anomDetectLOH and the data classes used below
library(GWASdata)   # example Illumina data
data(illuminaScanADF, illuminaSnpADF)

# intensity data containing Log R Ratio
blfile <- system.file("extdata", "illumina_bl.gds", package="GWASdata")
bl <- GdsIntensityReader(blfile)
blData <- IntensityData(bl, scanAnnot=illuminaScanADF, snpAnnot=illuminaSnpADF)

# genotype data
genofile <- system.file("extdata", "illumina_geno.gds", package="GWASdata")
geno <- GdsGenotypeReader(genofile)
genoData <- GenotypeData(geno, scanAnnot=illuminaScanADF, snpAnnot=illuminaSnpADF)

# scans, chromosomes and eligible SNPs to process
scan.ids <- illuminaScanADF$scanID[1:2]
chrom.ids <- unique(illuminaSnpADF$chromosome)
snp.ids <- illuminaSnpADF$snpID[illuminaSnpADF$missing.n1 < 1]

# example for known.anoms, would get this from anomDetectBAF
known.anoms <- data.frame("scanID"=scan.ids[1], "chromosome"=21,
  "left.index"=100, "right.index"=200)

LOH.anom <- anomDetectLOH(blData, genoData, scan.ids=scan.ids,
  chrom.ids=chrom.ids, snp.ids=snp.ids, known.anoms=known.anoms)

# release the GDS file handles
close(blData)
close(genoData)