BAF.transform: Transform BAF into mBAF

Description

This function transforms BAF values into mirrored BAF (mBAF) values. SNPs that are non-informative for CNV inference are removed, and the resulting missing values are initialized with the average of the nearest SNPs.
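As a minimal sketch of the mirroring step, the snippet below folds BAF values around 0.5 using the definition mBAF = |BAF - 0.5| + 0.5 from Staaf et al. (2008). The helper name mirror_baf is hypothetical and is not part of the package; BAF.transform itself additionally removes and re-initializes non-informative SNPs.

```r
# Hypothetical helper for illustration only; not a Piet function.
# Mirrors BAF values around 0.5 so that all values fall in [0.5, 1].
mirror_baf <- function(baf) {
  abs(baf - 0.5) + 0.5
}

baf <- c(0.02, 0.33, 0.5, 0.67, 0.98)
mbaf <- mirror_baf(baf)
print(mbaf)
```

With this mirroring, the two heterozygous bands of an imbalanced region (here 0.33 and 0.67) collapse onto a single value, which is what makes mBAF convenient for segmentation.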

Usage

BAF.transform(x, gt = NULL, mBAF.thd = 0.97, win.thd = 0.8, 
              w = 1, k = 2, median.adjust = FALSE)

Arguments

x

A vector of BAF values to be transformed.

gt

For tumor data with a matched normal tissue sample, gt is the vector of SNP genotypes in the matched normal sample. If no such information is available, leave it as NULL (the default).

mBAF.thd

A threshold used to remove non-informative SNPs when no information from matched normal tissue is supplied. See the reference for more details.

win.thd

A further threshold used to remove possibly non-informative SNPs that pass the mBAF.thd criterion. See the reference for more details.

w

The window size used in computation of a quantity to be compared with win.thd. The default is 1. See reference for more details.

k

The number of nearest SNPs used to compute the initialized values of removed non-informative SNPs.

median.adjust

Logical. If TRUE, BAF values are first adjusted so that the median of the values between 0.25 and 0.75 equals 0.5, before any transformation is applied.
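The following is a rough sketch of how median.adjust and mBAF.thd could act on a BAF vector when gt is NULL. It assumes that the adjustment is a simple shift of all values, and that SNPs whose mBAF exceeds mBAF.thd are treated as homozygous and therefore non-informative. Both helpers (adjust_median, flag_noninformative) are hypothetical names, and the package internals may differ. The win.thd and w criteria are not covered here.

```r
# Hypothetical helpers for illustration only; Piet's internals may differ.

# Shift BAF so that the median of the values strictly between
# 'lower' and 'upper' becomes 0.5, then clamp to [0, 1].
adjust_median <- function(baf, lower = 0.25, upper = 0.75) {
  mid <- baf > lower & baf < upper
  if (!any(mid, na.rm = TRUE)) return(baf)
  shifted <- baf - (median(baf[mid], na.rm = TRUE) - 0.5)
  pmin(pmax(shifted, 0), 1)
}

# Return indices of SNPs whose mirrored BAF exceeds the threshold.
flag_noninformative <- function(baf, mBAF.thd = 0.97) {
  mbaf <- abs(baf - 0.5) + 0.5
  which(mbaf > mBAF.thd)
}

baf <- c(0.01, 0.46, 1.00, 0.52, 0.58, 0.90)
adjusted <- adjust_median(baf)
flagged <- flag_noninformative(adjusted)
print(adjusted)
print(flagged)
```

In this toy vector the first and third SNPs sit near 0 and 1, so they are flagged, while the remaining SNPs stay available for segmentation.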

Details

See Staaf J., et al. (2008) for more details about the transformation. The missing values for removed non-informative SNPs are initialized with the average of the k nearest SNPs plus normal random noise, in order to eliminate the dependence between adjacent SNPs.
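The initialization described above can be sketched as follows. This is an illustrative approximation, not the package code: the helper init_removed and its noise.sd argument are hypothetical, "nearest" is assumed to mean nearest by SNP index among the informative SNPs, and the noise level actually used by BAF.transform is not stated in this page.

```r
# Hypothetical helper for illustration only; not the Piet implementation.
# Fills removed positions with the mean of the k nearest informative
# SNPs (by index) plus Gaussian noise with standard deviation noise.sd.
init_removed <- function(mbaf, idx.na, k = 2, noise.sd = 0.01) {
  idx <- setdiff(seq_along(mbaf), idx.na)  # informative SNP indices
  out <- mbaf
  for (i in idx.na) {
    n_use <- min(k, length(idx))
    nearest <- idx[order(abs(idx - i))[seq_len(n_use)]]
    out[i] <- mean(mbaf[nearest]) + rnorm(1, mean = 0, sd = noise.sd)
  }
  out
}

set.seed(1)
mbaf <- c(0.55, NA, 0.60, 0.52, NA, NA, 0.58)
filled <- init_removed(mbaf, idx.na = c(2, 5, 6), k = 2)
print(filled)
```

Setting noise.sd = 0 in this sketch gives the plain k-nearest average, which is useful for checking the neighbor selection on its own.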

Value

A list with the following components:

mBAF

A vector of mirrored BAF values. Missing values of removed non-informative SNPs are initialized for downstream analysis.

idx

A vector of indices of those informative SNPs with values remaining after transformation.

idx.na

A vector of indices of those non-informative SNPs with original values removed.

Author(s)

Zhongyang (Thomas) Zhang, zhangzy@ucla.edu

References

Staaf J., et al. (2008) Segmentation-based detection of allelic imbalance and loss-of-heterozygosity in cancer cells using whole genome SNP arrays. Genome Biology, 9: R136.

Examples

## simulate a sequence of BAF values for 100 SNPs
xf <- sample(x=c(0,0.5,1),size=100,replace=TRUE,prob=c(0.25,0.5,0.25)) + rnorm(100,0,0.02)
xf[xf<0] <- 0
xf[xf>1] <- 1
## insert the signal pattern of a duplication in the middle of xf
xm <- sample(x=c(0,1),size=20,replace=TRUE,prob=c(0.5,0.5)) + rnorm(20,0,0.02)
xm[xm<0] <- 0
xm[xm>1] <- 1
xf[41:60] <- 2/3*xf[41:60] + 1/3*xm
BAF <- xf
plot(BAF,xlab="SNP",ylab="BAF")

## transform BAF to mBAF
res <- BAF.transform(x=BAF, gt = NULL, mBAF.thd = 0.97, win.thd = 0.8, 
              w = 1, k = 2, median.adjust = FALSE)
plot(res$mBAF,type="n",xlab="SNP",ylab="mBAF")
points(res$idx,res$mBAF[res$idx])
points(res$idx.na,res$mBAF[res$idx.na],col="gray")

Piet documentation built on May 2, 2019, 5:19 p.m.