This vignette introduces you to some core functionality in the oncosnputils R package.
The example data used in this package comes from an Affymetrix SNP 6.0 CEL file of the breast cancer cell line HCC1395. The raw CEL file was pre-processed through the PennCNV-Affy protocol and then run through OncoSNP (1.3) using only the SNP probes.
The oncosnputils R package provides several load functions for the different input and output files of OncoSNP. These functions rely on the data.table fread() function for fast reading and then rename several columns so that they are easier to work with in R. For instance, we can load the quality control file:
library("oncosnputils")
qcFile <- system.file("extdata", "HCC1395.qc", package = "oncosnputils")
qcDf <- LoadOncosnpQcFile(qcFile)
knitr::kable(qcDf)
Notice that some of the columns have been renamed. This facilitates downstream analyses, as the default output column names are difficult to work with in R. We also load the OncoSNP CNV file and the PennCNV probe file.
# Loading the OncoSNP CNV file
cnvFile <- system.file("extdata", "HCC1395.cnvs", package = "oncosnputils")
cnvDf <- LoadOncosnpCnvFile(cnvFile)
# Show only first 10 rows and 10 columns for vignette purposes
knitr::kable(cnvDf[1:10, 1:10])
# Loading the PennCNV probe file
probeFile <- system.file("extdata", "logR_BAF.snp_probes.txt", package = "oncosnputils")
probeDf <- LoadPennCNVProbeData(probeFile)
# Show only first 10 rows for vignette purposes
knitr::kable(probeDf[1:10, ])
All these loading functions will return data.table objects. These are enhanced data.frames which are suitable for large data analysis.
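The read-then-rename pattern described above can be sketched with data.table alone. This is a minimal illustration, not the package's actual implementation: the column names "Log-Likelihood" and "Outlier Rate" and their replacements are made up for the example, and the input is an inline string rather than a real OncoSNP file.

```r
library(data.table)

# Toy tab-separated input with awkward column names (hypothetical,
# not the real OncoSNP .qc header)
rawText <- "Log-Likelihood\tOutlier Rate\n-1234.5\t0.01\n"

# fread() returns a data.table; sep is given explicitly for clarity
qcToy <- fread(text = rawText, sep = "\t")

# Rename in place to names that need no backticks in R
setnames(qcToy,
         old = c("Log-Likelihood", "Outlier Rate"),
         new = c("logLik", "outlierRate"))

# The result is both a data.table and a data.frame
is.data.table(qcToy)
is.data.frame(qcToy)
names(qcToy)
```

Because a data.table is also a data.frame, functions such as head() and knitr::kable() work on the loaded objects without conversion.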
The oncosnputils R package provides several post-processing functions that enhance the OncoSNP inputs and outputs. For instance, the standard output (.cnvs) from OncoSNP does not contain the LRR or BAF values of the segments.
head(cnvDf)
We can add this information to these files by using the AddLRRBAF2OncosnpCNV function:
# only add for the first 10 segments for vignette purposes
cnvDf.LRR.BAF <- AddLRRBAF2OncosnpCNV(cnvDf[1:10, ], qcDf, probeDf)
Now the LRR and BAF values, along with the number of probes in each segment, have been added as additional columns to the cnvDf data.frame. The qcDf and probeDf need to be passed in as input because the function needs to determine the LRR shift and the probes overlapping each CNV segment.
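To make the idea of "overlapping probes" concrete, here is a rough sketch of how per-segment summaries could be computed with data.table's foverlaps(). It is not the code inside AddLRRBAF2OncosnpCNV: the toy tables, the column names (chr, start, end, pos, lrr, baf), and the choice of the mean as the summary statistic are all assumptions for illustration. The LRR shift from the QC file is left out because how it is applied is not described here.

```r
library(data.table)

# Toy segments (hypothetical column names)
segsToy <- data.table(
  chr   = "1",
  start = c(100L, 500L),
  end   = c(300L, 900L)
)

# Toy probes; each probe is treated as a 1-bp interval [pos, pos]
probesToy <- data.table(
  chr = "1",
  pos = c(150L, 250L, 400L, 600L, 700L, 800L),
  lrr = c(0.2, 0.4, 0.0, -0.5, -0.3, -0.1),
  baf = c(0.5, 0.5, 0.5, 0.1, 0.9, 0.1)
)
probesToy[, `:=`(start = pos, end = pos)]

# foverlaps() requires the second table to be keyed on the interval columns
setkey(segsToy, chr, start, end)

# Keep only probes that fall within a segment (nomatch = 0L drops the rest)
hits <- foverlaps(probesToy, segsToy,
                  by.x = c("chr", "start", "end"),
                  type = "within", nomatch = 0L)

# Summarise per segment: mean LRR, mean BAF, and number of probes
segSummary <- hits[, .(meanLrr = mean(lrr),
                       meanBaf = mean(baf),
                       nProbes = .N),
                   by = .(chr, start, end)]
segSummary
```

In this toy data the probe at position 400 lies between the two segments and therefore contributes to neither summary.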
Similarly, the AddOncosnp2PennCNVProbe function can add the OncoSNP segment state information to the raw PennCNV probe data that was input into OncoSNP.
# only add for the first 5000 probes for vignette purposes
probeDf.oncosnp <- AddOncosnp2PennCNVProbe(cnvDf, qcDf, probeDf[1:5000, ])
head(probeDf.oncosnp)
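Annotating probes with the state of the segment that contains them is the reverse lookup of a per-segment summary. One way this could be done is a data.table non-equi update join, sketched below. This is only an illustration under assumed column names (chr, pos, start, end, state) and toy values; it is not taken from AddOncosnp2PennCNVProbe.

```r
library(data.table)

# Toy segments with a copy-number state (hypothetical columns and values)
segStates <- data.table(
  chr   = "1",
  start = c(100L, 500L),
  end   = c(300L, 900L),
  state = c(3L, 1L)
)

# Toy probes
probeToy <- data.table(
  chr = "1",
  pos = c(150L, 400L, 600L),
  lrr = c(0.2, 0.0, -0.5)
)

# Non-equi update join: for every probe whose position lies inside a
# segment, copy that segment's state onto the probe row (by reference)
probeToy[segStates,
         on = .(chr, pos >= start, pos <= end),
         state := i.state]

# Probes outside every segment keep state = NA
probeToy
```

Updating by reference avoids copying the probe table, which matters when the probe file has hundreds of thousands of rows.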