When a large number of samples is being analyzed, it is desirable to have
random access to the methylation of specific CpGs without loading all the data.
SeSAMe provides such an interface through the
fileSet object, which is
in essence an indexed, file-based numeric matrix.
One way to generate a
fileSet is through the openSesameToFile
function. Its main effect is not an in-memory matrix of beta values but
a file written at the given path. One can then operate on the
fileSet by referencing the path to the file.
library(sesame)
options(rmarkdown.html_vignette.check_title = FALSE)
The openSesameToFile call does three things:
- generates a data file at the given path
- generates an accompanying index file
- returns a
fileSet object, which serves as an interface to the two files.
fset <- openSesameToFile('mybetas', system.file('extdata',package='sesameData'))
When printed to console, the number of samples and the number of probes are shown.
One can obtain the sample and probe information from the samples and probes elements of the fileSet:
head(fset$samples) # sample IDs
head(fset$probes) # probe IDs
One can query specific CpGs by probe name(s) and sample name(s). Note that every query to fset is a disk read, so it can be slower than in-memory processing. Here we retrieve the beta values for only the two probes cg00006414 and cg00007981 in the sample 4207113116_B.
sliceFileSet(fset, '4207113116_B', c('cg00006414','cg00007981'))
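To make the "every query is a disk read" point concrete, here is a minimal base-R sketch of how an indexed, file-based matrix can serve a single value by seeking to a byte offset. The layout (column-major, 4-byte floats), the toy probe and sample names, and the helper `read_beta` are all assumptions made for illustration; this is not SeSAMe's actual file format or code.

```r
# Hypothetical layout (not SeSAMe's actual format): a column-major matrix
# of 4-byte floats, one column per sample.
probes  <- c("cg01", "cg02", "cg03", "cg04")
samples <- c("s1", "s2", "s3")
betas   <- matrix((1:12) / 20, nrow = length(probes),
                  dimnames = list(probes, samples))

# Write the whole matrix once.
path <- tempfile()
con <- file(path, "wb")
writeBin(as.vector(betas), con, size = 4)
close(con)

# Read a single value by computing its byte offset from the two indices.
read_beta <- function(path, probes, samples, probe, sample, size = 4) {
  i <- match(probe, probes)     # row index from the probe "index"
  j <- match(sample, samples)   # column index from the sample "index"
  stopifnot(!is.na(i), !is.na(j))
  con <- file(path, "rb")
  on.exit(close(con))
  seek(con, where = ((j - 1) * length(probes) + (i - 1)) * size)
  readBin(con, what = "numeric", n = 1, size = size)
}

read_beta(path, probes, samples, "cg03", "s2")  # close to betas["cg03", "s2"]
```

Only the requested bytes are read, which keeps memory use small, but each call opens the file and seeks, which is why many small queries can be slower than working on an in-memory matrix.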
Earlier, with openSesameToFile, we preprocessed IDATs directly into a
fileSet. We can
also read a pre-existing
fileSet from its file path using readFileSet:
fset <- readFileSet('mybetas')
sliceFileSet(fset, '4207113116_A', 'cg00000292')
The size of a fileSet is always fixed; one cannot dynamically expand or shrink a
fileSet. We can write a fileSet by filling the space one sample at a time.
This is achieved by first allocating the space given the sample names
and the probe IDs (optional if the platform is one of HM27, HM450 or EPIC).
fset2 <- initFileSet('mybetas2', 'HM450', c('sample1', 'sample2'))
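The allocate-then-fill pattern can be pictured with a small base-R sketch: the whole file is written once at its final size, and each sample's column is later overwritten in place. The layout (4-byte floats, one contiguous column per sample) and the helper `fill_sample` are hypothetical and only illustrate the idea; they are not how SeSAMe is known to implement it.

```r
# Hypothetical layout (not SeSAMe's actual format): 4-byte floats, one
# contiguous column per sample.
n_probes <- 4
samples  <- c("sample1", "sample2")
size     <- 4

# Allocate the full, fixed-size file up front, filled with missing values.
path <- tempfile()
con <- file(path, "wb")
writeBin(rep(NA_real_, n_probes * length(samples)), con, size = size)
close(con)

# Overwrite one sample's column in place; the file size does not change.
fill_sample <- function(path, samples, sample, values, size = 4) {
  j <- match(sample, samples)
  stopifnot(!is.na(j))
  con <- file(path, "r+b")   # read/write without truncating the file
  on.exit(close(con))
  seek(con, where = (j - 1) * length(values) * size, rw = "write")
  writeBin(values, con, size = size)
  invisible(NULL)
}

fill_sample(path, samples, "sample2", c(0.1, 0.2, 0.3, 0.4))

con <- file(path, "rb")
after <- readBin(con, what = "numeric", n = n_probes * length(samples),
                 size = size)
close(con)
after  # sample1 slots are still missing; sample2 slots now hold the values
```

Because every sample occupies a slot whose position is known in advance, samples can be filled in any order, at the cost of not being able to grow the file afterwards.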
Once the space is allocated, one can fill in the beta values with
mapFileSet. Here we
illustrate using randomly generated beta values.
hypothetical_betas <- setNames(runif(fset2$n), fset2$probes)
mapFileSet(fset2, 'sample2', hypothetical_betas)
The mapped value should be equal to the generated beta value. Let's spot-check.
abs(sliceFileSet(fset2,'sample2','cg00000108') - hypothetical_betas['cg00000108']) < 1e-7
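The spot-check compares with a tolerance rather than testing exact equality. One plausible reason, stated here as an assumption rather than a documented fact about SeSAMe, is that values stored on disk at reduced precision do not round-trip exactly. The base-R sketch below shows the effect for 4-byte floats.

```r
# Assumption for illustration: values are stored as 4-byte floats.
x <- c(0.1, 0.25, 0.7)

path <- tempfile()
con <- file(path, "wb")
writeBin(x, con, size = 4)   # doubles are narrowed to single precision
close(con)

con <- file(path, "rb")
back <- readBin(con, what = "numeric", n = length(x), size = 4)
close(con)

back == x              # FALSE TRUE FALSE: only 0.25 is exactly representable
abs(back - x) < 1e-7   # TRUE TRUE TRUE: all agree within the tolerance
```

For beta values in [0, 1], the rounding error of a single-precision float stays below 1e-7, so a tolerance of that size is a reasonable check under this assumption.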