This package contains functions to interact with tally data from NGS experiments that is stored in HDF5 files. For details, see the vignettes shipped with this package.
Package: h5vc
Type: Package
Version: 1.0.4
Date: 2013-10-11
License: GPL (>= 3)
This package is designed to facilitate the analysis of genomics data through tallies stored in an HDF5 file. Within an HDF5 file the tally is simply a table of bases by genomic positions that lists, for each position, the count of each base observed as a mismatch in the sample at that position. Strand and sample are additional dimensions of this array, which leads to a 4D array called 'Counts'. The total coverage is stored in a separate array of 3 dimensions (Sample x Strand x Genomic Position) called 'Coverages'; there is also a 3-dimensional 'Deletions' array and a 1D vector encoding the reference base ('Reference'). These 4 arrays are stored as datasets within an HDF5 tally file, in which the group structure encodes the organisational levels of 'Study' and 'Chromosome'. For details on the layout of HDF5 files visit http://www.hdfgroup.org; a short description is given in the vignettes.
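The layout described above can be illustrated with a small in-memory model. This is only a sketch in plain Python, not the h5vc API and not how the package reads or writes HDF5 files. The helper names (`zeros`, `make_tally`, `add_base`), the dimension order of 'Counts' (Base x Sample x Strand x Position), the shape of 'Deletions' (taken to match 'Coverages'), and the use of letters for the 'Reference' vector are all assumptions made for illustration.

```python
# Toy model of the four tally datasets: Counts, Coverages, Deletions, Reference.
# Assumption: Counts is ordered Base x Sample x Strand x Position.
# Coverages follows the Sample x Strand x Position order stated in the text.

BASES = ["A", "C", "G", "T"]
N_STRANDS = 2  # forward and reverse


def zeros(*dims):
    """Return a nested list of zeros with the given dimensions."""
    if len(dims) == 1:
        return [0] * dims[0]
    return [zeros(*dims[1:]) for _ in range(dims[0])]


def make_tally(n_samples, n_positions, reference):
    """Build an empty tally for one (hypothetical) Study/Chromosome group."""
    return {
        "Counts": zeros(len(BASES), n_samples, N_STRANDS, n_positions),
        "Coverages": zeros(n_samples, N_STRANDS, n_positions),
        "Deletions": zeros(n_samples, N_STRANDS, n_positions),
        "Reference": list(reference),  # letters here; an assumption
    }


def add_base(tally, sample, strand, position, base):
    """Record one observed base: coverage always, Counts only for mismatches."""
    tally["Coverages"][sample][strand][position] += 1
    if base != tally["Reference"][position]:
        tally["Counts"][BASES.index(base)][sample][strand][position] += 1


tally = make_tally(n_samples=2, n_positions=3, reference="ACG")
add_base(tally, sample=0, strand=0, position=1, base="C")  # matches reference
add_base(tally, sample=0, strand=0, position=1, base="T")  # mismatch
add_base(tally, sample=1, strand=1, position=1, base="T")  # mismatch
```

In this model a base that matches the reference raises only the coverage, while a mismatch also raises the corresponding cell of 'Counts', which mirrors the description of 'Counts' as holding mismatch observations.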
Creating those HDF5 tally files can be accomplished from within R or through a Python script that will generate a tally file from a set of .bam files. The workflow is described in the vignettes h5vc.creating.tallies and h5vc.creating.tallies.within.R.
Paul Pyl
Maintainer: Paul Pyl <pyl@embl.de>