gdsfmt: R Interface to CoreArray Genomic Data Structure (GDS) files

Version: 1.1.3

Features

This package provides a high-level R interface to CoreArray Genomic Data Structure (GDS) data files. GDS files are portable across platforms and use a hierarchical structure to store multiple scalable, array-oriented data sets together with their metadata. The format is suited to large-scale datasets, especially data much larger than the available random-access memory. The gdsfmt package offers efficient operations specifically designed for integers of fewer than 8 bits, since a single genetic/genomic variant, such as a single-nucleotide polymorphism, usually occupies less than a byte. Data compression and decompression are supported with relatively efficient random access. A GDS file can also be read in parallel by multiple R processes using the parallel package.
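As a rough illustration of the features above, the following sketch writes a small 2-bit integer matrix into a compressed node and reads it back. The file path, node names, attribute text, and toy data are assumptions made for this example; the calls used (createfn.gds, add.gdsn, put.attr.gdsn, openfn.gds, index.gdsn, read.gdsn, closefn.gds) come from gdsfmt, though argument details may differ between versions.

```r
library(gdsfmt)

# Illustrative file name and toy data (not taken from the package).
gds_path <- tempfile(fileext = ".gds")
set.seed(1)
geno <- matrix(sample(0:3, 5 * 4, replace = TRUE), nrow = 5)  # values fit in 2 bits

# Create a new GDS file and add two nodes to its hierarchy.
f <- createfn.gds(gds_path)
add.gdsn(f, "sample.id", val = paste0("S", 1:5))
node <- add.gdsn(f, "genotype", val = geno, storage = "bit2",
                 compress = "ZIP", closezip = TRUE)

# Attach a piece of metadata to the genotype node.
put.attr.gdsn(node, "description", "toy 2-bit genotype matrix")
closefn.gds(f)

# Reopen the file read-only and read the matrix back.
f <- openfn.gds(gds_path)
geno_back <- read.gdsn(index.gdsn(f, "genotype"))
closefn.gds(f)
```

Storing the values as "bit2" packs four entries into each byte before compression is applied, which is the case the package is designed around.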

Important

Version 1.1.3 should be installed immediately if you see an error like

Invalid Zip Deflate Stream operation 'Seek'!
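
One possible way to check from within R whether the installed copy predates 1.1.3 is sketched below. The helper name needs_upgrade is hypothetical and not part of gdsfmt; it relies only on packageVersion() from the utils package.

```r
# Hypothetical helper (not part of gdsfmt): returns TRUE if `pkg` is not
# installed or its installed version is older than `min_version`.
needs_upgrade <- function(pkg, min_version) {
  installed <- tryCatch(utils::packageVersion(pkg), error = function(e) NULL)
  is.null(installed) || installed < package_version(min_version)
}

if (needs_upgrade("gdsfmt", "1.1.3")) {
  message("gdsfmt is missing or older than 1.1.3; please reinstall it.")
}
```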

Changes in v1.1.1 - 1.1.3:

Changes in v1.1.0:

* fully support big-endian systems

License

LGPL-3

Citation

Xiuwen Zheng, David Levine, Jess Shen, Stephanie M. Gogarten, Cathy Laurie, Bruce S. Weir. A High-performance Computing Toolset for Relatedness and Principal Component Analysis of SNP Data. Bioinformatics 2012; doi:10.1093/bioinformatics/bts606

Package Maintainer

Xiuwen Zheng (zhengxwen@gmail.com / zhengx@u.washington.edu)

URL

http://github.com/zhengxwen/gdsfmt

Unit Testing

Comprehensive unit testing: http://github.com/zhengxwen/unittest.gdsfmt

Examples

  1. Limited random-access reading on compressed data
  2. Transpose a matrix
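
The linked examples are not reproduced here. As a minimal sketch of the first item, assuming a toy 100 x 10 integer matrix and an illustrative node name, a sub-block of a compressed node can be read with the start and count arguments of read.gdsn without loading the whole array:

```r
library(gdsfmt)

# Illustrative file and data (assumptions for this sketch).
gds_path <- tempfile(fileext = ".gds")
mat <- matrix(1:1000, nrow = 100)  # 100 rows, 10 columns

# Write the matrix into a ZIP-compressed node and finish the compression
# stream so that the node can be read afterwards.
f <- createfn.gds(gds_path)
add.gdsn(f, "mat", val = mat, compress = "ZIP", closezip = TRUE)
closefn.gds(f)

# Read only rows 11..15 of columns 3..4 from the compressed node.
f <- openfn.gds(gds_path)
block <- read.gdsn(index.gdsn(f, "mat"), start = c(11, 3), count = c(5, 2))
closefn.gds(f)
```

Here start gives the 1-based index of the first element in each dimension and count gives the number of elements to read along each dimension.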

Installation

Development version from GitHub, in R:

library("devtools")
install_github("zhengxwen/gdsfmt")

The install_github() approach builds the package from source, so make and compilers must be installed on your system (see the R FAQ for your operating system); you may also need to install dependencies manually.

From R-Forge, in R:

install.packages("gdsfmt", repos="http://R-Forge.R-project.org")

From the source code, download the tarball in a shell:

wget --no-check-certificate https://github.com/zhengxwen/gdsfmt/tarball/master -O gdsfmt_latest.tar.gz

Or:

curl -L https://github.com/zhengxwen/gdsfmt/tarball/master/ -o gdsfmt_latest.tar.gz

Then install it:

R CMD INSTALL gdsfmt_latest.tar.gz

Copyright notice



gdsfmt documentation built on May 2, 2019, 4:41 p.m.