write.simple: Fast and flexible writing of snpStats objects to flat files



Description

Different genetics phasing and analysis programs (beagle, mach, impute, snptest, phase/fastPhase, snphap, etc.) have different requirements for input files. These functions aim to make creating these files from a SnpMatrix object straightforward.

Usage

  write.simple(X, a1, a2, file, fsep = "\t", gsep = "",
    nullallele = "N", write.header = TRUE,
    transpose = FALSE, write.sampleid = TRUE, bp = NULL,
    num.coding = FALSE)

Arguments

X

SnpMatrix object

a1

vector of first allele at each SNP

a2

vector of second allele at each SNP

bp

vector of base pair positions for each SNP

fsep,gsep

Field and genotype separators.

nullallele

Character to use for missing alleles

file

Output file name.

write.header

Write a header line

transpose

Output SNPs as rows, samples as columns if TRUE. The default is samples as rows, SNPs as columns, as represented internally by snpStats/SnpMatrix.

write.sampleid

Output sample ids

num.coding

Use alleles 1 and 2 instead of supplying allele vectors.
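
As a rough illustration of how transpose changes the layout, the sketch below writes the same small subset in both orientations and reads each file back for inspection. It assumes that the snpStats and snpStatsWriter packages are installed and uses the Autosomes object from the snpStats test data; the allele labels "A" and "B" and the names f.rows and f.cols are placeholders chosen for this example.

```r
library(snpStats)        # SnpMatrix class and the test data
library(snpStatsWriter)  # write.simple

data(testdata, package = "snpStats")
A.small <- Autosomes[1:6, 1:10]   # 6 samples, 10 SNPs
nsnps <- ncol(A.small)

f.rows <- tempfile()  # default layout: samples as rows
f.cols <- tempfile()  # transposed layout: SNPs as rows

# Placeholder allele labels; real data would supply the true alleles.
write.simple(A.small, a1 = rep("A", nsnps), a2 = rep("B", nsnps),
             file = f.rows, write.header = FALSE, write.sampleid = FALSE)
write.simple(A.small, a1 = rep("A", nsnps), a2 = rep("B", nsnps),
             file = f.cols, transpose = TRUE,
             write.header = FALSE, write.sampleid = FALSE)

# Read both files back so the two orientations can be compared by eye.
rows.layout <- readLines(f.rows)
cols.layout <- readLines(f.cols)
print(rows.layout)
print(cols.layout)

unlink(c(f.rows, f.cols))
```

Comparing the two printed vectors should make it clear which orientation a given downstream program expects.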

Details

It's written in C, so it should be reasonably fast even for large datasets.

write.simple is the most flexible function. It should be able to write most rectangular formats.

Additional functions are available, tailored to software that requires a bit more than a rectangular format: write.beagle, write.impute, write.mach, write.phase.

Value

No return value; the function is called for its side effect of writing the specified output file.

Warning

Any uncertain genotypes (stored by snpStats as raw codes 4 to 253) are output as missing.

The functions use "\n" as the end of line character, unless .Platform$OS.type == "windows", in which case eol is "\r\n". I only have access to Linux machines for testing.
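
The end-of-line rule described here can be sketched in base R as follows. This only illustrates the stated rule and is not taken from the package source; the variable name eol simply mirrors the wording above.

```r
# Pick the end-of-line string from the platform, following the rule above.
eol <- if (.Platform$OS.type == "windows") "\r\n" else "\n"

# print() shows the string in escaped form, so the control characters are visible.
print(eol)
```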

I have tested these functions with my own data, but it is always possible that your data contain quirks mine don't, or that input formats change for any program mentioned here. Please do a quick check on a small subset of data (e.g., as in the example below) that the output for your exact combination of options looks sensible and matches the specified input format.
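
One possible way to carry out such a check is sketched below: write a small subset, read the file back, and look at the number of lines, the number of fields per line, and the first few lines. It assumes that the snpStats and snpStatsWriter packages are installed; the option values are only an example and should be replaced by the combination you actually intend to use.

```r
library(snpStats)        # SnpMatrix class and the test data
library(snpStatsWriter)  # write.simple

data(testdata, package = "snpStats")
A.small <- Autosomes[1:6, 1:10]   # 6 samples, 10 SNPs
nsnps <- ncol(A.small)

f <- tempfile()
write.simple(A.small, a1 = rep("1", nsnps), a2 = rep("2", nsnps),
             file = f, fsep = "\t", gsep = " ", nullallele = "0",
             write.header = FALSE, write.sampleid = FALSE)

out <- readLines(f)

# Number of lines written, and number of tab-separated fields on each line.
n.lines <- length(out)
n.fields <- lengths(strsplit(out, "\t", fixed = TRUE))
print(n.lines)
print(n.fields)

# Compare the first couple of lines by eye with the target program's format.
print(head(out, 2))

unlink(f)
```

If the counts or the printed lines do not match what the target program documents, adjust the options before writing the full dataset.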

Note

This has been tested with SnpMatrix objects from the package snpStats but should also work with snp.matrix objects from the package snpMatrix.

Author(s)

Chris Wallace

References

David Clayton (2012). snpStats: SnpMatrix and XSnpMatrix classes and methods. R package version 1.6.0. http://www-gene.cimr.cam.ac.uk/clayton

phase/fastPhase: http://stephenslab.uchicago.edu/software.html

beagle: http://faculty.washington.edu/browning/beagle/beagle.html

IMPUTE: http://mathgen.stats.ox.ac.uk/impute/impute_v2.html

MACH: http://www.sph.umich.edu/csg/abecasis/MACH

snphap: https://www-gene.cimr.cam.ac.uk/staff/clayton/software/snphap.txt

Examples

library(snpStatsWriter)
data(testdata, package="snpStats")
## take a small subset: 6 samples, 10 SNPs
A.small <- Autosomes[1:6,1:10]
f <- tempfile()
## write in suitable format for snphap
nsnps <- ncol(A.small)
write.simple(A.small, a1=rep("1",nsnps), a2=rep("2",nsnps), gsep=" ",
             nullallele='0', file=f,
             write.sampleid=FALSE)
## remove the temporary file
unlink(f)

snpStatsWriter documentation built on May 2, 2019, 5:41 a.m.