Siccuracy: Stefan's imputation accuracy package

Description

Calculation of imputation accuracies and similar allele statistics.

Details

This package was developed for working with SNP files in the format used in AlphaSuite. The format consists of a first column with an integer sample ID (Individual or Animal ID), followed by one column per locus giving the count of minor alleles at that locus:

1001 0 0 1 2 1 2
1002 0.05 0.50 1.50 1.80 0.95 9

In this example, the file contains two individuals (1001 and 1002). The first individual has genotypes (0, 0, 1, 2, 1, 2) for six loci. The second individual has been imputed, and its genotypes are given as real-valued best guesses. Its last genotype was not sufficiently imputed and is therefore set as missing with the value 9.

Requirements for the format: fields are separated by spaces or tabs; the number of digits is unimportant.
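
As an illustration of the layout described above, the following is a minimal Python sketch of reading such lines. It does not use this package. The function name parse_snp_line and the choice to represent the missing code 9 as NaN are assumptions made for the example only.

```python
import math

MISSING_CODE = 9  # value used for a missing genotype in the example above


def parse_snp_line(line):
    """Split one whitespace-separated line into (sample_id, genotypes)."""
    fields = line.split()  # splits on spaces and tabs alike
    sample_id = int(fields[0])  # first column: integer sample ID
    genotypes = []
    for field in fields[1:]:
        value = float(field)
        # Assumption: store the missing code as NaN so it is never
        # mistaken for an allele count in later arithmetic.
        genotypes.append(math.nan if value == MISSING_CODE else value)
    return sample_id, genotypes


# The two example rows; the second uses a tab after the ID on purpose.
example = "1001 0 0 1 2 1 2\n1002\t0.05 0.50 1.50 1.80 0.95 9\n"
records = [parse_snp_line(row) for row in example.splitlines() if row.strip()]
```

Here records holds one (ID, genotypes) pair per individual, and the last genotype of individual 1002 is NaN.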

Overview

IDs

The AlphaImpute format must have integer IDs. In this software package, we have chosen an arbitrary upper limit of 20 digits for IDs. If an ID exceeds this limit, it will be lumped together with the following genotype.

Repeated IDs are best avoided.
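
Both ID issues can be checked before a file is passed to the package. The following Python sketch is one way to do so, assuming the ID fields have already been read as strings. The function name check_ids is hypothetical, and counting every character of the field towards the 20-digit limit is an assumption.

```python
from collections import Counter

MAX_ID_DIGITS = 20  # the upper limit stated above


def check_ids(id_fields):
    """Return (too_long, repeated) for a list of ID fields given as strings."""
    # Assumption: every character of the field counts towards the limit.
    too_long = [f for f in id_fields if len(f) > MAX_ID_DIGITS]
    counts = Counter(id_fields)
    repeated = sorted(f for f, n in counts.items() if n > 1)
    return too_long, repeated


ids = ["1001", "1002", "1001", "1" * 21]
too_long, repeated = check_ids(ids)
```

In this example, too_long contains the single 21-digit ID and repeated contains "1001".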

Output format

Functions that write new files may have a pair of arguments (int and format) that specify whether the output is written as integers (int=TRUE) or with decimals. For more information on how to specify the format, see parseformat.
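
To illustrate the difference between the two kinds of output, here is a small Python sketch that writes one row either as integers or with a fixed number of decimals. It is not the package's implementation. The names format_row, as_int and decimals are hypothetical, and rounding to the nearest integer is an assumption, since the text above does not say how integers are obtained from decimal values.

```python
def format_row(sample_id, genotypes, as_int=True, decimals=2):
    """Format one individual as a space-separated line of text."""
    if as_int:
        # Assumption: decimal genotypes are rounded to the nearest integer.
        fields = [str(round(g)) for g in genotypes]
    else:
        fields = [f"{g:.{decimals}f}" for g in genotypes]
    return " ".join([str(sample_id)] + fields)


row = [0.05, 1.8, 0.95, 9]
as_integers = format_row(1002, row, as_int=True)
as_decimals = format_row(1002, row, as_int=False, decimals=2)
```

With these inputs, as_integers is "1002 0 2 1 9" and as_decimals is "1002 0.05 1.80 0.95 9.00".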

Other formats

The Oxford format is covered by read.oxford, and a method dispatch for imputation accuracy exists for it as well.

SHAPEIT's haps/sample format is covered by read.haps, and a method dispatch for imputation accuracy exists for it as well.

Currently, VCF objects from the vcfR package are supported by the functions write.snps and imputation_accuracy. For more information, see VCF format.

Author(s)

Stefan McKinnon Edwards <sme@iysik.com>


stefanedwards/Siccuracy documentation built on May 30, 2019, 10:44 a.m.