SNPbin: Formal class "SNPbin"


Description

The class SNPbin is a formal (S4) class for storing a genotype of binary SNPs in a compact way, using a bit-level coding scheme. This storage is most efficient with haploid data, where the memory needed to represent the data can be reduced more than 50 times. However, SNPbin can be used for any level of ploidy, and still remains an efficient storage mode.
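
The bit-level idea can be illustrated in base R, independently of adegenet. The sketch below is only an illustration and is not the package's actual implementation: pack_bits and unpack_bits are hypothetical helper names introduced here, and they assume haploid 0/1 data with no missing values.

```r
# Hypothetical helpers for illustration only; not part of adegenet.
# Pack a 0/1 vector into raw bytes, 8 SNPs per byte.
pack_bits <- function(snp) {
  stopifnot(!anyNA(snp), all(snp %in% c(0L, 1L)))
  # packBits() needs a length that is a multiple of 8, so pad with zeros
  pad <- (8L - length(snp) %% 8L) %% 8L
  packBits(c(as.integer(snp), integer(pad)), type = "raw")
}

# Recover the first n SNPs from the packed bytes.
unpack_bits <- function(packed, n) {
  as.integer(rawToBits(packed))[seq_len(n)]
}

snp <- c(1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L)
packed <- pack_bits(snp)
length(packed)                                    # number of bytes used
identical(unpack_bits(packed, length(snp)), snp)  # round trip
```

R stores a double in 8 bytes and an integer in 4, so one bit per SNP is at most 64 or 32 times smaller, before counting object overhead and the stored positions of missing data.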

A SNPbin object can be constructed from a vector of integers giving the number of copies of the second allele at each locus.

SNPbin stores a single genotype. To store multiple genotypes, use the genlight class.

Objects from the class SNPbin

SNPbin objects can be created by calls to new("SNPbin", ...), where '...' can be the following arguments:

snp

an integer or numeric vector giving the number of copies of the second allele at each locus. If a single unnamed argument is passed to 'new', it is taken to be 'snp'.

ploidy

an integer indicating the ploidy of the genotype; if not provided, will be guessed from the data (as the maximum from the 'snp' input vector).

label

an optional character string serving as a label for the genotype.
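
As a brief sketch of the arguments above (assuming adegenet is installed; the label "ind1" and the data values are made up for illustration):

```r
library(adegenet)

# All three arguments given by name
x <- new("SNPbin", snp = c(0, 1, 1, NA, 0), ploidy = 1L, label = "ind1")
ploidy(x)
x$label

# A single unnamed argument is taken as 'snp';
# ploidy is then guessed as the maximum of the input vector
y <- new("SNPbin", c(0, 2, 1, 2, NA))
ploidy(y)
```

Because the guess is the maximum observed count, a genotype that never carries the second allele at full dosage would be assigned too low a ploidy, so passing 'ploidy' explicitly is the safer choice when it is known.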

Slots

The following slots are the content of instances of the class SNPbin; note that in most cases, it is better to retrieve information via accessors (see below), rather than by accessing the slots manually.

snp:

a list of vectors with the class raw.

n.loc:

an integer indicating the number of SNPs of the genotype.

NA.posi:

a vector of integers giving the positions of missing data.

label:

an optional character string serving as a label for the genotype.

ploidy:

an integer indicating the ploidy of the genotype.
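
A short sketch of how the slots can be inspected (assuming adegenet is installed; the input values are arbitrary):

```r
library(adegenet)

x <- new("SNPbin", c(0, 1, NA, 1, 0, NA))
names(x)            # names of the slots
nLoc(x)             # accessor; same information as x$n.loc
x$NA.posi           # positions of the missing values
class(x$snp[[1]])   # the genotype itself is held as raw vectors
```

The accessor nLoc(x) is preferred over x$n.loc in ordinary code, in line with the note above about avoiding direct slot access.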

Methods

Here is a list of methods available for SNPbin objects. Most of these methods are accessors, that is, functions which are used to retrieve the content of the object. Specific manpages can exist for accessors with more than one argument. These are indicated by a '*' symbol next to the method's name. This list also contains methods for conversion from SNPbin to other classes.

[

signature(x = "SNPbin"): usual method to subset objects in R. The index argument indicates which SNPs to keep. It can be a vector of signed integers or of logicals.

show

signature(x = "SNPbin"): printing of the object.

$

signature(x = "SNPbin"): similar to the @ operator; used to access the content of slots of the object.

$<-

signature(x = "SNPbin"): similar to the @ operator; used to replace the content of slots of the object.

nLoc

signature(x = "SNPbin"): returns the number of SNPs in the object.

names

signature(x = "SNPbin"): returns the names of the slots of the object.

ploidy

signature(x = "SNPbin"): returns the ploidy of the genotype.

as.integer

signature(x = "SNPbin"): converts a SNPbin object to a vector of integers. The S4 method 'as' can be used as well (e.g. as(x, "integer")).

cbind

signature(x = "SNPbin"): merges genotypes of the same individual at different SNPs (all stored as SNPbin objects) into a single SNPbin.

c

signature(x = "SNPbin"): same as cbind.SNPbin.
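
The subsetting and merging methods can be sketched as follows (assuming adegenet is installed; the genotypes are arbitrary haploid examples):

```r
library(adegenet)

x <- new("SNPbin", c(1, 0, 0, 1, 1, 0))

# Subsetting with positive integers, negative integers, and logicals
as.integer(x[c(1, 4)])
as.integer(x[-1])
as.integer(x[c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)])

# Merging two sets of SNPs typed on the same individual
y <- new("SNPbin", c(0, 0, 1))
xy <- c(x, y)
nLoc(xy)
as.integer(xy)
```

Each subset is itself a SNPbin object, so as.integer is used here only to display the result.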

Author(s)

Thibaut Jombart (t.jombart@imperial.ac.uk)

See Also

Related classes:
- genlight, for storing multiple binary SNP genotypes.
- genind, for storing other types of genetic markers.

Examples

## Not run: 
#### HAPLOID EXAMPLE ####
## create a genotype of 100,000 SNPs
dat <- sample(c(0,1,NA), 1e5, prob=c(.495, .495, .01), replace=TRUE)
dat[1:10]
x <- new("SNPbin", dat)
x
x[1:10] # subsetting
as.integer(x[1:10])

## try a few accessors
ploidy(x)
nLoc(x)
head(x$snp[[1]]) # internal bit-level coding

## check that conversion is OK
identical(as(x, "integer"),as.integer(dat)) # SHOULD BE TRUE

## compare the size of the objects
print(object.size(dat), units="auto")
print(object.size(x), units="auto")
object.size(dat)/object.size(x) # EFFICIENCY OF CONVERSION (a ratio, although it prints with a "bytes" unit)


#### TETRAPLOID EXAMPLE ####
## create a genotype of 100,000 SNPs
dat <- sample(c(0:4,NA), 1e5, prob=c(rep(.995/5,5), 0.005), replace=TRUE)
x <- new("SNPbin", dat)
identical(as(x, "integer"),as.integer(dat)) # MUST BE TRUE

## compare the size of the objects
print(object.size(dat), units="auto")
print(object.size(x), units="auto")
object.size(dat)/object.size(x) # EFFICIENCY OF CONVERSION (a ratio, although it prints with a "bytes" unit)


#### c, cbind ####
a <- new("SNPbin", c(1,1,1,1,1))
b <- new("SNPbin", c(0,0,0,0,0))
a
b
ab <- c(a,b)
ab
identical(c(a,b),cbind(a,b))
as.integer(ab)

## End(Not run)

Example output

Loading required package: ade4

   /// adegenet 2.0.1 is loaded ////////////

   > overview: '?adegenet'
   > tutorials/doc/questions: 'adegenetWeb()' 
   > bug reports/feature requests: adegenetIssues()


 [1]  1  0  0  0  1  0  0  0  0 NA
/// SNPBIN OBJECT /////////
 100,000 SNPs coded as bits, size: 17.5 Kb
 Ploidy: 1
 1013 (1.01 %) missing data
/// SNPBIN OBJECT /////////
 10 SNPs coded as bits, size: 1.3 Kb
 Ploidy: 1
 1 (10 %) missing data
 [1]  1  0  0  0  1  0  0  0  0 NA
[1] 1
[1] 100000
[1] 11 00 58 2e e8 b2
[1] TRUE
781.3 Kb
17.5 Kb
44.7 bytes
[1] TRUE
390.7 Kb
52.1 Kb
7.5 bytes
/// SNPBIN OBJECT /////////
 5 SNPs coded as bits, size: 1.3 Kb
 Ploidy: 1
 0 (0 %) missing data
/// SNPBIN OBJECT /////////
 5 SNPs coded as bits, size: 1.3 Kb
 Ploidy: 1
 0 (0 %) missing data
/// SNPBIN OBJECT /////////
 10 SNPs coded as bits, size: 1.3 Kb
 Ploidy: 1
 0 (0 %) missing data
[1] TRUE
 [1] 1 1 1 1 1 0 0 0 0 0

adegenet documentation built on July 18, 2021, 1:06 a.m.