genotypes: Constructor for a 'genotypes' object


Description

Constructor for a genotypes object

Usage

genotypes(G, map, ped = NULL, alleles = c("auto", "native", "01",
  "relative"), intensity = NULL, normalized = FALSE, filter.sites = NULL,
  filter.samples = NULL, check = TRUE, ...)

Arguments

G

a genotype matrix with markers in rows and samples in columns, with both row and column names

map

a valid marker map (see Details) corresponding to G, with row names

ped

a valid "pedigree" (dataframe containing sample metadata)

alleles

character vector describing allele encoding (see argyle for details); "auto" lets the package try to guess the encoding

intensity

a list with elements x and y containing hybridization intensities; each is a matrix with same dimensions and same row/column names as G

normalized

logical; have intensities been normalized?

filter.sites

character vector of filters attached to markers

filter.samples

character vector of filters attached to samples

check

logical; if TRUE, do sanity checks on input

...

ignored

Details

The input matrix G *must* have row and column names to help the package keep the marker map, sample metadata, and genotypes themselves in sync.
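
As a minimal sketch of that requirement, the base-R snippet below builds a small named matrix with markers in rows and samples in columns. The marker names, sample names and 0/1/2 values are invented for illustration and are not taken from the package. The call to genotypes() is left as a comment because it also needs a valid marker map.

```r
# Hypothetical marker and sample names, for illustration only
markers <- c("m1", "m2", "m3")
samples <- c("s1", "s2")

# Markers in rows, samples in columns; the 0/1/2 values are placeholder calls
G <- matrix(c(0L, 1L, 2L, 2L, NA, 0L),
            nrow = length(markers), ncol = length(samples),
            dimnames = list(markers, samples))

# Both sets of names must be present before the matrix is handed to genotypes()
has_names <- !is.null(rownames(G)) && !is.null(colnames(G))
has_names

# With a marker map `map` whose row names match rownames(G), the call would be
# along the lines of:
#   geno <- genotypes(G, map)
```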

Value

a new genotypes object

The genotypes class

This class is designed to be a lightweight container for genotype data on a set of samples typed for a panel of biallelic SNP markers on a microarray. The object inherits from base-R's class matrix, so any code which accepts a matrix (including the apply family) will work on a genotypes object.

Attributes of genotypes objects include:

All attributes are maintained "parallel" to the genotypes matrix itself, and additionally have names to avoid ambiguity.

Note that missing values (NAs/NaNs) are used for no-calls, in order to take advantage of R's behaviors on missing data.
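
The two points above (matrix inheritance and NA for no-calls) can be illustrated with a plain base-R matrix standing in for a genotypes object. This is a sketch on made-up data, not output from the package; the same calls are expected to work on a real genotypes object because it inherits from matrix.

```r
# Plain named matrix standing in for a genotypes object (hypothetical data)
G <- matrix(c(0, 1, NA,
              2, 2, 0,
              NA, NA, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("m1", "m2", "m3"), c("s1", "s2", "s3")))

# Proportion of no-calls per marker (rows) and per sample (columns)
nocall_by_marker <- rowMeans(is.na(G))
nocall_by_sample <- colMeans(is.na(G))

# apply() works as on any matrix; na.rm = TRUE skips the no-calls
mean_call <- apply(G, 1, mean, na.rm = TRUE)
```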

The marker map

A valid marker map is a required attribute of a genotypes object. It is a dataframe with (at least) the following columns, in the following order. Columns followed by an asterisk (*) are optional but may be required for some downstream operations.

Rownames must be present and must match the contents of column "marker".
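
The row-name rule can be checked before calling the constructor. The helper below, map_rownames_ok(), is hypothetical and not part of argyle. The example map contains only the "marker" column, so it is not a complete marker map.

```r
# Hypothetical helper (not an argyle function): TRUE when the row names are
# equal to the contents of the "marker" column
map_rownames_ok <- function(map) {
  is.data.frame(map) &&
    "marker" %in% colnames(map) &&
    identical(rownames(map), as.character(map$marker))
}

# Only the "marker" column is shown; a real map needs the remaining columns too
map <- data.frame(marker = c("m1", "m2", "m3"), stringsAsFactors = FALSE)
rownames(map) <- map$marker
ok <- map_rownames_ok(map)

# Dropping the row names resets them to "1", "2", "3", which no longer match
bad <- map
rownames(bad) <- NULL
not_ok <- map_rownames_ok(bad)
```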

The "pedigree"

Although "pedigree" is used in homage to the nomenclature of the PLINK package, this attribute simply contains sample metadata even if true pedigrees are unknown. It is a dataframe with (at least) the following columns, the first 6 of which are for PLINK compatibility, in the following order.

Rownames must be present and must match the contents of column "iid". The pedigree is auto-generated when missing, and in that case every sample is assigned an "fid" identical to its "iid".
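
The auto-generated default can be pictured with the sketch below. The helper default_ped_ids() is hypothetical and only covers the two identifier columns named above; the package's own default will contain further columns.

```r
# Hypothetical helper (not an argyle function) mirroring the default described
# above for the "fid" and "iid" columns only
default_ped_ids <- function(G) {
  ids <- colnames(G)
  stopifnot(!is.null(ids))
  # Each sample forms its own family: fid is a copy of iid
  data.frame(fid = ids, iid = ids, row.names = ids, stringsAsFactors = FALSE)
}

# Empty placeholder matrix with hypothetical marker and sample names
G <- matrix(NA_integer_, nrow = 2, ncol = 3,
            dimnames = list(c("m1", "m2"), c("s1", "s2", "s3")))
ped <- default_ped_ids(G)
```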

Filters

The filter.* fields are character vectors describing the filter(s), if any, with which to mark markers or samples. An empty string ("") indicates a "passing" marker or sample. Filters are appended to the filter string as single characters: H for excess heterozygosity; N for excess no-call rate; I (for samples only) for abnormal intensity pattern; F (for markers only) for aberrant allele frequency.
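
The filter-string convention can be sketched in base R as follows. The helper add_flag() is hypothetical and not part of argyle. Two details are assumptions made for illustration only: the filter vector is named by marker, and a flag is not appended a second time if it is already present.

```r
# Hypothetical helper (not an argyle function): append a one-letter flag to the
# filter strings of the selected items, skipping items that already carry it
add_flag <- function(filters, flagged, flag) {
  needs <- flagged & !grepl(flag, filters, fixed = TRUE)
  filters[needs] <- paste0(filters[needs], flag)
  filters
}

# Start with every marker passing (empty string)
filter.sites <- setNames(rep("", 3), c("m1", "m2", "m3"))

# Flag m1 and m3 for excess heterozygosity, then m1 for excess no-call rate
filter.sites <- add_flag(filter.sites, c(TRUE, FALSE, TRUE), "H")
filter.sites <- add_flag(filter.sites, c(TRUE, FALSE, FALSE), "N")

# Markers that still pass are those with an empty filter string
passing <- !nzchar(filter.sites)
```

Testing for an empty string with nzchar() keeps the "passing" check independent of which flags exist.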


andrewparkermorgan/argyle documentation built on May 10, 2019, 11:08 a.m.