qtl.cross: Read genomic data to perform QTL analyses.


Description

This function reads genomic data. It is similar to the read.cross function from the R/qtl package (Broman and Sen, 2009), but it also allows importing data in Flapjack format (Milne et al., 2010). Three files are required: a file containing phenotypic information (P.data), a file containing genotypic information (G.data), and a file containing map information for all markers (map.data).

Usage

qtl.cross (P.data = NULL, G.data, map.data, cross, heterozygotes = TRUE,
                  sep = "\t" )

Arguments

P.data

Name of the file containing phenotypic information. Each row represents an individual and each column represents a phenotypic trait. The first column should be labeled 'genotype' and should contain an identifier for each individual. The header row should also include the name of each trait.

G.data

Name of the file containing genotypic (marker score) information. Each row represents an individual and each column represents a marker. The first row contains the names of the markers; the column of genotype names has no header. The first column contains the names of the genotypes. Marker genotypes are coded by two characters corresponding to the alleles, with a separator between alleles (by default a slash, /). If a single character is given, the genotype is assumed to be homozygous. Missing values are indicated by default with '-'. Here the two alleles are called 1 and 2 because it is useful to link alleles to their origin, i.e. parent 1 or parent 2. Therefore, 1 corresponds to homozygous for allele 1 (synonymous with 1/1), 1/2 corresponds to heterozygous, and 2 corresponds to homozygous for allele 2 (synonymous with 2/2). For partially informative markers (e.g. dominant markers), genotypes are coded as 1/- or 2/-, depending on whether the dominant allele originated from parent 1 or parent 2.
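The coding rules above can be illustrated with a short base R sketch. The helper expand_genotype is hypothetical and not part of the package; it only assumes the default "/" allele separator and the default "-" missing-value symbol described above.

```r
# Hypothetical helper (not part of lmem.qtler): expand marker scores to an
# explicit two-allele form, assuming "/" separates alleles and "-" is missing.
expand_genotype <- function(x, sep = "/", missing = "-") {
  x <- as.character(x)
  out <- x
  # A single character means the genotype is homozygous, e.g. "1" -> "1/1".
  single <- !grepl(sep, x, fixed = TRUE) & x != missing
  out[single] <- paste(x[single], x[single], sep = sep)
  # A lone missing symbol becomes NA; partially informative scores such as
  # "1/-" are left untouched.
  out[x == missing] <- NA_character_
  out
}

scores <- c("1", "1/2", "2", "1/-", "2/-", "-")
expanded <- expand_genotype(scores)
print(expanded)
```

Scores that already contain the separator, including the dominant-marker codes 1/- and 2/-, pass through unchanged.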

map.data

Name of the file containing marker map information (i.e. linkage group and position within linkage group). The file is a tab-delimited text file with one row per marker and three columns: column 1 gives the marker name, column 2 the chromosome on which the marker has been mapped, and column 3 the position of the marker within the chromosome.
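As a rough illustration of this three-column layout, the sketch below parses a small tab-delimited map in base R and runs a few sanity checks. The marker names, chromosome labels, and positions are made up, and the absence of a header row is an assumption, since the description does not say whether one is expected.

```r
# Hypothetical map content: marker name, chromosome, position (tab-delimited).
# Assumption: the file has no header row.
map_txt <- "M1\t1H\t0.0\nM2\t1H\t12.5\nM3\t2H\t3.2"

map <- read.table(text = map_txt, sep = "\t", header = FALSE,
                  col.names = c("marker", "chr", "pos"),
                  stringsAsFactors = FALSE)

# Basic checks: three columns, numeric positions, unique marker names.
stopifnot(ncol(map) == 3,
          is.numeric(map$pos),
          !anyDuplicated(map$marker))

# Number of markers on each chromosome.
markers_per_chr <- table(map$chr)
print(markers_per_chr)
```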

cross

The type of population studied. Options are: F2 (f2), doubled haploids (dh), backcross (bc), recombinant inbred lines from selfing (riself, ri4self, or ri8self depending on the number of parents used), recombinant inbred lines from sib-mating (risib, ri4sib, or ri8sib depending on the number of parents used), and segregating F1 cross-pollinated populations (cp).

heterozygotes

Logical indicating whether there are heterozygotes in the mapping population. The signature shown under Usage sets TRUE as the default.

sep

The field separator used in the data files. The default is a tab ("\t").

Details

The function creates an intermediate file called 'temp.csv' and then uses read.cross from R/qtl to read it. The output is an object of class cross, the same as the one produced by the read.cross function in R/qtl (Broman and Sen, 2009).
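To make the intermediate-file step more concrete, here is a minimal base R sketch of how three small tables could be combined into the comma-delimited layout that R/qtl's "csv" format expects (a header row, a chromosome row, a position row, then one row per individual). This is not the package's actual implementation. All data are hypothetical, and the genotype letters A and B are placeholders for whatever recoding the function performs.

```r
# Hypothetical toy inputs (not the package's data sets).
ids <- c("ind1", "ind2", "ind3")
pheno <- data.frame(genotype = ids, yield = c(5.1, 4.8, 6.0),
                    stringsAsFactors = FALSE)
geno <- data.frame(M1 = c("A", "B", "A"), M2 = c("A", "B", "B"),
                   row.names = ids, stringsAsFactors = FALSE)
map <- data.frame(marker = c("M1", "M2"), chr = c(1, 1), pos = c(0, 12.5),
                  stringsAsFactors = FALSE)

# Phenotype columns are left blank in the chromosome and position rows.
n_phe <- ncol(pheno)
header <- c(names(pheno), map$marker)
chr_row <- c(rep("", n_phe), map$chr)
pos_row <- c(rep("", n_phe), map$pos)

# Convert everything to character and align genotypes with the phenotype rows.
pheno_chr <- vapply(pheno, as.character, character(nrow(pheno)))
geno_chr <- as.matrix(geno[pheno$genotype, map$marker])
out <- rbind(header, chr_row, pos_row, cbind(pheno_chr, geno_chr))

# Write to a temporary file instead of 'temp.csv' in the working directory.
tmp <- tempfile(fileext = ".csv")
write.table(out, tmp, sep = ",", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
lines_written <- readLines(tmp)
print(lines_written)
```

A file laid out this way could then be passed to qtl::read.cross with format = "csv", together with whichever genotype codes and cross type apply to the data.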

Value

Creates an object of class cross to be used in QTL analysis. The components are the same as in R/qtl (Broman and Sen, 2009):

geno

This is a list with elements corresponding to chromosomes. names(geno) contains the names of the chromosomes. Each chromosome is itself a list, and is given class A or X according to whether it is autosomal or the X chromosome. There are two components for each chromosome: data, a matrix whose rows are individuals and whose columns are markers, and map, either a vector of marker positions (in cM) or a matrix of dim (2 x n.mar) where the rows correspond to marker positions in female and male genetic distance, respectively.

The genotype data gets converted into numeric codes, as follows. The genotype data for a backcross is coded as NA = missing, 1 = AA, 2 = AB. For an F2 intercross, the coding is NA = missing, 1 = AA, 2 = AB, 3 = BB, 4 = not BB (i.e. AA or AB; D in Mapmaker/qtl), 5 = not AA (i.e. AB or BB; C in Mapmaker/qtl). For a 4-way cross, the mother and father are assumed to have genotypes AB and CD, respectively. The genotype data for the progeny is assumed to be phase-known, with the following coding scheme: NA = missing, 1 = AC, 2 = BC, 3 = AD, 4 = BD, 5 = A = AC or AD, 6 = B = BC or BD, 7 = C = AC or BC, 8 = D = AD or BD, 9 = AC or BD, 10 = AD or BC, 11 = not AC, 12 = not BC, 13 = not AD, 14 = not BD.

pheno

A data.frame of size (n.ind x n.phe) containing the phenotypes. If a phenotype with the name genotype is included, these identifiers will be used in top.errorlod, plotErrorlod, and plotGeno as identifiers for the individual.
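The F2 coding scheme described above can be written as a simple lookup table. The sketch below is only an illustration in base R. The label strings and the example calls are hypothetical, and it does not reproduce how the function itself converts Flapjack-style scores.

```r
# F2 intercross codes as listed above; missing calls stay NA.
f2_codes <- c("AA" = 1L, "AB" = 2L, "BB" = 3L, "not BB" = 4L, "not AA" = 5L)

# Hypothetical genotype calls for one marker across five individuals.
calls <- c("AA", "AB", NA, "BB", "not AA")

# Look up each call by name; NA calls map to NA.
coded <- unname(f2_codes[calls])
print(coded)
```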

Note

All functions in this package use the cross data format.

Author(s)

Lucia Gutierrez.

References

Broman KW, Sen S (2009) A Guide to QTL Mapping with R/qtl. Springer, New York

Comadran J, Thomas W, van Eeuwijk F, Ceccarelli S, Grando S, Stanca A, Pecchioni N, Akar T, Al-Yassin A, Benbelkacem A, Ouabbou H, Bort J, Romagosa I, Hackett C, Russell J (2009) Patterns of genetic diversity and linkage disequilibrium in a highly structured Hordeum vulgare association-mapping population for the Mediterranean basin. Theor Appl Genet 119:175-187

Milne et al. (2010) Flapjack - graphical genotype visualization. Bioinformatics 26(24):3133-3134

See Also

qtl.analysis, qtl.memq

Examples

data (SxM_geno)
data (SxM_map)
data (SxM_pheno)

P.data <- SxM_pheno
G.data <- SxM_geno
map.data <- SxM_map

cross.data <- qtl.cross (P.data, G.data, map.data,
                         cross = 'dh', heterozygotes = FALSE)
summary (cross.data)

kbroman/lmem.qtler documentation built on May 30, 2019, 3:10 p.m.