R/qtl.cross.R
In lmem.qtler: Linear Mixed Effects Models for QTL Mapping for Multienvironment and Multitrait Analysis

#########################################
#' Read genomic data to perform QTL analyses.
#'
#'
#' This function reads genomic data and is similar to the read.cross
#' function from r/qtl package (Broman and Sen, 2009) but allows
#' importing data from a flapjack format (Milne et al., 2010).
#' The files required include a file containing phenotypic
#' information (P.data), a file containing genotypic information (G.data),
#' and a file containing map information (map.data) for all markers.
#'
#' @usage qtl.cross (P.data = NULL, G.data, map.data, cross, heterozygotes = TRUE,
#'                   sep = "\t" )
#'
#' @param P.data Name of the file containing phenotypic information.
#' Each row represents the individuals while each column represents
#' the phenotypic traits. The first column should be labeled as 'genotype'
#' and should contain identification name for each individual.
#' The name of each trait should also be included.
#'
#' @param G.data Name of the file containing genotypic (marker scores)
#' information. Each row represents the individuals
#' while each column represents the markers. Headers for markers should be
#' included, but not for genotypes.
#' The first column contains the names of the genotypes.
#' The first row contains the names of the markers.
#' The marker genotypes are coded by two characters corresponding
#' to the alleles using a separator between alleles (by default a slash /).
#' If a single character is given, the genotype is assumed to be homozygous.
#' Missing values are indicated by default with '-'.
#' In the example below, the two alleles have been called 1 and 2 because it
#' is useful to link alleles to their origin, i.e. parent 1 or parent 2.
#' Therefore, 1 corresponds to homozygous for allele 1 (synonymous to 1/1),
#' 1/2 corresponds to heterozygous, and 2 corresponds to homozygous
#' for allele 2 (synonymous to 2/2).
#' In the case of partially informative markers (e.g. dominant markers)
#' genotypes are coded as 1/- or 2/-, depending on whether the dominant
#' allele originated from parent 1 or parent 2.
#'
#' @param  map.data Name of the file containing marker map information
#' (i.e. linkage group and position within linkage group).
#' The file is a text tab delimited file. Each row represents markers.
#' The file consists of three columns.
#' Column 1 gives the marker names,
#' column 2 the chromosome on which the marker has been mapped,
#' and column 3 indicates the position of the marker within the chromosome.
#'
#' @param cross The type of population studied. The type of population studied.
#' Options are: F2 (f2), doubled haploids (dh), backcross (bc),
#' recombinant inbred lines from selfing (riself, ri4self, or ri8self
#' depending on the number of parents used), recombinant inbred lines from
#' sib-mating (risib, ri4sib, or ri8sib depending on the number of parents
#' used), segregating F1 cross-pollinated populations (cp),
#'
#' @param heterozygotes It indicates whether there are heterozygotes or
#' not in the association mapping population. FALSE is set as default.
#'
#' @param sep To define the espace between the data.
#'
#' @return Creates an object of class cross to be used in QTL analysis.
#' The components are the same as r/qtl (Broman and Sen, 2009):
#' geno  This is a list with elements corresponding to chromosomes.names (geno)
#' contains the names of the chromosomes. Each chromosome is itself a list,
#' and is given class A or X according to whether it is autosomal or
#' the X chromosome.There are two components for each chromosome:
#' data, a matrix whose rows are individuals and whose columns are markers,
#' and map, either a vector of marker positions (in cM) or a matrix of dim
#' (2 x n.mar) where the rows correspond to marker positions in female and male
#' genetic distance, respectively.The genotype data gets converted into numeric
#' codes, as follows.
#' The genotype data for a backcross is coded as NA = missing, 1 = AA, 2 = AB.
#' For an F2 intercross, the coding is NA = missing, 1 = AA, 2 = AB, 3 = BB,
#' 4 = not BB (i.e. AA or AB; D in Mapmaker/qtl), 5 = not AA (i.e. AB or BB;
#' C in Mapmaker/qtl).
#' For a 4-way cross, the mother and father are assumed to have genotypes AB
#' and CD, respectively. The genotype data for the progeny is assumed to be
#' phase-known, with the following coding scheme: NA = missing, 1 = AC, 2 = BC,
#' 3 = AD, 4 = BD, 5 = A = AC or AD, 6 = B = BC or BD, 7 = C = AC or BC,
#' 8 = D = AD or BD, 9 = AC or BD, 10 = AD or BC, 11 = not AC, 12 = not BC,
#' 13 = not AD, 14 = not BD.
#' pheno   a data.frame of size (n.ind x n.phe) containing the phenotypes.
#' If a phenotype with the name genotype  is included, these identifiers will
#' be used in top.errorlod, plotErrorlod, and plotGeno as identifiers for the
#' individual.
#'
#' @references Broman KW, Sen S (2009) A Guide to QTL Mapping with R/qtl.
#'             Springer, NewYork
#'             Comadran J, Thomas W, van Eeuwijk F, Ceccarelli S, Grando S, Stanca A,
#'             Pecchioni N, Akar T, Al-Yassin A, Benbelkacem A, Ouabbou H, Bort J,
#'             Romagosa I, Hackett C, Russell J (2009) Patterns of genetic diversity
#'             and linkage disequilibrium in a highly structured Hordeum vulgare
#'             association-mapping population for the Mediterranean basin.
#'             Theor Appl Genet 119:175-187
#'             Milne et al., (2010) Flapjack - graphical genotype visualization.
#'             Bioinformatics 26(24), 3133-3134.
#'
#' @author Lucia Gutierrez.
#'
#' @details The function creates an intermediate file called 'temp.csv' and
#' then uses the read.cross from r/qtl to read it.
#' The output object is an object of class=cross, the same as the
#' one produced by the function read.cross in r/qtl (Broman and Sen, 2009)
#'
#' @note All functions in this package uses cross data style.
#'
#' @seealso qtl.analysis, qtl.memq
#'
#' @import qtl
#' @import stringr
#' @import graphics
#' @import utils
#' @import grDevices
#' @import stats
#'
#' @export
#'
#' @examples
#' data (SxM_geno)
#' data (SxM_map)
#' data (SxM_pheno)
#'
#' P.data <- SxM_pheno
#' G.data <- SxM_geno
#' map.data <- SxM_map
#'
#' cross.data <- qtl.cross (P.data, G.data, map.data,
#' cross='dh', heterozygotes = FALSE)
#' summary (cross.data)
#'
qtl.cross <- function (P.data = NULL, G.data, map.data, cross = "dh",
  heterozygotes = TRUE, sep = "\t") {
    cross.data <- NULL

    G.data <- as.matrix(G.data)
    G.data <- G.data[3:nrow(G.data),]
    G.data[G.data == " 2"] <- "2"
    G.data[G.data == " 1"] <- "1"

    G.data <- G.data[which(G.data[, 1] %in% as.character(P.data$genotype)), ]
    P.data <- P.data[which(as.character(P.data$genotype) %in% G.data[, 1]), ]
    rownames(G.data) <- G.data[, 1]
    G.data <- G.data[, 2:ncol(G.data)]

    G.data <- G.data[, which(colnames(G.data) %in% map.data[, 1])]
    map.data <- map.data[which(map.data[, 1] %in% colnames(G.data)), ]
    G.data <- G.data[, match(map.data[, 1], colnames(G.data))]

    P.data <- P.data[match(rownames(G.data), as.character(P.data$genotype)), ]

    a <- matrix(rep("", (ncol(P.data) * 2)), 2, ncol(P.data))

    colnames(a) <- colnames(P.data)

    names(a) <- names(P.data)

    b <- data.frame(t(map.data[, 2:3]))

    names(b) <- as.character(map.data[, 1])

    c <- data.frame(a, b, check.names = FALSE, stringsAsFactors = FALSE)

    rownames(c) <- c(rownames(a), rownames(b))

    # extract id information for geno and pheno
    G.id <- row.names(G.data)
    P.id <- as.character(P.data$genotype)

    if (sum(G.id != P.id) > 0) {
        simpleError("IDs don't match")
    }

    geno.data <- G.data

    gens <- c(unlist(geno.data))
    gens <- sort(unique(gens[gens != "-"]))

    als <- unique(c(unlist(strsplit(gens, "/"))))

    Het <- gens[which(gens == paste(als[1], als[2],
      sep = "/") | gens == paste(als[2], als[1],
        sep = "/"))]
    if (length(Het) > 0) {
        hetpos <- grep(Het, geno.data)
        Apos <- setdiff(grep(als[1], geno.data), hetpos)
        Bpos <- setdiff(grep(als[2], geno.data), hetpos)
        geno.data[hetpos] <- "1/2"
    }
    if (length(Het) == 0) {
        Apos <- grep(als[1], geno.data)
        Bpos <- grep(als[2], geno.data)
    }
    geno.data[Apos] <- "1/1"
    geno.data[Bpos] <- "2/2"
    geno.data[geno.data == "1/1"] <- "AA"
    geno.data[geno.data == "2/2"] <- "BB"
    geno.data[geno.data == "1/2"] <- "AB"
    geno.data[geno.data == "1/-"] <- "C"
    geno.data[geno.data == "2/-"] <- "D"
    geno.data[geno.data == "-"] <- NA

    e <- cbind(P.data, geno.data)

    f <- rbind(c, e)

    names(f)[1] <- "id"

    zz <- file("cross.data.file.csv", "w")

    write.csv(f, zz, quote = FALSE, row.names = FALSE)

    close(zz)

    # this is to import the cross

    if(cross == "dh"){
      data.prueba <- read.cross("csv", file = "cross.data.file.csv",
        genotypes = c("AA", "BB"))
    } #check if more

    if(cross == "bc"){
      data.prueba <- read.cross("csv", file = "cross.data.file.csv",
        genotypes = c("AA", "AB"))
    } #check if more

    if(cross == "f2"){
      data.prueba <- read.cross("csv", file = "cross.data.file.csv",
        genotypes = c("AA", "AB", "BB", "C", "D"))
    }

    if(cross == "riself" | cross == "ri4self" | cross == "ri8self"){
      data.prueba <- read.cross("csv", file="cross.data.file.csv",
        genotypes=c("AA", "BB", "C", "D"))
    }

    if(cross == "risib" | cross == "ri4sib" | cross == "ri8sib"){
      data.prueba <- read.cross("csv", file = "cross.data.file.csv",
        genotypes = c("AA", "BB", "C", "D"))
    }

    data.prueba$gwas <- "QTL"

    cross.data <- data.prueba

    class(cross.data)[1] <- paste(cross)

    cross.data
}

Any scripts or data that you put into this service are public.

lmem.qtler documentation built on May 2, 2019, 1:46 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

lmem.qtler
Linear Mixed Effects Models for QTL Mapping for Multienvironment and Multitrait Analysis

R/qtl.cross.R
In lmem.qtler: Linear Mixed Effects Models for QTL Mapping for Multienvironment and Multitrait Analysis

Try the lmem.qtler package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

lmem.qtler Linear Mixed Effects Models for QTL Mapping for Multienvironment and Multitrait Analysis

R/qtl.cross.R In lmem.qtler: Linear Mixed Effects Models for QTL Mapping for Multienvironment and Multitrait Analysis

Try the lmem.qtler package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

lmem.qtler
Linear Mixed Effects Models for QTL Mapping for Multienvironment and Multitrait Analysis

R/qtl.cross.R
In lmem.qtler: Linear Mixed Effects Models for QTL Mapping for Multienvironment and Multitrait Analysis