R/mpp_SIM.R

###########
# mpp_SIM #
###########

#' MPP Simple Interval Mapping
#' 
#' Fits single QTL models along the genome under different assumptions about
#' the QTL effect.
#' 
#' The implemented models vary according to the number of alleles assumed at the
#' QTL position and their origin. Four assumptions for the QTL effect are
#' possible.
#' 
#' Concerning the type of QTL effect, the first option is a cross-specific QTL
#' effects model (\code{Q.eff = "cr"}). In this model, the QTL effects are
#' assumed to be nested within crosses, which leads to the estimation of one
#' parameter per cross. The cross-specific model corresponds to the
#' disconnected model described in Blanc et al. (2006).
#' 
#' A second possibility is the parental model (\code{Q.eff = "par"}). The
#' parental model assumes one QTL effect (allele) per parent that is independent
#' of the genetic background. This means that a QTL allele coming from parent i
#' has the same effect in all crosses where this parent is used. This model is
#' expected to produce better estimates of the QTL effects, due to a larger
#' sample size, when parents are shared between crosses.
#' 
#' In a connected MPP (\code{\link{design_connectivity}}), if np - 1 < nc, where
#' np is the number of parents and nc the number of crosses, the parental model
#' should be more powerful than the cross-specific model because it estimates
#' a reduced number of QTL parameters. This gain in power only materializes if
#' the assumption of constant parental effects across crosses holds. Calculated
#' with the homogeneous residual term (HRT) assumption, the parental model
#' corresponds to the connected model presented in Blanc et al. (2006).
#' 
#' The third type of model is the ancestral model (\code{Q.eff = "anc"}). This
#' model tries to use the genetic relatedness that could exist between parents.
#' Indeed, the parental model assumes that the parents are independent, which is
#' generally not the case. Using the genetic relatedness between the parents, it
#' is possible to group them into a reduced number of ancestral clusters.
#' Parents belonging to the same ancestral group are assumed to transmit the
#' same allele (Jansen et al. 2003; Leroux et al. 2014). The ancestral model
#' therefore estimates one QTL effect per ancestral class. Once again, the
#' theoretical expectation is a gain in QTL detection power through the
#' reduction of the number of parameters to estimate. The HRT ancestral model
#' corresponds to the linkage disequilibrium linkage analysis (LDLA) models
#' used by Bardol et al. (2013) or Giraud et al. (2014).
#' 
#' The final possibility is the bi-allelic model (\code{Q.eff = "biall"}).
#' The bi-allelic genetic predictor is a single vector with values 0, 1 or 2
#' corresponding to the number of copies of the least frequent SNP allele.
#' Relatedness between lines is therefore defined via identity by state (IBS).
#' This model corresponds to models used for association mapping.
#' For example, it is similar to model B in Wurschum et al. (2012) or to the
#' association mapping model in Liu et al. (2012).
#' 
#' @param mppData An object of class \code{mppData}.
#' 
#' @param trait \code{Numerical} or \code{character} indicator to specify which
#' trait of the \code{mppData} object should be used. Default = 1.
#'
#' @param Q.eff \code{Character} expression indicating the assumption concerning
#' the QTL effects: 1) "cr" for cross-specific; 2) "par" for parental; 3) "anc"
#' for ancestral; 4) "biall" for bi-allelic. For more details see
#' \code{\link{mpp_SIM}}. Default = "cr".
#' 
#' @param plot.gen.eff \code{Logical} value. If \code{plot.gen.eff = TRUE},
#' the function will save the decomposed genetic effects per cross/parent.
#' These results can be plotted with the function \code{\link{plot.QTLprof}}
#' to visualize a genome-wide decomposition of the genetic effects.
#' \strong{This functionality is only available for the cross-specific,
#' parental and ancestral models.}
#' Default value = FALSE.
#' 
#' @param n.cores \code{Numeric}. Number of cores to use for the genome scan.
#' If \code{n.cores > 1}, the scan is run in parallel. Default = 1.
#' 
#'   
#' @return Return:
#' 
#' \item{SIM }{\code{Data.frame} of class \code{QTLprof} with five columns:
#' 1) QTL marker names; 2) chromosomes;
#' 3) integer position indicators on the chromosome;
#' 4) positions in centi-Morgan; and 5) -log10(p-val). If
#' \code{plot.gen.eff = TRUE}, additional columns contain the p-values of the
#' cross or parental QTL effects.}
#' 
#' @author Vincent Garin
#' 
#' @seealso \code{\link{plot.QTLprof}}
#' 
#' @references
#' 
#' Bardol, N., Ventelon, M., Mangin, B., Jasson, S., Loywick, V., Couton, F., ...
#' & Moreau, L. (2013). Combined linkage and linkage disequilibrium QTL mapping 
#' in multiple families of maize (Zea mays L.) line crosses highlights
#' complementarities between models based on parental haplotype and single locus
#' polymorphism. Theoretical and applied genetics, 126(11), 2717-2736.
#' 
#' Blanc, G., Charcosset, A., Mangin, B., Gallais, A., & Moreau, L. (2006).
#' Connected populations for detecting quantitative trait loci and testing for
#' epistasis: an application in maize. Theoretical and Applied Genetics,
#' 113(2), 206-224. 
#' 
#' Giraud, H., Lehermeier, C., Bauer, E., Falque, M., Segura, V., Bauland,
#' C., ... & Moreau, L. (2014). Linkage Disequilibrium with Linkage Analysis
#' of Multiline Crosses Reveals Different Multiallelic QTL for Hybrid
#' Performance in the Flint and Dent Heterotic Groups of Maize. Genetics,
#' 198(4), 1717-1734.
#' 
#' Jansen, R. C., Jannink, J. L., & Beavis, W. D. (2003). Mapping quantitative
#' trait loci in plant breeding populations. Crop Science, 43(3), 829-834.
#' 
#' Leroux, D., Rahmani, A., Jasson, S., Ventelon, M., Louis, F., Moreau, L.,
#' & Mangin, B. (2014). Clusthaplo: a plug-in for MCQTL to enhance QTL detection
#' using ancestral alleles in multi-cross design. Theoretical and Applied
#' Genetics, 127(4), 921-933.
#' 
#' Liu, W., Reif, J. C., Ranc, N., Della Porta, G., & Wurschum, T. (2012).
#' Comparison of biometrical approaches for QTL detection in multiple
#' segregating families. Theoretical and Applied Genetics, 125(5), 987-998.
#' 
#' Meuwissen, T., & Luo, Z. (1992). Computing inbreeding coefficients in large
#' populations. Genetics Selection Evolution, 24(4), 305-313.
#' 
#' Wurschum, T., Liu, W., Gowda, M., Maurer, H. P., Fischer, S., Schechert, A.,
#' & Reif, J. C. (2012). Comparison of biometrical models for joint linkage
#' association mapping. Heredity, 108(3), 332-340. 
#' 
#' @examples
#' 
#' 
#' # Cross-specific model
#' ######################
#' 
#' data(mppData)
#' 
#' SIM <- mpp_SIM(mppData = mppData, Q.eff = "cr", plot.gen.eff = TRUE)
#' 
#' plot(x = SIM)  
#' plot(x = SIM, gen.eff = TRUE, mppData = mppData, Q.eff = "cr")
#' 
#' 
#' # Bi-allelic model
#' ##################
#' 
#' SIM <- mpp_SIM(mppData = mppData, Q.eff = "biall")
#' 
#' plot(x = SIM, type = "h")
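#' 
#' 
#' # Parental model
#' ################
#' 
#' # Sketch only: assumes the example mppData object can be analysed with
#' # Q.eff = "par" in the same way as the cross-specific model above.
#' 
#' SIM <- mpp_SIM(mppData = mppData, Q.eff = "par", plot.gen.eff = TRUE)
#' 
#' plot(x = SIM)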
#' 
#' @export
#' 


mpp_SIM <- function(mppData, trait = 1, Q.eff = "cr",
                    plot.gen.eff = FALSE, n.cores = 1) {
  
  # 1. Check data format and arguments
  ####################################
  
  check.model.comp(mppData = mppData, trait = trait, Q.eff = Q.eff,
                   VCOV = 'h.err', plot.gen.eff = plot.gen.eff,
                   n.cores = n.cores, fct = "SIM")
  
  # 2. Form required elements for the analysis
  ############################################
  
  ### 2.1 trait values
  
  t_val <- sel_trait(mppData = mppData, trait = trait)
  
  ### 2.2 cross matrix (cross intercept)
  
  cross.mat <- IncMat_cross(cross.ind = mppData$cross.ind)
  
  ### 2.3 Optional cluster
  
  if(n.cores > 1){
    
    parallel <- TRUE
    cluster <- makeCluster(n.cores)
    
  } else {
    
    parallel <- FALSE
    cluster <- NULL
    
  }
  
  # indices of the map positions (rows of the map) to scan
  vect.pos <- seq_len(nrow(mppData$map))
  
  # 3. computation of the SIM profile (genome scan)
  #################################################
  
  if (parallel) {
    
    log.pval <- parLapply(cl = cluster, X = vect.pos, fun = QTLModelSIM,
                          mppData = mppData, trait = t_val,
                          cross.mat = cross.mat, Q.eff = Q.eff, VCOV = 'h.err',
                          plot.gen.eff = plot.gen.eff)
    
  } else {
    
    log.pval <- lapply(X = vect.pos, FUN = QTLModelSIM,
                       mppData = mppData, trait = t_val, cross.mat = cross.mat,
                       Q.eff = Q.eff, VCOV = 'h.err',
                       plot.gen.eff = plot.gen.eff)
    
  }
  
  if(n.cores > 1){stopCluster(cluster)}
  
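  # one row per position; the first column holds the -log10(p-value)
  log.pval <- t(data.frame(log.pval))
  # when genetic effects are stored, NA entries are replaced by 1
  if(plot.gen.eff){log.pval[is.na(log.pval)] <- 1}
  # handle -/+ Inf values, then set remaining NA -log10(p-values) to 0
  log.pval[, 1] <- check.inf(x = log.pval[, 1])
  log.pval[is.na(log.pval[, 1]), 1] <- 0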
  
  
  # 4. form the results
  #####################
  
  SIM <- data.frame(mppData$map, log.pval)
  
  if(plot.gen.eff){
    
    if(Q.eff == "cr"){
      
      Qeff_names <- unique(mppData$cross.ind)
      
    } else {
      
      Qeff_names <- mppData$parents
      
    }
    
    colnames(SIM)[5:dim(SIM)[2]] <- c("log10pval", Qeff_names)
    
  } else {colnames(SIM)[5] <- "log10pval"}
  
  
  class(SIM) <- c("QTLprof", "data.frame")
  
  ### 4.1: Verify the positions for which model could not be computed
  
  if(sum(SIM$log10pval == 0) > 0) {
    
    if (sum(SIM$log10pval) == 0){
      
      warning("the computation of the QTL model failed for all positions")
      
    } else {
      
      list.pos <- mppData$map[(SIM$log10pval == 0), 1]
      
      prob_pos <- paste(list.pos, collapse = ", ")
      
      message("the computation of the QTL model failed for the following ",
              "positions: ", prob_pos,
              ". This could be due to singularities or function issues")
      
    }
    
  }
  
  return(SIM)
  
}
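
# Illustrative sketch (not part of mppR): how a single-position test under the
# cross-specific assumption can produce the -log10(p-value) stored in the SIM
# profile. Assumptions, none taken from the package internals: ordinary least
# squares with a homogeneous residual term, a cross intercept, and a numeric
# QTL predictor nested within cross (one slope per cross). The helper name
# sketch_sim_position() and its arguments are hypothetical; QTLModelSIM() may
# differ in its details.

sketch_sim_position <- function(trait, cross, geno) {
  
  cross <- factor(cross)
  
  # null model: cross intercepts only
  m0 <- stats::lm(trait ~ cross)
  
  # cross-specific QTL model: one QTL effect per cross
  m1 <- stats::lm(trait ~ cross + cross:geno)
  
  # F-test of all QTL terms against the null model
  p.val <- stats::anova(m0, m1)[2, "Pr(>F)"]
  
  -log10(p.val)
  
}

# Possible usage on simulated data (two crosses, QTL predictor coded 0 or 2):
#
# set.seed(1)
# cross <- rep(c("c1", "c2"), each = 50)
# geno <- 2 * rbinom(100, size = 1, prob = 0.5)
# trait <- 1 + 1.5 * geno + rnorm(100)
# sketch_sim_position(trait = trait, cross = cross, geno = geno)
#
# Under the parental assumption, the nested term would be replaced by one
# predictor per parent shared across crosses, which reduces the number of QTL
# parameters when np - 1 < nc.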
vincentgarin/mppR documentation built on March 13, 2024, 7:30 p.m.