R/ecogen.2OF6.constructor.R
In EcoGenetics: Management and Exploratory Analysis of Spatial Data in Landscape Genetics

################################################
#### ECOGEN CONSTRUCTOR
################################################

#' Creating a new ecogen object
#' @param XY Data frame with m columns (coordinates) and n rows (individuals).
#' @param P Data frame with n rows (individuals), and phenotypic data in columns.
#' @param G Data of class: "data.frame", with individuals in rows and genotypic data 
#' in columns (loci). The ploidy and the type (codominant, dominant) of the data, 
#' must be passed with the arguments "ploidy" and "type". Missing data is coded as NA.
#' Dominant data must be coded with binary values (0 for absence - 1 for presence).
#' @param E Data frame with n rows (individuals), and environmental data 
#' in columns.
#' @param S Data frame with n rows (individuals), and groups (factors) in columns.
#' The program converts non-factor data into factor.
#' @param C Data frame with n rows (individuals), and custom variables in columns.
#' @param G.processed If TRUE, the slot G will include a processed data frame:
#' removed non informative loci  (the data non available for all the individuals),
#' or non polymorphic loci (for dominant data).
#' @param order.G Genotypes must be ordered in G slot? (codominant data) 
#' Default FALSE. If true alleles are ordered in ascending order. 
#' @param ploidy Ploidy of the G data frame. Default ploidy = 2.
#' @param type Marker type: "codominant" or "dominant".
#' @param sep Character separating alleles (codominant data). 
#' Default option is no character separating alleles. 
#' @param ncod Number of characters coding each allele (codominant data).
#' @param missing Missing data treatment ("NA", "0", or "MEAN") for the A
#' slot. Missing elements are set to NA in the default option. missing elements
#' are recoded as 0 or the mean allelic frequency across individuals in "0" 
#' and "MEAN" options, respectively. 
#' @param NA.char Character simbolizing missing data in the input. Default is "NA".
#' @param poly.level Polymorphism threshold in percentage (0 - 100), 
#' for remotion of non polymorphic loci (for dominant data). Default is 5 (5\%).
#' @param rm.empty.ind Remotion of noinformtive individuals (row of "NAs").
#' Default if FALSE. This option is only available when the 'lock.rows' parameter is FALSE.
#' @param order.df Order individuals of data frames by row? (all data frames with a same order in row names).
#'  This option is only available when the 'lock.rows' parameter is TRUE. 
#' If the names of the data frames are not used (i.e., set.names and valid.names are not NULL),
#' setting this parameter to TRUE/FALSE has no effect in the function. 
#' Defalut TRUE. If FALSE, the row names of all the data frames must be ordered. The use of data frames 
#' with row names in different order will return an error.
#' In both cases, the program sets an internal names attribute of the object
#' using the row names of the first non-empty data frame found in the following order: 
#' XY, P, G, E, S, C. This attribute is used as reference to order rows when order.df = TRUE. 
#' @param set.names Character vector with names for the rows of the non-empty data frames. 
#' This argument is incompatible with valid.names
#' @param valid.names Logical. Create valid row names? This argument is incompatible with 
#' set.names. The program will name individuals with valid tags I.1, I.2, etc.
#' @param lock.rows Turn on row names check. Data frames require indentical individuals in rows.
#' Default TRUE.

#' @details This is a generic function for creation of ecogen objects.
#' In the default option, missing data should be coded as "NA", but any missing 
#' data character can be passed with the option NA.char. 
#' In all the cases, the new object will have a slot G coding the missing data as NA. 
#' For dominant markers (0/1 coding), the slot A is unnecesary an it is treated
#' by ecogen methods as a symbolic link to G.
#' 
#' 
#' \strong{ACCESS TO THE SLOTS. MODIFICATION OF ECOGEN OBJECTS}
#' 
#' The content of the slots can be extracted with the corresponding accessors
#' ecoslot.XY, ecoslot.P, ecoslot.G, ecoslot.A, ecoslot.E, ecoslot.C and ecoslot.OUT. 
#' Accessors can be also used to assign data to the slots.
#' The correct use of ecogen objects requires the implementation of accessors, 
#' as they ensure the checking and pre-processing of the data. The use of accessors allows to
#' modify or fill the slots of ecogen objects, without the need of creating a new
#' object each time. See \emph{help("EcoGenetics accessors")} for a detailed description 
#' and examples about ecogen accessors. 
#' 
#' \strong{OTHER SLOT ACCESS METHODS FOR ECOGEN OBJECTS}
#' 
#' The use of brackets is defined for ecogen objects:
#' 
#'  - Single bracket: the single bracket ("[") is used to subset all the ecogen
#'  data frames (P, G, E, S, A and C) by row, at once. The notation for an object
#'  is eco[from:to], where eco is any ecogen object, and from: to is the row
#'  range. For example: eco[1:10] , subsets the object eco from row 1 to row 10, 
#'  for all the data frames at once.
#'  
#' - Double square brackets: the double square brackets are symbolic abbreviations 
#' of the accessors (i.e., it is a call to the corresponding accessor). 
#' The usage is: eco[["X"]], where X is a slot: eco[["P"]], 
#' eco[["G"]], eco[["A"]], eco[["E"]], eco[["S"]], eco[["C"]] and eco[["OUT"]].
#' Double square brackets can be used in get/set mode. 
#' See Examples below and in help("EcoGenetics accessors").
#' 
#' 
#' \bold{ABOUT THE CONSTRUCTION OF NEW ECOGEN OBJECTS}
#' 
#' A new ecogen object can be constructed in two different ways. 
#' First, a new object can be created, incorporating all the information 
#' at once. Second, the data can be added in each slot, using the corresponding
#' accessor / "[[". Accessor/double square brackets methods allow temporal modification 
#' of any ecogen object and ensure the modularity of this kind of objets. 
#' These methods are not only functions used to get/assign values to the slots, 
#' they provide a basic pre-processing of the data during assignment, 
#' generating a coherent and valid set of information.
#' 
##' \bold{LOCKED AND UNLOCKED OBJECTS}
#' #' Starting from version 1.2.1.5, ecogen and ecopop objects can be "locked" 
#' (default) or "unlocked". A "locked" object must have identical number of rows and 
#' row names in all the input data frames (or a rule must be provided to construct
#' the row names in case of ecogen objects, with valid.names or set.names arguments).
#' An unlocked objects allows to have a free number of rows
#' in each table, and row names do not need to coincide among tables. See 
#' examples below.
#' 
#' 
#' @author Leandro Roser \email{learoser@@gmail.com}
#' 
#' @examples
#' \dontrun{
#' 
#' # Example with G data of class "data.frame", corresponding to
#' # microsatellites of a diploid organism:
#' data(eco.test)
#' eco <- ecogen(XY = coordinates, P = phenotype, G = genotype,
#' E = environment, S = structure)
#' 
#' # Example with G data of class "data.frame", corresponding to a
#' # presence - absence molecular marker:
#' dat <- sample(c(0,1),100,rep = TRUE)
#' dat <- data.frame(matrix(dat,10,10))
#' eco <- ecogen(G = dat, type = "dominant")
#' 
#' 
#' # DINAMIC ASSIGNMENT WITH ACCESSORS AND "[["
#' 
#' eco <- ecogen(XY = coordinates, P = phenotype)
#' eco
#' 
#' ecoslot.G(eco, order.G = TRUE) <- genotype
#' 
#' # this is identical to
#' eco[["G", order.G=TRUE]] <- genotype
#' 
#' ecoslot.E(eco) <- environment
#' 
#' # this is identical to
#' eco[["E"]] <- environment
#' 
#' #----------------------------------------------------------
#' # See additional examples in help("EcoGenetics accessors")
#' #----------------------------------------------------------
#' 
#' # Storing data in the slot OUT
#' 
#'  singers <- c("carlos_gardel", "billie_holiday")
#'  
#' ecoslot.OUT(eco) <- singers
#'  
#' # Storing several datasets
#'
#' golden.number <- (sqrt(5) + 1) / 2
#' ecoslot.OUT(eco) <- list(singers, golden.number)    # several objects must be passed as a list
#' 
#' # this is identical to:
#' 
#' eco[["OUT"]] <- list(singers, golden.number)
#' 
#' #---------------------------------
#' # Locked and unlocked objects
#' #---------------------------------
#' 
#' is.locked(eco) # check if object is locked
#' 
#' eco[["P"]] <- rbind(eco[["P"]], eco[["P"]]) # invalid 
#'                                             # in locked object
#' 
#' 
#' eco_unlocked <- eco.unlock(eco) #unlocked object
#' eco_unlocked[["P"]]<-rbind(eco[["P"]], eco[["P"]]) # valid now
#' 
#' new_locked <- eco.lock(eco_unlocked) # invalid
#' eco_unlocked[["P"]]<- eco[["P"]]
#' new_locked <- eco.lock(eco_unlocked) # valid now
#' 
#' }
#' 
#' @author Leandro Roser \email{learoser@@gmail.com}
#' @export ecogen


setGeneric("ecogen",      	 
           function(XY = data.frame(),
                    P = data.frame(),
                    G = data.frame(), 
                    E = data.frame(),
                    S = data.frame(),
                    C = data.frame(),
                    G.processed = TRUE,
                    order.G = FALSE,
                    ploidy = 2,
                    type = c("codominant", "dominant"),
                    sep = "", 
                    ncod = NULL,
                    missing = c("NA", "0", "MEAN"),
                    NA.char = "NA", 
                    poly.level = 5,
                    rm.empty.ind = FALSE,
                    order.df = TRUE,
                    set.names = NULL,
                    valid.names = FALSE,
                    lock.rows = TRUE) {				
             
          
             # general configuration
             type <- tolower(type)
             type <- match.arg(type)
             missing <- toupper(as.character(missing))
             missing <- match.arg(missing)

             # names configuration
             if(!is.null(set.names) && valid.names) {
               stop("incompatible arguments: only one of <valid.names, set.names> can be set")
             }
             
             # creating a new ecogen object
             object <- new("ecogen")
             # set object environment
             object@ATTR$whereIs <- parent.frame()
             object@ATTR$.call <- match.call()
             
             # G configuration-------------------------------------------------#
             
             if(any(dim(G) == 0)) { # empty G
               object@G <- data.frame()
               object@A <- matrix(nrow = 0, ncol = 0)
               object@INT <- new("int.gendata")
               
               
             } else { # non empty G
               
               ## coherence between data ploidy and ncod is checked for int.df2genind
               
               temporal_int_genind <- int.df2genind(G, 
                                      sep = sep, 
                                      ncod =  ncod,
                                      NA.char = NA.char, 
                                      ploidy = ploidy, 
                                      type = type,
                                      missing = missing,
                                      rm.empty.ind = rm.empty.ind,
                                      poly.level = poly.level,
                                      lock.rows = lock.rows)
             
               # unfolding temporal_int_genind ------------------
              
               ## if marker type is "dominant", A is a pointer to G for assignments
               ## and extraction methods, and the slot is empty
               if(type == "codominant") {
                 
               # matrix is lighter than data frame. LR 9/12/2016
               object@A <- temporal_int_genind@tab
               }  else {
               object@G <- as.data.frame(temporal_int_genind@tab)
               }
 
               object@INT <- int.genind2gendata(temporal_int_genind)
               
               ncod <- temporal_int_genind@ncod
               ploidy <- temporal_int_genind@ploidy
               
               # G processed case ~-~-~-~-~~-~-~-~-~
               if(G.processed) {
                 tmp <- int.genind2df(temporal_int_genind, sep = sep, NA.char = NA.char)
                 # order data
                 if(order.G && type == "codominant") {
                   tmp <- aue.sort(tmp, 
                                   ncod = ncod,
                                   ploidy = ploidy, 
                                   sep.loc = sep,
                                   chk.plocod = FALSE)
                 } 
                 
                 # G processed data frame
                 G <- as.data.frame(tmp)
                 
                 # G changes messages 
                 if(dim(tmp)[1] != dim(G)[1]) {
                   message("Note: removed noninformative individuals in slot G")
                 }
                 if(dim(tmp)[2] != dim(G)[2]) {
                   message("Note: removed noninformative loci in slot G")
                 }
                 if(order.G) {
                   message("Note: ordered genotypes in slot G")
                 }
               } 
               
               # fill now the G slot for
               object@G <-  G
             }
             
             
             
             # fill the other slots--------------------------------------------
             
             object@XY <- as.data.frame(XY)
             object@P <- as.data.frame(P)
             object@E <- as.data.frame(E)
             
             # all S columns as factors
             S <- as.data.frame(S)
             if(dim(S)[1] != 0)  S[] <- lapply(S, factor)
               # better the above way. 2016/01/04 L.R.
               # 'factor' is better than 'as.factor' because
               # it drops unused levels.
              #   for(i in 1:(ncol(S))) {
            #     S[, i] <- factor(S[, i])
            #   }
            
             
             object@S <- S
             object@C <- as.data.frame(C)
             
             if(!lock.rows) {
               object@ATTR$names <- character(0)
               object@ATTR$lock.rows <- FALSE
             } else {
               
               object.names <- list(XY=rownames(object@XY), 
                                    P=rownames(object@P), 
                                    G=rownames(object@G), 
                                    A=rownames(object@A),
                                    E=rownames(object@E),
                                    S=rownames(object@S), 
                                    C=rownames(object@C))
             
             # set names--------------------------------------------
             # case: use data frames names---->
             if(is.null(set.names) && !valid.names) {
               
               while(TRUE) {
                 
                 if(nrow(object@XY) != 0) {
                   object@ATTR$names <- object.names$XY
                   break
                 }
                 if(nrow(object@P) != 0) {
                   object@ATTR$names <- object.names$P
                   break
                 }
                 if(nrow(object@G) != 0) {
                   object@ATTR$names <- object.names$G
                   break
                 }
                 if(nrow(object@E) != 0) {
                   object@ATTR$names <- object.names$E
                   break
                 }
                 if(nrow(object@S) != 0) {
                   object@ATTR$names <- object.names$S
                   break
                 }
                 if(nrow(object@C) != 0) {
                   object@ATTR$names <- object.names$C
                   break
                 }
                 object@ATTR$names <- character(0)
                 break
               }
               
             # order rows
             if(order.df && lock.rows) {
               object <- int.order(object)
             } else if(order.df && !lock.rows) {
               message("Note: data frames will not be sorted by row in an unlock object\n")
             }
               
            # case: use set.names or valid.names---->
             } else {
               # use nrow method
               
               object.names <- object.names[unlist(lapply(object.names,
                                                          function(x) length(x)  != 0))]
               
               if(length(object.names) != 0) {
                 rownumber <- unique(unlist(lapply(object.names, length)))
                 # check nrow consistency
                 if(length(rownumber)> 1) {
                   stop("Non unique row number found")
                 }
                 
                 # set.names case --
                 if(!is.null(set.names)) {
  
                    #check length consistency
                   if(length(set.names) != rownumber) {
                     stop("the length of valid.names do not match 
                          with the number of rows in the object")
                   }
                   
                   the.names <- set.names
                   
                   # valid.names case --
                 } else if(valid.names) {
                   the.names <- paste0("I.", seq_len(rownumber))
                 }
                  
                 # set data frames names and object names --
                 for(i in names(object.names)) {
                  eval(expr = parse(text=paste0("rownames(object@", i, ") <- the.names")))
                 }

                 object@ATTR$names <- the.names
                 
               } 
             } # end set names
               
             }
              
             # check validity 
             validObject(object)
             
             object
             
           })
Any scripts or data that you put into this service are public.
EcoGenetics documentation built on July 8, 2020, 5:46 p.m.
rdrr.io home R language documentation Run R code online
CRAN packages Bioconductor packages R-Forge packages GitHub packages
Note that we can't provide technical support on individual packages. You should contact the package authors for that.
EcoGenetics
Management and Exploratory Analysis of Spatial Data in Landscape Genetics

R/ecogen.2OF6.constructor.R
In EcoGenetics: Management and Exploratory Analysis of Spatial Data in Landscape Genetics

Try the EcoGenetics package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

EcoGenetics Management and Exploratory Analysis of Spatial Data in Landscape Genetics

R/ecogen.2OF6.constructor.R In EcoGenetics: Management and Exploratory Analysis of Spatial Data in Landscape Genetics

Try the EcoGenetics package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

EcoGenetics
Management and Exploratory Analysis of Spatial Data in Landscape Genetics

R/ecogen.2OF6.constructor.R
In EcoGenetics: Management and Exploratory Analysis of Spatial Data in Landscape Genetics