R/gl.report.heterozygosity.r
In dartR: Importing and Analysing 'SNP' and 'Silicodart' Data Generated by Genome-Wide Restriction Fragment Analysis

Documented in gl.report.heterozygosity

#' @name gl.report.heterozygosity
#' @title Reports observed, expected and unbiased heterozygosities and FIS
#' (inbreeding coefficient) by population or by individual from SNP data
#' @description Calculates the observed, expected and unbiased expected (i.e.
#' corrected for sample size) heterozygosities and FIS (inbreeding coefficient)
#' for each population or the observed heterozygosity for each individual in a
#' genlight object.
#'
#' @param x Name of the genlight object containing the SNP [required].
#' @param method Calculate heterozygosity by population (method='pop') or by
#' individual (method='ind') [default 'pop'].
#' @param n.invariant An estimate of the number of invariant sequence tags used
#' to adjust the heterozygosity rate [default 0].
#' @param plot.out Specify if plot is to be produced [default TRUE].
#' @param plot_theme Theme for the plot. See Details for options
#' [default theme_dartR()].
#' @param plot_colors_pop A color palette for population plots or a list with
#' as many colors as there are populations in the dataset
#' [default discrete_palette].
#' @param plot_colors_ind List of two color names for the borders and fill of
#' the plot by individual [default two_colors].
#' @param save2tmp If TRUE, saves any ggplots and listings to the session
#' temporary directory (tempdir) [default FALSE].
#' @param verbose Verbosity: 0, silent or fatal errors; 1, begin and end; 2,
#' progress log; 3, progress and results summary; 5, full report
#' [default NULL, unless specified using gl.set.verbosity].
#'
#' @details
#' Observed heterozygosity for a population takes the proportion of
#' heterozygous loci for each individual then averages over the individuals in
#' that population. The calculations take into account missing values.
#'
#' Expected heterozygosity for a population takes the expected proportion of
#' heterozygotes, that is, expected under Hardy-Weinberg equilibrium, for each
#' locus, then averages this across the loci for an average estimate for the
#' population.
#'
#' Observed heterozygosity for individuals is calculated as the proportion of
#' loci that are heterozygous for that individual.
#'
#' Finally, the loci that are invariant across all individuals in the dataset
#' (that is, across populations), is typically unknown. This can render
#' estimates of heterozygosity analysis specific, and so it is not valid to
#' compare such estimates across species or even across different analyses. This
#' is a similar problem faced by microsatellites. If you have an estimate of the
#' number of invariant sequence tags (loci) in your data, such as provided by
#' \code{\link{gl.report.secondaries}}, you can specify it with the n.invariant
#' parameter to standardize your estimates of heterozygosity.
#'
#' \strong{NOTE}: It is important to realise that estimation of adjusted
#' heterozygosity requires that secondaries not to be removed.
#'
#' Heterozygosities and FIS (inbreeding coefficient) are calculated by locus
#' within each population using the following equations:
#' \itemize{
#' \item Observed heterozygosity (Ho) = number of homozygotes / n_Ind,
#' where n_Ind is the number of individuals without missing data.
#' \item Observed heterozygosity adjusted (Ho.adj) <- Ho * n_Loc /
#'  (n_Loc + n.invariant),
#' where n_Loc is the number of loci that do not have all missing data  and
#' n.invariant is an estimate of the number of invariant loci to adjust
#' heterozygosity.
#' \item Expected heterozygosity (He) = 1 - (p^2 + q^2),
#' where p is the frequency of the reference allele and q is the frequency of
#' the alternative allele.
#' \item Expected heterozygosity adjusted (He.adj) = He * n_Loc /
#' (n_Loc + n.invariant)
#' \item Unbiased expected heterozygosity (uHe) = He * (2 * n_Ind /
#' (2 * n_Ind - 1))
#' \item Inbreeding coefficient (FIS) = 1 - (mean(Ho) / mean(uHe))
#' }
#'
#'\strong{ Function's output }
#'
#' Output for method='pop' is an ordered barchart of observed heterozygosity,
#' unbiased expected heterozygosity and FIS (Inbreeding coefficient) across populations
#' together with a table of mean observed and expected heterozygosities and FIS
#' by population and their respective standard deviations (SD).
#' 
#' In the output, it is also reported by population: the number of loci used to
#'  estimate heterozygosity(nLoc), the number of polymorphic loci (polyLoc), 
#'  the number of monomorphic loci (monoLoc) and loci with all missing data
#'   (all_NALoc).
#'
#' Output for method='ind' is a histogram and a boxplot of heterozygosity across
#' individuals.
#'
#'  Plots and table are saved to the session temporary directory (tempdir)
#'
#'  Examples of other themes that can be used can be consulted in \itemize{
#'  \item \url{https://ggplot2.tidyverse.org/reference/ggtheme.html} and \item
#'  \url{https://yutannihilation.github.io/allYourFigureAreBelongToUs/ggthemes/}
#'  }
#'
#' @return A dataframe containing population labels, heterozygosities, FIS,
#' their standard deviations and sample sizes
#'
#' @author Custodian: Luis Mijangos (Post to
#' \url{https://groups.google.com/d/forum/dartr})
#'
#' @examples
#'  \donttest{
#' require("dartR.data")
#' df <- gl.report.heterozygosity(platypus.gl)
#' df <- gl.report.heterozygosity(platypus.gl,method='ind')
#' n.inv <- gl.report.secondaries(platypus.gl)
#' gl.report.heterozygosity(platypus.gl, n.invariant = n.inv[7, 2])
#' }
#' df <- gl.report.heterozygosity(platypus.gl)
#'
#' @seealso \code{\link{gl.filter.heterozygosity}}
#'
#' @family report functions
#' @export

gl.report.heterozygosity <- function(x,
                                     method = "pop",
                                     n.invariant = 0,
                                     plot.out = TRUE,
                                     plot_theme = theme_dartR(),
                                     plot_colors_pop = discrete_palette,
                                     plot_colors_ind = two_colors,
                                     save2tmp = FALSE,
                                     verbose = NULL) {
    # SET VERBOSITY
    verbose <- gl.check.verbosity(verbose)
    
    # FLAG SCRIPT START
    funname <- match.call()[[1]]
    utils.flag.start(func = funname,
                     build = "Jackson",
                     verbosity = verbose)
    
    # CHECK DATATYPE
    datatype <-
        utils.check.datatype(x, accept = "SNP", verbose = verbose)
    
    # FUNCTION SPECIFIC ERROR CHECKING
    
    if (!(method == "pop" | method == "ind")) {
        cat(
            warn(
                "Warning: Method must either be by population or by individual,
                set to method='pop'\n"
            )
        )
        method <- "pop"
    }
    
    if (n.invariant < 0) {
        cat(warn(
            "Warning: Number of invariant loci must be non-negative, set to 
            zero\n"
        ))
        n.invariant <- 0
        if (verbose == 5) {
            cat(
                report(
                    "  No. of invariant loci can be esimated using 
                    gl.report.secondaries\n"
                )
            )
        }
    }
    
    if (any(grepl(x@other$history, pattern = "gl.filter.secondaries") == TRUE) &
        n.invariant > 0) {
        cat(
            warn(
                "  Warning: Estimation of adjusted heterozygosity requires that 
                secondaries not to be removed. A gl.filter.secondaries call was 
                found in the history. This may cause the results to be 
                incorrect\n"
            )
        )
    }
    
    # DO THE JOB
    
    ########### FOR METHOD BASED ON POPULATIONS
    
    if (method == "pop") {
        # Set a population if none is specified (such as if the genlight object
        # has been generated manually)
        if (is.null(pop(x)) |
            is.na(length(pop(x))) | length(pop(x)) <= 0) {
            if (verbose >= 2) {
                cat(
                    warn(
                        "  No population assignments detected,
                             individuals assigned to a single population
                        labelled 'pop1'\n"
                    )
                )
            }
            pop(x) <- array("pop1", dim = nInd(x))
            pop(x) <- as.factor(pop(x))
        }
        
        # Split the genlight object into a list of populations
        sgl <- seppop(x)
        
        # OBSERVED HETEROZYGOSITY
        if (verbose >= 2) {
            cat(
                report(
                    "  Calculating Observed Heterozygosities, averaged across 
                    loci, for each population\n"
                )
            )
        }
        
        # Calculate heterozygosity for each population in the list
        # CP = Carlo Pacioni CP ###
        Ho.loc <-
            lapply(sgl, function(x)
                colMeans(as.matrix(x) == 1, na.rm = TRUE))
        ##########
        
        Ho <-
            unlist(lapply(sgl, function(x)
                mean(
                    colMeans(as.matrix(x) == 1, na.rm = TRUE), na.rm = TRUE
                )))
        
        ### CP ### observed heterozygosity standard deviation
        HoSD <-
            unlist(lapply(sgl, function(x)
                sd(
                    colMeans(as.matrix(x) == 1, na.rm = TRUE), na.rm = TRUE
                )))
        ##########
        
        # Calculate the number of loci that are not all NAs CP ###
        n_loc <-
            unlist(lapply(sgl, function(x)
                sum(!(
                    colSums(is.na(as.matrix(x))) == nrow(as.matrix(x))
                ))))
        ##########
        
        # calculate the number of polymorphic and monomorphic loci by population
        poly_loc <- NULL
        mono_loc <- NULL
        all_na_loc <- NULL
        
        for(y in 1:length(sgl)){

          y_temp <- sgl[[y]]
          hold <- y_temp
          mono_tmp <- gl.alf(y_temp)
          loc.list <- rownames(mono_tmp[which(mono_tmp$alf1==1 |
                                                mono_tmp$alf1 == 0),])
          loc.list_NA <- rownames(mono_tmp[which(is.na(mono_tmp$alf1)),])

          # Remove NAs from list of monomorphic loci and loci with all NAs
          loc.list <- loc.list[!is.na(loc.list)]

          # remove monomorphic loci and loci with all NAs
          if (length(loc.list) > 0) {
            y_temp <- gl.drop.loc(y_temp, loc.list = loc.list, verbose = 0)
          }

          poly_loc <-  c(poly_loc, nLoc(y_temp))  
          mono_loc <- c(mono_loc, (nLoc(hold) - nLoc(y_temp)))
          all_na_loc <- c(all_na_loc, length(loc.list_NA))

        }
     
        # Apply correction CP ###
        Ho.adj <- Ho * n_loc / (n_loc + n.invariant)
        # Manually compute SD for Ho.adj sum of the square of differences from
        # the mean for polymorphic sites plus sum of the square of differences
        #(which is the Ho.adj because Ho=0) from the mean for invariant sites
        Ho.adjSD <-
            sqrt((
                mapply(function(x, Mean)
                    sum((x - Mean) ^ 2, na.rm = TRUE), Ho.loc, Mean = Ho.adj) +
                  n.invariant * Ho.adj ^
                    2
            ) / (n_loc +
                     n.invariant - 1))
        
        ind.count <- function(x) {
            # the loci that are completely missing
            loci.na <-
                which(colSums(is.na(as.matrix(x))) == nrow(as.matrix(x)))
            # the number of samples in the matrix the number of non-genotyped
            # samples remove the loci that are completely missing
            if (length(loci.na) > 0) {
                nind <-
                    mean(nrow(as.matrix(x)) - 
                           colSums(is.na(as.matrix(x)))[-loci.na])
                # the number of samples in the matrix the number of
                # non-genotyped samples
            } else {
                nind <- mean(nrow(as.matrix(x)) - colSums(is.na(as.matrix(x))))
            }
            
            return(nind)
        }
        
        n_ind <- sapply(sgl, ind.count)
        
        ##########
        
        # EXPECTED HETEROZYGOSITY
        if (verbose >= 2) {
            cat(report("  Calculating Expected Heterozygosities\n\n"))
        }
        
        Hexp <- array(NA, length(sgl))
        Hexp.adj <- array(NA, length(sgl))
        
        ### CP ###
        HexpSD <- array(NA, length(sgl))
        uHexp <- array(NA, length(sgl))
        uHexpSD <- array(NA, length(sgl))
        Hexp.adjSD <- array(NA, length(sgl))
        ##########
        
        FIS <- array(NA, length(sgl))
        FISSD <- array(NA, length(sgl))
        
        # For each population
        for (i in 1:length(sgl)) {
            gl <- sgl[[i]]
            gl <- utils.recalc.freqhomref(gl, verbose = 0)
            gl <- utils.recalc.freqhomsnp(gl, verbose = 0)
            gl <- utils.recalc.freqhets(gl, verbose = 0)
            p <- gl@other$loc.metrics$FreqHomRef
            q <- gl@other$loc.metrics$FreqHomSnp
            hets <- gl@other$loc.metrics$FreqHets
            p <- (2 * p + hets) / 2
            q <- (2 * q + hets) / 2
            H <- 1 - (p ^ 2 + q ^ 2)
            
            ### CP ### Unbiased He (i.e. corrected for sample size) hard
            # coded for diploid
            uH <-
                (2 * as.numeric(n_ind[i]) / (2 * as.numeric(n_ind[i]) - 1)) * H
            ##########
            
            Hexp[i] <- mean(H, na.rm = T)
            ### CP ###
            uHexp[i] <- mean(uH, na.rm = T)
            Hexp.adj[i] <-
                Hexp[i] * n_loc[i] / (n_loc[i] + n.invariant)
            HexpSD[i] <- sd(H, na.rm = T)
            uHexpSD[i] <- sd(uH, na.rm = T)
            Hexp.adjSD[i] <-
                sqrt((sum((
                    H - Hexp.adj[i]
                ) ^ 2, na.rm = TRUE) + n.invariant * Hexp.adj[i] ^ 2) / 
                  (n_loc[i] + n.invariant - 1))
            ##########
            FIS_temp <- 1 - (mean(unlist(Ho.loc[i]),na.rm = T) / 
                               mean(uH,na.rm = T))
            FIS[i] <- mean(FIS_temp, na.rm = T)
            #FISSD[i] <- sd(FIS_temp, na.rm = T)
        }
        
        ### CP ###
        df <-
            data.frame(
                pop = popNames(x),
                nInd = round(n_ind,6),
                nLoc = n_loc,
                nLoc.adj = n_loc / (n_loc + n.invariant),
                polyLoc = poly_loc ,
                monoLoc = mono_loc , 
                all_NALoc = all_na_loc, 
                Ho = as.numeric(Ho),
                HoSD = HoSD,
                Ho.adj = as.numeric(Ho.adj),
                Ho.adjSD = Ho.adjSD,
                He = round(Hexp, 6),
                HeSD = round(HexpSD, 6),
                uHe = round(uHexp, 6),
                uHeSD = round(uHexpSD, 6),
                He.adj = round(Hexp.adj, 8),
                He.adjSD = round(Hexp.adjSD, 8),
                FIS = FIS
                # FISSD = FISSD
            )
        ##########
        
        if (plot.out) {
            value <- color <- variable <- He.adj <- NULL
            # printing plots and reports assigning colors to populations
            if (is(plot_colors_pop, "function")) {
                colors_pops <- plot_colors_pop(length(levels(pop(x))))
            }
            
            if (!is(plot_colors_pop,"function")) {
                colors_pops <- plot_colors_pop
            }
            
            if (n.invariant == 0) {
                df.ordered <- df
                df.ordered$color <- colors_pops
                df.ordered <- df.ordered[order(df.ordered$Ho),]
                df.ordered$pop <-
                    factor(df.ordered$pop, levels = df.ordered$pop)
                df.ordered <-
                    df.ordered[, c("pop", "nInd", "Ho", "uHe", "FIS", "color")]
                df.ordered <-
                    reshape2::melt(df.ordered, id = c("pop", "color", "nInd"))
                
                colors_pops_plot <-
                    df.ordered[, c("pop", "color", "variable", "value")]
                colors_pops_plot <-
                    colors_pops_plot[colors_pops_plot$variable == "Ho", ]
                colors_pops_plot <-
               colors_pops_plot[order(as.character(colors_pops_plot$value)), ]
                
                p3 <-
                    ggplot(df.ordered,
                           aes(x = pop,
                               y = value)) + geom_bar(
                                   position = "dodge2",
                                   stat = "identity",
                                   color = "black",
                                   fill = rep(colors_pops_plot$color, each =
                                                  3)
                               ) +
                    scale_x_discrete(labels = paste(
                        df.ordered$pop,
                        round(df.ordered$nInd,
                              0),
                        sep = " | "
                    )) +
                    geom_text(
                        label = rep(c("Ho", "uHe", "FIS"), length(unique(
                            df.ordered$pop
                        ))),
                        position = position_dodge2(width = 0.9),
                        stat = "identity",
                        vjust = -0.50,
                        size = 4,
                        inherit.aes = TRUE, fontface = "bold"
                    ) +
                    
                    plot_theme +
                    theme(
                        axis.ticks.x = element_blank(),
                        axis.text.x = element_text(
                            angle = 90,
                            hjust = 1,
                            face = "bold",
                            size = 12
                        ),
                        axis.title.x = element_blank(),
                        axis.ticks.y = element_blank(),
                        axis.title.y = element_blank(),
                        legend.position = "none"
                    ) + labs(fill = "Population") +
                    ggtitle("Heterozygosities and FIS by Population")
            } else {
                df.ordered <- df
                df.ordered$color <- colors_pops
                df.ordered <-
                    df.ordered[order(df.ordered$Ho.adj),]
                df.ordered$pop <-
                    factor(df.ordered$pop, levels = df.ordered$pop)
                p1 <-
                    ggplot(df.ordered, aes(
                        x = pop,
                        y = Ho.adj,
                        fill = pop
                    )) + geom_bar(position = "dodge",
                                  stat = "identity",
                                  color = "black") +
                    scale_fill_manual(values = df.ordered$color) + 
                  scale_x_discrete(labels = paste(
                        df.ordered$pop,
                        round(df.ordered$nInd, 0),
                        sep = " | "
                    )) + plot_theme + theme(
                        axis.ticks.x = element_blank(),
                        axis.text.x = element_blank(),
                        axis.title.x = element_blank(),
                        axis.ticks.y = element_blank(),
                        axis.title.y = element_blank(),
                        legend.position = "none"
                    ) + 
                  labs(fill = "Population") + 
                  ggtitle("Adjusted Observed Heterozygosity by Population")
                
                p2 <-
                    ggplot(df.ordered, aes(
                        x = pop,
                        y = He.adj,
                        fill = pop
                    )) + geom_bar(position = "dodge",
                                  stat = "identity",
                                  color = "black") +
                    scale_fill_manual(values = df.ordered$color) + 
                  scale_x_discrete(labels = paste(
                        df.ordered$pop,
                        round(df.ordered$nInd, 0),
                        sep = " | "
                    )) + plot_theme + theme(
                        axis.ticks.x = element_blank(),
                        axis.text.x = element_text(
                            angle = 90,
                            hjust = 1,
                            face = "bold",
                            size = 12
                        ),
                        axis.title.x = element_blank(),
                        axis.ticks.y = element_blank(),
                        axis.title.y = element_blank(),
                        legend.position = "none"
                    ) +
                    labs(fill = "Population") + 
                  ggtitle("Adjusted Expected Heterozygosity by Population")
                
                p3 <- (p1 / p2)
            }
        }
        
        # OUTPUT REPORT
        if (verbose >= 3) {
cat("  Reporting Heterozygosity by Population\n")
cat("\n  No. of loci =", nLoc(x), "\n")
cat("  No. of individuals =", nInd(x), "\n")
cat("  No. of populations =", nPop(x), "\n")
cat("    Minimum Observed Heterozygosity: ", round(min(df$Ho, na.rm = TRUE), 6))
            if (n.invariant > 0) {
          cat("   [Corrected:", round(min(df$Ho.adj, na.rm = TRUE), 6), "]\n")
            } else {
                cat("\n")
            }
cat("    Maximum Observed Heterozygosity: ", round(max(df$Ho, na.rm = TRUE), 6))
            if (n.invariant > 0) {
        cat("   [Corrected:", round(max(df$Ho.adj, na.rm = TRUE), 6), "]\n")
            } else {
                cat("\n")
            }
            cat("    Average Observed Heterozygosity: ",
                round(mean(df$Ho, na.rm = TRUE), 6))
            if (n.invariant > 0) {
      cat("   [Corrected:", round(mean(df$Ho.adj, na.rm = TRUE), 6), "]\n\n")
            } else {
                cat("\n\n")
            }
            cat("    Minimum Unbiased Expected Heterozygosity: ",
                round(min(df$uHe, na.rm = TRUE), 6))
            if (n.invariant > 0) {
           cat("   [Corrected:", round(min(df$He.adj, na.rm = TRUE), 6), "]\n")
            } else {
                cat("\n")
            }
            cat("    Maximum Unbiased Expected Heterozygosity: ",
                round(max(df$uHe, na.rm = TRUE), 6))
            if (n.invariant > 0) {
         cat("   [Corrected:", round(max(df$He.adj, na.rm = TRUE), 6), "]\n")
            } else {
                cat("\n")
            }
            cat("    Average Unbiased Expected Heterozygosity: ",
                round(mean(df$uHe, na.rm = TRUE), 6))
            if (n.invariant > 0) {
      cat("   [Corrected:", round(mean(df$He.adj, na.rm = TRUE), 6), "]\n\n")
            } else {
                cat("\n\n")
            }
            
            if (n.invariant > 0) {
                cat(
                    "  Average correction factor for invariant loci =",
                    mean(n_loc / (n_loc + n.invariant), na.rm = TRUE),
                    "\n"
                )
            } else {
                cat(
     "  Heterozygosity estimates not corrected for uncalled invariant loci\n"
                )
            }
        }
        
        # PRINTING OUTPUTS
        if (plot.out) {
            suppressWarnings(print(p3))
        }
        if (verbose >= 2) {
            if (n.invariant > 0) {
                print(df)
            } else {
                print(df[, c(
                    "pop",
                    "nInd",
                    "nLoc",
                    "polyLoc",
                    "monoLoc", 
                    "all_NALoc", 
                    "Ho",
                    "HoSD",
                    "He",
                    "HeSD",
                    "uHe",
                    "uHeSD",
                    "FIS"
                    # ,
                    # "FISSD"
                )], row.names = FALSE)
            }
        }
    }
    
    ########### FOR METHOD BASED ON INDIVIDUAL
    
    if (method == "ind") {
        if (verbose >= 2) {
            cat(report(
                "  Calculating observed heterozygosity for individuals\n"
            ))
            cat(report(
         "  Note: No adjustment for invariant loci (n.invariant set to 0)\n"
            ))
        }
        # Convert to matrix
        m <- as.matrix(x)
        
        # For each individual determine counts of hets, homs and NAs
        c.na <- array(NA, nInd(x))
        c.hets <- array(NA, nInd(x))
        c.hom0 <- array(NA, nInd(x))
        c.hom2 <- array(NA, nInd(x))
        for (i in 1:nInd(x)) {
            c.na[i] <- sum(is.na(m[i,]))
            c.hets[i] <-
                sum(m[i,] == 1, na.rm = TRUE) / (nLoc(x) - c.na[i])
            c.hom0[i] <-
                sum(m[i,] == 0, na.rm = TRUE) / (nLoc(x) - c.na[i])
            c.hom2[i] <-
                sum(m[i,] == 2, na.rm = TRUE) / (nLoc(x) - c.na[i])
        }
        
        # Join the sample sizes with the heterozygosities
        df <-
            cbind.data.frame(x@ind.names, c.hets, c.hom0, c.hom2)
        names(df) <-
            c("ind.name", "Ho", "f.hom.ref", "f.hom.alt")
        
        # Boxplot
        if (plot.out) {
            upper <- ceiling(max(df$Ho) * 10) / 10
            p1 <-
                ggplot(df, aes(y = Ho)) + 
        geom_boxplot(color = plot_colors_ind[1], fill = plot_colors_ind[2]) + 
              coord_flip() + 
              plot_theme +
                xlim(range = c(-1, 1)) +
              ylim(0, upper) + 
              ylab(" ") + 
         theme(axis.text.y = element_blank(),axis.ticks.y = element_blank()) +
                ggtitle("Observed Heterozygosity by Individual")
            
            # Histogram
            p2 <-
                ggplot(df, aes(x = Ho)) +
geom_histogram(bins =25,color = plot_colors_ind[1],fill =plot_colors_ind[2]) +
              coord_cartesian(xlim = c(0, upper)) +
              xlab("Observed heterozygosity") + 
              ylab("Count") +
              plot_theme
        }
        
        outliers_temp <-
            ggplot_build(p1)$data[[1]]$outliers[[1]]
        outliers <-
            data.frame(ID = as.character(df$ind.name[df$Ho %in% outliers_temp]), 
                       Ho = outliers_temp)
        
        # OUTPUT REPORT
        if (verbose >= 3) {
            cat("Reporting Heterozygosity by Individual\n")
            cat("No. of loci =", nLoc(x), "\n")
            cat("No. of individuals =", nInd(x), "\n")
            cat("  Minimum Observed Heterozygosity: ",
                round(min(df$Ho), 6),
                "\n")
            cat("  Maximum Observed Heterozygosity: ",
                round(max(df$Ho), 6),
                "\n")
            cat("  Average Observed Heterozygosity: ",
                round(mean(df$Ho), 6),
                "\n\n")
            cat("  Results returned as a dataframe\n\n")
            if (nrow(outliers) == 0) {
                cat("  No outliers detected\n\n")
            } else {
                cat("  Outliers detected\n")
                print(outliers)
                cat("\n")
            }
        }
        
        # PRINTING OUTPUTS
        if (plot.out) {
            p3 <- (p1 / p2) + plot_layout(heights = c(1, 4))
            print(p3)
        }
        if (verbose >= 2) {
            print(df, row.names = FALSE)
        }
    }
    
    # SAVE INTERMEDIATES TO TEMPDIR
    if (save2tmp) {
        # creating temp file names
        if (plot.out) {
            temp_plot <- tempfile(pattern = "Plot_")
        }
        temp_table <- tempfile(pattern = "Table_")
        match_call <-
            paste0(names(match.call()),
                   "_",
                   as.character(match.call()),
                   collapse = "_")
        # saving to tempdir
        if (plot.out) {
            saveRDS(list(match_call, p3), file = temp_plot)
            if (verbose >= 2) {
                cat(report("  Saving the ggplot to session tempfile\n"))
            }
        }
        saveRDS(list(match_call, df), file = temp_table)
        if (verbose >= 2) {
            cat(report("  Saving tabulation to session tempfile\n"))
            cat(
                report(
                    "  NOTE: Retrieve output files from tempdir using
                    gl.list.reports() and gl.print.reports()\n"
                )
            )
        }
    }
    
    if (verbose >= 3) {
        cat(report("  Returning a dataframe with heterozygosity values\n"))
    }
    
    # FLAG SCRIPT END
    if (verbose >= 1) {
        cat(report("Completed:", funname, "\n"))
    }
    
    # RETURN
    
    invisible(df)
}
Any scripts or data that you put into this service are public.
dartR documentation built on June 8, 2023, 6:48 a.m.
rdrr.io home R language documentation Run R code online
CRAN packages Bioconductor packages R-Forge packages GitHub packages
Note that we can't provide technical support on individual packages. You should contact the package authors for that.
dartR
Importing and Analysing 'SNP' and 'Silicodart' Data Generated by Genome-Wide Restriction Fragment Analysis

R/gl.report.heterozygosity.r
In dartR: Importing and Analysing 'SNP' and 'Silicodart' Data Generated by Genome-Wide Restriction Fragment Analysis

Defines functions gl.report.heterozygosity

Documented in gl.report.heterozygosity

Try the dartR package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

dartR Importing and Analysing 'SNP' and 'Silicodart' Data Generated by Genome-Wide Restriction Fragment Analysis

R/gl.report.heterozygosity.r In dartR: Importing and Analysing 'SNP' and 'Silicodart' Data Generated by Genome-Wide Restriction Fragment Analysis

Defines functions gl.report.heterozygosity

Documented in gl.report.heterozygosity

Try the dartR package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

dartR
Importing and Analysing 'SNP' and 'Silicodart' Data Generated by Genome-Wide Restriction Fragment Analysis

R/gl.report.heterozygosity.r
In dartR: Importing and Analysing 'SNP' and 'Silicodart' Data Generated by Genome-Wide Restriction Fragment Analysis