R/cellscape.R
In cellscape: Explores single cell copy number profiles in the context of a single cell tree

Documented in cellscape cellscapeOutput checkAlpha checkClonalPrev checkCloneColours checkGtypePositioning checkMinDims checkPerts checkRequiredInputs checkTreeEdges dfs_tree findMode getChromBounds getChromBoxInfo getCNVHeatmapForEachSC getEmptyGrid getGenomeLength getMutationsData getMutOrder getNBPPerPixel getTargetedHeatmapForEachSC processUserData renderCnvTree replaceSpaces

#' CellScape
#'
#' \code{cellscape} explores single cell copy number profiles in the 
#'     context of a single cell phylogeny.
#' 
#' Interactive components:
#'   \enumerate{
#' 
#'     \item Mouseover any single cell in the phylogeny to view its 
#'     corresponding genomic profile in the heatmap, and vice versa.
#' 
#'     \item Mouseover any part of the heatmap to view the CNV or VAF 
#'     value for that copy number segment or mutation site, respectively. 
#' 
#'     \item Mouseover any branch of the phylogeny to view downstream 
#'     single cells, both in the phylogeny and heatmap. 
#' 
#'     \item Mouseover any clone to view its corresponding single cells 
#'     in the phylogeny and heatmap. 
#' 
#'     \item Click any node in the phylogeny to flip the order of its 
#'     descendant branches. 
#' 
#'     \item Use the selection tool in the tool bar to select single cell 
#'     genomic profiles and view their corresponding single cells in the 
#'     phylogeny. 
#' 
#'     \item Use the tree trimming tool in the tool bar to remove any
#'      branch of the phylogeny by clicking it. 
#' 
#'     \item Use the switch view tool in the tool bar to change the 
#'     phylogeny view from force-directed to unidirectional, and vice 
#'     versa. 
#' 
#'     \item Use the re-root phylogeny tool to root the phylogeny at a 
#'     clicked node. 
#' 
#'     \item Use the flip branch tool to vertically rotate any branch by 
#'     clicking its root node.
#' 
#'     \item If present, use the scale tree/graph tool in the tool bar to 
#'     scale the phylogeny by the provided edge distances. 
#' 
#'     \item If time-series information is present such that the TimeScape 
#'     is displayed below the CellScape, clones and time points are 
#'     interactively linked in both views on mouseover. 
#' 
#'     \item Click the download buttons to download a PNG or SVG of the view. 
#'
#'   }
#'
#' Note: 
#' 
#' See TimeScape repo (https://bitbucket.org/MO_BCCRC/timescape) for more 
#'     information about TimeScape. 
#'
#'   
#' @import htmlwidgets
#'
#' @param cnv_data \code{data.frame} 
#'     (Required if not providing mut_data nor mut_data_matrix) 
#'     Single cell copy number segments data. Note that every single cell id
#'     must be present in the tree_edges data frame. Required columns are:
#'     \describe{
#'
#'       \item{single_cell_id:}{\code{character()} single cell id.}
#' 
#'       \item{chr:}{\code{character()} chromosome number.}
#' 
#'       \item{start:}{\code{numeric()} start position.}
#'
#'       \item{end:}{\code{numeric()} end position.}
#'
#'       \item{copy_number:}{\code{numeric()} copy number state.}
#'
#'     }
#' @param mut_data \code{data.frame} 
#'     (Required if not providing cnv_data nor mut_data_matrix) 
#'     Single cell targeted mutation data frame. Note that every single cell id 
#'     must be present in the tree_edges data frame. Required columns are:
#'     \describe{
#'
#'       \item{single_cell_id:}{\code{character()} single cell id.}
#' 
#'       \item{chr:}{\code{character()} chromosome number.}
#' 
#'       \item{coord:}{\code{numeric()} genomic coordinate.}
#'
#'       \item{VAF:}{\code{numeric()} variant allele frequency [0, 1].}
#'
#'     }
#' @param mut_data_matrix \code{matrix} 
#'     (Required if not providing cnv_data nor mut_data) 
#'     Single cell targeted mutation matrix. Rows are single cell IDs, 
#'     columns are mutations. Rows and columns must be named, column names 
#'     in the format "<chromosome>:<coordinate>". 
#'     Note that the order of these rows and columns will not be 
#'     preserved, unless mutation order is the same as that specified in the 
#'     mut_order parameter. Also note that every single cell id must be  
#'     present in the tree_edges data frame. 
#' @param mut_order \code{vector} (Optional) Mutation order for targeted 
#'     mutation heatmap (each mutation should consist of a string in the 
#'     form "chrom:coord"). Default will use a clustering function to 
#'     determine mutation order.
#' @param tree_edges \code{data.frame} Edges for the single cell phylogenetic 
#'     tree. Required columns are:
#'     \describe{
#'
#'       \item{source:}{\code{character()} edge source (single cell id).}
#' 
#'       \item{target:}{\code{character()} edge target (single cell id).}
#' 
#'     }
#'     Optional columns are:
#'     \describe{
#'
#'       \item{dist:}{\code{numeric()} edge distance.}
#' 
#'     }
#' @param gtype_tree_edges \code{data.frame} (Required for TimeScape) Genotype 
#'     tree edges of a rooted tree. Required columns are:
#'     \describe{
#'
#'       \item{source:}{\code{character()} source node id.}
#' 
#'       \item{target:}{\code{character()} target node id.}
#' 
#'     }
#' @param sc_annot \code{data.frame} (Required for TimeScape) Annotations 
#'     (genotype and sample id) for each single cell. Required columns are:
#'     \describe{
#'
#'       \item{single_cell_id:}{\code{character()} single cell id.}
#' 
#'       \item{genotype:}{\code{character()} genotype assignment.}
#' 
#'     }
#'     Optional columns are:
#'     \describe{
#'
#'       \item{timepoint:}{\code{character()} id of the sampled time point. 
#'           Note: time points will be ordered alphabetically. }
#' 
#'     }
#' @param clone_colours \code{data.frame} (Optional) Clone ids and their 
#'     corresponding colours (in hex format). Required columns are:
#'     \describe{
#'
#'       \item{clone_id:}{\code{character()} clone id.}
#' 
#'       \item{colour:}{\code{character()} the corresponding Hex colour 
#'          for each clone id.}
#'
#'     }
#' @param timepoint_title \code{character()} (Optional) Legend title for 
#'     timepoint groups. Default is "Timepoint".
#' @param clone_title \code{character()} (Optional) Legend title for clones. 
#'     Default is "Clone".
#' @param xaxis_title \code{character()} (Optional) For TimeScape - x-axis title. 
#'     Default is "Time Point".
#' @param yaxis_title \code{character()} (Optional) For TimeScape - y-axis title. 
#'     Default is "Clonal Prevalence".
#' @param phylogeny_title \code{character()} (Optional) For TimeScape - legend 
#'     phylogeny title. Default is "Clonal Phylogeny".
#' @param value_type \code{character()} (Optional) The type of value plotted in 
#'     heatmap - will affect legend and heatmap tooltips. Default is "VAF" for 
#'     mutation data, and "CNV" for copy number data.
#' @param node_type \code{character()} (Optional) The type of node plotted in 
#'     single cell phylogeny - will affect phylogeny tooltips. Default is "Cell".
#' @param display_node_ids \code{logical()} (Optional) Whether or not to display 
#'     the single cell ID within the tree nodes. Default is FALSE.
#' @param prop_of_clone_threshold \code{numeric()} (Optional) Used for the 
#'     ordering of targeted mutations. The minimum proportion of a clone to have 
#'     a mutation in order to consider the mutation as present within that clone. 
#'     Default is 0.2.
#' @param vaf_threshold \code{numeric()} (Optional) Used for the ordering of 
#'     targeted mutations. The minimum variant allele frequency for a mutation to 
#'     be considered as present within a single cell. Default is 0.05.
#' @param show_warnings \code{logical()} (Optional) Whether or not to show any 
#'     warnings. Default is TRUE.
#' @param width \code{numeric()} (Optional) Width of the plot.
#' @param height \code{numeric()} (Optional) Height of the plot.
#'
#' @export
#' @examples
#'
#'
#' library("cellscape")
#'
#' 
#' # EXAMPLE 1 - TARGETED MUTATION DATA
#'
#' # single cell tree edges
#' tree_edges <- read.csv(system.file("extdata", "targeted_tree_edges.csv", 
#'     package = "cellscape"))
#'
#' # targeted mutations
#' targeted_data <- read.csv(system.file("extdata", "targeted_muts.csv", 
#'     package = "cellscape"))
#'
#' # genotype tree edges
#' gtype_tree_edges <- data.frame("source"=c("Ancestral", "Ancestral", "B",
#'     "C", "D"), "target"=c("A", "B", "C", "D", "E"))
#'
#' # annotations
#' sc_annot <- read.csv(system.file("extdata", "targeted_annots.csv", 
#'     package = "cellscape"))
#'
#' # mutation order
#' mut_order <- scan(system.file("extdata", "targeted_mut_order.txt", 
#'     package = "cellscape"), what=character())
#'
#' # run cellscape
#' cellscape(mut_data=targeted_data, tree_edges=tree_edges, sc_annot = 
#'     sc_annot, gtype_tree_edges=gtype_tree_edges, mut_order=mut_order)
#'
#' 
#' # EXAMPLE 2 - COPY NUMBER DATA
#'
#' # single cell tree edges
#' tree_edges <- read.csv(system.file("extdata", "cnv_tree_edges.csv", 
#'     package = "cellscape"))
#'
#' # cnv segments data
#' cnv_data <- read.csv(system.file("extdata", "cnv_data.csv", package = 
#'     "cellscape"))
#'
#' # annotations
#' sc_annot <- read.csv(system.file("extdata", "cnv_annots.tsv", package = 
#'     "cellscape"), sep="\t")
#'
#' # custom clone colours
#' clone_colours <- data.frame( clone_id = c("1","2","3"), 
#'     colour = c("7fc97f", "beaed4", "fdc086"))
#'
#' # run cellscape
#' cellscape(cnv_data=cnv_data, tree_edges=tree_edges, sc_annot=sc_annot, 
#'     width=800, height=475, show_warnings=FALSE, 
#'     clone_colours = clone_colours)
cellscape <- function(cnv_data = NULL, 
                    mut_data = NULL, 
                    mut_data_matrix = NULL, 
                    mut_order = NULL,
                    tree_edges, 
                    gtype_tree_edges = NULL,
                    sc_annot = NULL, 
                    clone_colours = "NA",
                    timepoint_title = "Timepoint",
                    clone_title = "Clone",
                    xaxis_title = "Time Point",
                    yaxis_title = "Clonal Prevalence",
                    phylogeny_title = "Clonal Phylogeny",
                    value_type = NULL,
                    node_type = "Cell",
                    display_node_ids = FALSE, 
                    prop_of_clone_threshold = 0.2,
                    vaf_threshold = 0.05,
                    show_warnings = TRUE,
                    width = 900, 
                    height = 800) {

  # CHECK REQUIRED INPUTS ARE PRESENT 

  if (is.null(cnv_data) && is.null(mut_data) && is.null(mut_data_matrix)) {
    stop("User must provide either copy number data (parameter cnv_data)",
      " or mutation data (parameter mut_data or mut_data_matrix).")
  }
  if (!is.null(cnv_data) && (!is.null(mut_data) || !is.null(mut_data))) {
    stop("User can only provide copy number (parameter cnv_data) OR targeted mutations",
      " data (parameter mut_data or mut_data_matrix), not both.")
  }
  if (missing(tree_edges)) {
    stop("User must provide tree edge data (parameter tree_edges).")
  }

  # IF mut_data_matrix provided, convert it into mut_data data frame
  if (!missing(mut_data_matrix)) {
    if (is.null(rownames(mut_data_matrix))) {
      stop("mut_data_matrix must have row names.")
    }
    if (is.null(colnames(mut_data_matrix))) {
      stop("mut_data_matrix must have column names.")
    }
    mut_data <- setNames(melt(mut_data_matrix), c('rows', 'vars', 'values'))
    colnames(mut_data) <- c("single_cell_id", "mut", "VAF")
    mut_data$chr <- sapply(mut_data$mut, function(mut) {
        mut <- as.character(mut)
        if (!grepl(":", mut)) {
          stop("Mutation in mut_data_matrix row name is not in the correct format '<chromosome>:<coordinate>'.")
        }
        return (strsplit(mut, split=":")[[1]][1])
    })
    mut_data$coord <- sapply(mut_data$mut, function(mut) {
        mut <- as.character(mut)
        return (strsplit(mut, split=":")[[1]][2])
    })
    mut_data$mut <- NULL
  }

  # heatmap width (pixels)
  heatmapWidth <- (width/2) 

  # CLONE COLOURS

  if (is.data.frame(clone_colours)) {

    # ensure column names are correct
    if (!("clone_id" %in% colnames(clone_colours)) ||
        !("colour" %in% colnames(clone_colours))) {
      stop("Node colour data frame must have the following column names: ", 
          "\"clone_id\", \"colour\"")
    }
  }

  # GENOTYPE TREE EDGES
  if (!missing(gtype_tree_edges)) {

    # check genotype tree inputs

    gtype_tree_edges <- checkTreeEdges(gtype_tree_edges)

    # check that all genotypes in the annotations data are present in the genotype tree
    if (!missing(sc_annot)) {
      gtype_tree_gtypes <- unique(c(gtype_tree_edges$source, gtype_tree_edges$target))
      sc_annot_gtypes <- unique(sc_annot$genotype)
      gtypes_missing_from_gtype_tree <- setdiff(sc_annot_gtypes, gtype_tree_gtypes)
      if (length(gtypes_missing_from_gtype_tree) > 0) {
        stop("The following clone ID(s) are present in the single cell annotations data but ",
          "are missing from the genotype tree edges data: ",
          paste(gtypes_missing_from_gtype_tree, collapse=", "), ".")
      }
    }

    # get root of tree

    sources <- unique(gtype_tree_edges$source)
    targets <- unique(gtype_tree_edges$target)
    sources_for_iteration <- sources # because we will be changing the sources array over time
    for (i in 1:length(sources_for_iteration)) {
      cur_source <- sources_for_iteration[i]

      # if the source is a target, remove it from the sources list
      if (cur_source %in% targets) {
        sources <- sources[sources != cur_source]
      }
    }
    cur_root <- sources[1]

    # GET GENOTYPE TREE DFS

    dfs_gtype_tree <- dfs_tree(gtype_tree_edges, cur_root, c())
  }
  else {
    dfs_gtype_tree <- NULL
  }

  # VALUE TITLE (title for heatmap value legend)

  # not specified by user
  if (is.null(value_type)) {
    if (missing(mut_data)) {
      value_type <- "CNV"
    }
    else {
      value_type <- "VAF"
    }
  }
  # specified by user
  else {
    value_type <- as.character(value_type)
  }

  # CNV DATA

  # CNV data is provided
  if (missing(mut_data)) {

    # set heatmap type
    heatmap_type <- "cnv"

    # check it's a data frame
    if (!is.data.frame(cnv_data)) {
      stop("CNV data (parameter cnv_data) must be a data frame.")
    }

    # ensure column names are correct
    if (!("single_cell_id" %in% colnames(cnv_data)) ||
        !("chr" %in% colnames(cnv_data)) ||
        !("start" %in% colnames(cnv_data)) ||
        !("end" %in% colnames(cnv_data)) ||
        !("copy_number" %in% colnames(cnv_data))) {
      stop("CNV data frame (parameter cnv_data) must have the following column names: ", 
          "\"single_cell_id\", \"chr\", \"start\", \"end\", \"copy_number\"")
    }

    # ensure data is of the correct type
    cnv_data$single_cell_id <- as.character(cnv_data$single_cell_id)
    cnv_data$chr <- as.character(cnv_data$chr)
    cnv_data$start <- as.numeric(as.character(cnv_data$start))
    cnv_data$end <- as.numeric(as.character(cnv_data$end))
    cnv_data$copy_number <- as.numeric(as.character(cnv_data$copy_number))

    # change "23" to "X", "24" to "Y"
    cnv_data$chr[which(cnv_data$chr == "23")] <- "X"
    cnv_data$chr[which(cnv_data$chr == "24")] <- "Y"

    # determine whether the data is discrete or continuous
    cnvs_without_nas <- na.omit(cnv_data$copy_number)
    continuous_cnv <- !all(cnvs_without_nas == floor(cnvs_without_nas))

    # check that the number of single cells does not exceed the height of the plot
    n_scs <- length(unique(cnv_data$single_cell_id))
    if ((height - 45) < n_scs) { # - 45 for top bar height (30) and space between top bar and main view (15)
      stop("The number of single cells (",n_scs,") cannot exceed the plot height minus 45px (",
        (height - 45),"). Either reduce the number of cells, or increase the plot height.")
    }

    # get chromosomes, chromosome bounds (min & max bp), genome length
    chroms <- gtools::mixedsort(unique(cnv_data$chr))
    chrom_bounds <- getChromBounds(chroms, cnv_data) 
    genome_length <- getGenomeLength(chrom_bounds)

    # get cnv heatmap information for each cell
    n_bp_per_pixel <- getNBPPerPixel(heatmapWidth, chrom_bounds, genome_length) # number bps per pixel
    heatmap_info <- getCNVHeatmapForEachSC(cnv_data, chrom_bounds, n_bp_per_pixel)
    
    # get chromosome box information (chromosome legend)
    chrom_boxes <- getChromBoxInfo(chrom_bounds, n_bp_per_pixel)
  }

  # TARGETED MUTATIONS DATA

  # targeted mutations data is provided
  if (missing(cnv_data)) {

    # set heatmap type
    heatmap_type <- "targeted"

    # check it's a data frame
    if (!is.data.frame(mut_data)) {
      stop("Targeted mutations data (parameter mut_data) must be a data frame.")
    }

    # ensure column names are correct
    if (!("single_cell_id" %in% colnames(mut_data)) ||
        !("chr" %in% colnames(mut_data)) ||
        !("coord" %in% colnames(mut_data)) ||
        !("VAF" %in% colnames(mut_data))) {
      stop("Targeted mutations data frame must have the following column names: ", 
          "\"single_cell_id\", \"chr\", \"coord\", \"VAF\"")
    }

    # ensure data is of the correct type
    mut_data$single_cell_id <- as.character(mut_data$single_cell_id)
    mut_data$chr <- as.character(mut_data$chr)
    mut_data$coord <- as.numeric(as.character(mut_data$coord))
    mut_data$VAF <- as.numeric(as.character(mut_data$VAF))

    # change "23" to "X", "24" to "Y"
    mut_data$chr[which(mut_data$chr == "23")] <- "X"
    mut_data$chr[which(mut_data$chr == "24")] <- "Y"

    # get site name for each mutation
    mut_data$site <- paste(trimws(mut_data$chr), trimws(mut_data$coord), sep=":")

    # ensure VAF is between 0 and 1
    if (length(which(mut_data$VAF < 0)) > 0) {
      stop("You have entered mutation data with VAF < 0. Only enter data between 0 and 1 (NA is ok).")
    }
    if (length(which(mut_data$VAF > 1)) > 0) {
      stop("You have entered mutation data with VAF > 1. Only enter data between 0 and 1 (NA is ok).")
    }

    # set continuous cnv to false (we're now using VAF data that is always continuous)
    continuous_cnv <- FALSE

    # get chromosomes
    chroms <- gtools::mixedsort(unique(mut_data$chr))

    # set chromosome bounds, genome length and chromosome box info to NULL (not needed for mutation data)
    chrom_bounds <- NULL
    genome_length <- NULL
    chrom_boxes <- NULL

    # if the user has provided the mutation order, check it includes all existing mutations
    if (!is.null(mut_order)) {
      sites <- unique(mut_data$site)
      sites_missing_from_mut_order <- setdiff(sites, mut_order)
      if (length(sites_missing_from_mut_order) > 0) {
        stop("The following mutation(s) are present in the targeted mutation data but ",
            "are missing from the mutation order data: ",
            paste(sites_missing_from_mut_order, collapse=", "), 
            ". All mutation sites must be present in the mutation order data.")
      }
      sites_extra_in_mut_order <- setdiff(mut_order, sites)
      if (length(sites_extra_in_mut_order) > 0) {
        if (show_warnings) {
          print(paste("WARNING: The following mutation(s) are present in the mutation order data but ",
              "are missing from the targeted mutation data: ",
              paste(sites_extra_in_mut_order, collapse=", "), ". They will have no effect on the visualization.", sep=""))
        }

        # remove these excess sites from the mut_order data
        mut_order <- setdiff(sites, sites_extra_in_mut_order)
      }
    }
    # if user has NOT provided mutation order BUT
    # if there is genotype tree data and genotype annotations for the cells,
    # calculate mutation order using this information
    else if (!is.null(gtype_tree_edges) && 
          !is.null(sc_annot) && 
          ("genotype" %in% colnames(sc_annot))) {

      # GET MUTATION GENOTYPES IN PLOTTING ORDER (e.g. ABCD, ABC, AB, A, BCD, BC, B, CD, C, D)
      # ie. each mutation will be plotted according to which genotypes have the mutation, and the mutations
      # in genotypes A, B, C and D will be plotted first, followed by those only in A, B and C, followed by....

      mut_gtypes <- c()
      for (i in 1:length(dfs_gtype_tree)) {
        for (j in length(dfs_gtype_tree):1) {
          mut_gtypes <- append(mut_gtypes, paste(dfs_gtype_tree[i:j], collapse=""))
          if (i == j) {
            break
          }
          else { 
            if ((j-i) > 2) { # if there is more than one letter between i and j
              # add internal letters in all other non-linear combinations (but still in order)
              for (count in (j-i-2):1) { # for each length of internal letters to add
                combos <- combn(dfs_gtype_tree[(i+1):(j-1)], count)
                for (combo_i in 1:ncol(combos)) {
                  combo_i_pasted <- paste(combos[,combo_i], collapse="")
                  mut_gtypes <- append(mut_gtypes, paste(dfs_gtype_tree[i], combo_i_pasted, dfs_gtype_tree[j], sep=""))
                }
              }
            }
            # concatenate first and last genotype together
            if ((j-i) >= 2) {
              mut_gtypes <- append(mut_gtypes, paste(dfs_gtype_tree[i], dfs_gtype_tree[j], sep=""))
            }
          }
        }
      }


      # GET GENOTYPES FOR EACH TARGETED MUTATION 

      cur_mut_data <- mut_data

      # take care of NA and Inf VAF values
      cur_mut_data$VAF[which(is.na(cur_mut_data$VAF))] <- NA
      cur_mut_data$VAF[which(is.infinite(cur_mut_data$VAF))] <- NA

      # add genotypes to mutation data
      mut_data_w_gtypes <- merge(cur_mut_data, sc_annot, by="single_cell_id")

      # get average VAF for each [site X genotype] combination
      mut_data_grouped <- dplyr::group_by(mut_data_w_gtypes, site, genotype)
      mut_data_grouped_avg_VAF <- dplyr::summarise(mut_data_grouped, 
        avg_VAF=sum(VAF, na.rm=TRUE)/length(VAF),
        n = n(),
        n_gt = sum(VAF > vaf_threshold, na.rm=TRUE),
        p_gt = n_gt / n)
      mut_data_grouped_avg_VAF <- as.data.frame(mut_data_grouped_avg_VAF)
      # print("mut_data_grouped_avg_VAF")
      # print(mut_data_grouped_avg_VAF)

      # keep only those [site X genotype] combinations where the average vaf is greater than the threshold
      # and present in more than a certain percentage of cells with that genotype
      mut_data_grouped_avg_VAF <- mut_data_grouped_avg_VAF[which(mut_data_grouped_avg_VAF$avg_VAF > vaf_threshold), ]
      mut_data_grouped_avg_VAF <- mut_data_grouped_avg_VAF[which(mut_data_grouped_avg_VAF$p_gt > prop_of_clone_threshold), ]

      # for each mutation, paste the genotypes together
      site_gtype_split_by_site <- split(mut_data_grouped_avg_VAF , f = mut_data_grouped_avg_VAF$site)
      site_gtype_list <- lapply(site_gtype_split_by_site, function(site) { 
        cur_gtypes <- as.character(site$genotype)
        cur_sorted_gtypes <- cur_gtypes[order(match(cur_gtypes,dfs_gtype_tree))]
        collapsed_gtypes <- paste(cur_sorted_gtypes, collapse="")
        return(collapsed_gtypes)
      })
      site_gtype_df <- do.call(rbind.data.frame, site_gtype_list)
      colnames(site_gtype_df) <- c("gtypes")
      site_gtype_df$site <- rownames(site_gtype_df)

      # order mutation sites by average VAF 
      cur_data_grouped_by_site <- dplyr::group_by(cur_mut_data, site)
      mut_site_avg <- dplyr::summarise(cur_data_grouped_by_site, avg_VAF=sum(VAF, na.rm=TRUE)/length(VAF)) 
      site_gtype_df <- merge(site_gtype_df, mut_site_avg, by="site")
      site_gtype_df <- site_gtype_df[rev(order(site_gtype_df$avg_VAF)),]

      # ORDER MUTATIONS BY WHICH GENOTYPES THEY APPEAR IN

      site_gtype_df <- site_gtype_df[order(match(site_gtype_df$gtypes, mut_gtypes)),]
      mut_order <- site_gtype_df$site

      # print("site_gtype_df")
      # print(site_gtype_df)

      # append any mutations that have all low prevalence data, and thus not accounted for yet
      low_prev_muts <- setdiff(unique(cur_mut_data$site), mut_order)
      mut_order <- append(mut_order, low_prev_muts)
    }

    # no mutation order and no genotype tree info
    else {
      mut_order <- getMutOrder(mut_data)
    }
    
    # heatmap information for each cell
    heatmap_info <- getTargetedHeatmapForEachSC(mut_data, mut_order, heatmapWidth)
  }

  # TREE EDGE DATA

  # check it's a data frame
  if (!is.data.frame(tree_edges)) {
    stop("Tree edges data (parameter tree_edges) must be a data frame.")
  }

  # ensure column names are correct
  if (!("source" %in% colnames(tree_edges)) ||
      !("target" %in% colnames(tree_edges))) {
    stop("Tree edges data frame must have the following column names: ", 
        "\"source\", \"target\"")
  }

  # ensure data is of the correct type
  tree_edges$source <- as.character(tree_edges$source)
  tree_edges$target <- as.character(tree_edges$target)
  if ("dist" %in% colnames(tree_edges)) {
    tree_edges$dist <- as.numeric(as.character(tree_edges$dist))
    distances_provided <- TRUE

    # if there are negative distances, convert them to 0.01
    negative_dists <- tree_edges$dist[which(tree_edges$dist < 0)]
    if (length(negative_dists) > 0) {
      value_for_neg_dists <- 0.1
      if (show_warnings) {
        print(paste("WARNING: Negative distances found in tree edges data frame. Any negative distance will be ",
          "converted to ", value_for_neg_dists, ".", sep=""))
      }
      tree_edges$dist[which(tree_edges$dist < 0)] <- value_for_neg_dists
    }
  }
  else {
    tree_edges$dist <- NaN
    distances_provided <- FALSE
  }

  # list of tree nodes for d3 phylogenetic layout function
  unique_nodes <- unique(c(tree_edges$source, tree_edges$target))
  tree_nodes_for_layout <- data.frame(sc_id=unique_nodes, 
                                      index=(seq(1:length(unique_nodes)) - 1), 
                                      stringsAsFactors=FALSE)

  # list of tree edges for d3 phylogenetic layout function
  # note: for force-directed graph, we need the source/target to be the *index* of the tree node in the list of tree nodes
  tree_edges_for_layout <- data.frame(
    source=sapply(tree_edges$source, function(src) {return(which(tree_nodes_for_layout$sc_id == src) - 1) }),
    source_sc_id=tree_edges$source,
    target=sapply(tree_edges$target, function(trg) {return(which(tree_nodes_for_layout$sc_id == trg) - 1) }),
    target_sc_id=tree_edges$target,
    link_id=apply(tree_edges, 1, function(edge) { 
        return(paste("link_source_", edge["source"], "_target_", edge["target"], sep="")) 
      }),
    dist=tree_edges$dist,
    stringsAsFactors=FALSE)
  tree_edges_for_layout$source <- as.numeric(as.character(tree_edges_for_layout$source))
  tree_edges_for_layout$target <- as.numeric(as.character(tree_edges_for_layout$target))

  # get list of link ids
  link_ids <- tree_edges_for_layout$link_id

  # check for tree rootedness
  sources <- unique(tree_edges$source)
  targets <- unique(tree_edges$target)
  sources_for_iteration <- sources # because we will be changing the sources array over time
  for (i in 1:length(sources_for_iteration)) {
    cur_source <- sources_for_iteration[i]

    # if the source is a target, remove it from the sources list
    if (cur_source %in% targets) {
      sources <- sources[sources != cur_source]
    }
  }

  # if multiple roots are detected, throw error
  if (length(sources) > 1) {
    stop("Multiple roots detected in tree (",paste(sources,collapse=", "),
      ") - tree must have only one root.")
  }
  # otherwise, set the root
  else {
    root <- sources
  }

  # GET SINGLE CELLS THAT ARE IN THE TREE BUT DON'T HAVE ASSOCIATED HEATMAP DATA
  scs_in_hm <- names(heatmap_info) # single cells in heatmap
  scs_in_tree <- unique(c(tree_edges$source, tree_edges$target))
  scs_missing_from_hm <- setdiff(scs_in_tree, scs_in_hm)

  # ENSURE ALL SINGLE CELLS IN THE CNV DATA ARE IN THE TREE EDGES DATA
  scs_missing_from_tree <- setdiff(scs_in_hm, scs_in_tree)
  if (length(scs_missing_from_tree) > 0) {
    if (is.null(cnv_data)) {
      data_type <- "mutations"
    }
    else {
      data_type <- "cnv"
    }
    if (show_warnings) {
      print(paste("WARNING: The following single cell ID(s) are present in the ", data_type, " data but ",
          "are missing from the tree edges data: ",
          paste(scs_missing_from_tree, collapse=", "), 
          ". They will not be shown in the visualization.", sep=""))
    }
  }

  # SINGLE CELL GROUPS
  if (!is.null(sc_annot)) {
    # ensure column names are correct
    if (!("single_cell_id" %in% colnames(sc_annot)) ||
        !("genotype" %in% colnames(sc_annot))) {
      stop("Single cell group assignment data frame must have the following column names: ", 
          "\"single_cell_id\", \"genotype\"")
    }

    # ensure data is of the correct type
    sc_annot$single_cell_id <- as.character(sc_annot$single_cell_id)
    sc_annot$genotype <- as.character(sc_annot$genotype)
    if ("timepoint" %in% colnames(sc_annot)) {
      sc_annot$timepoint <- as.character(sc_annot$timepoint)
    }

    # ensure that at least one single cell in the tree has annotations
    scs_in_tree <- unique(c(tree_edges$source, tree_edges$target))
    scs_in_annots <- unique(sc_annot$single_cell_id)
    scs_in_tree_and_annots <- intersect(scs_in_tree, scs_in_annots)
    if (length(scs_in_tree_and_annots) == 0) {
      stop("The annotations parameter (sc_annots) has been used, but none of the single cells in the phylogeny have annotations.")
    }

    # remove all single cells that are not in the tree
    scs_in_groups <- unique(sc_annot$single_cell_id)
    scs_missing_from_tree <- setdiff(scs_in_groups,scs_in_tree)
    sc_annot <- sc_annot[which(!(sc_annot$single_cell_id %in% scs_missing_from_tree)),]

    # to JSON
    sc_annot_JSON <- jsonlite::toJSON(sc_annot)

    # note that annotations are provided
    sc_annot_provided <- TRUE
  }
  else {
    sc_annot_JSON <- sc_annot

    # note that annotations are NOT provided
    sc_annot_provided <- FALSE
  }

  # IF ALL NECESSARY PRAMETERS ARE PRESENT FOR TIMESCAPE
  if (!is.null(gtype_tree_edges) && 
      !is.null(sc_annot) && 
      ("timepoint" %in% colnames(sc_annot)) &&
      ("genotype" %in% colnames(sc_annot))) {

    # ensure correct type of genotype tree edge data

    gtype_tree_edges$source <- as.character(gtype_tree_edges$source)
    gtype_tree_edges$target <- as.character(gtype_tree_edges$target)

    # ENSURE THAT ALL GENOTYPES ARE PRESENT IN THE TREE

    gtypes_in_gtype_tree <- unique(c(gtype_tree_edges$source, gtype_tree_edges$target))
    gtypes_in_annots <- unique(sc_annot$genotype)
    gtypes_missing_from_tree <- setdiff(gtypes_in_annots, gtypes_in_gtype_tree)
    if (length(gtypes_missing_from_tree) > 0) {
      stop("The following genotype(s) are present in the single cell annotations data but ",
          "are missing from the genotype tree: ",
          paste(gtypes_missing_from_tree, collapse=", "), 
          ". All genotypes must be present in the genotype tree data.")
    }


    # CALCULATE CLONAL PREVALENCE FOR EACH SAMPLE

    # number of cells with each sample id
    annots_gb_tp <- dplyr::group_by(sc_annot, timepoint)
    samples <- dplyr::summarise(annots_gb_tp, n_in_sample = length(single_cell_id))

    # number of cells with each unique sample id, genotype combination
    annots_gb_tp_gtype <- dplyr::group_by(sc_annot, timepoint, genotype)
    genotypes_and_samples <- dplyr::summarise(annots_gb_tp_gtype, n = length(single_cell_id))

    # get clonal prevalence
    clonal_prev <- merge(samples, genotypes_and_samples, by=c("timepoint"), all.y=TRUE)
    clonal_prev$clonal_prev <- clonal_prev$n/clonal_prev$n_in_sample

    # rename columns
    colnames(clonal_prev)[which(colnames(clonal_prev) == "timepoint")] <- "timepoint"
    colnames(clonal_prev)[which(colnames(clonal_prev) == "genotype")] <- "clone_id"


    # GET INFORMATION FOR TIMESCAPE

    timescape_wanted <- TRUE
    mutations <- "NA"
    alpha <- 50 
    genotype_position <- "stack" 
    perturbations <- "NA" 
    sort <- FALSE 
    show_warnings <- TRUE

    timescape_userParams <- processUserData(clonal_prev, 
                                            gtype_tree_edges, 
                                            mutations,
                                            clone_colours, 
                                            as.character(xaxis_title), 
                                            as.character(yaxis_title), 
                                            as.character(phylogeny_title),
                                            alpha, 
                                            genotype_position, 
                                            perturbations, 
                                            sort, 
                                            show_warnings,
                                            width, 
                                            height)
  }
  else {
    timescape_userParams <- list()
    timescape_wanted <- FALSE
  }

  # forward options using x
  cellscape_userParams <- list(
    clone_cols = jsonlite::toJSON(clone_colours), # clone colours
    sc_annot=sc_annot_JSON, # single cells and their associated group ids
    sc_annot_provided=sc_annot_provided, # whether or not single cell annotations are provided by the user
    sc_tree_edges=jsonlite::toJSON(tree_edges_for_layout), # tree edges for phylogeny
    sc_tree_nodes=jsonlite::toJSON(tree_nodes_for_layout), # tree nodes for phylogeny
    link_ids=link_ids, # ids for all links in the phylogeny
    distances_provided=distances_provided, # whether or not distances are provided for tree edges
    chroms=chroms, # chromosomes
    chrom_boxes=jsonlite::toJSON(chrom_boxes), # chromosome legend boxes
    heatmap_info=jsonlite::toJSON(heatmap_info), # heatmap information 
    heatmap_type=heatmap_type, # type of data in heatmap (cnv or targeted)
    heatmapWidth=heatmapWidth, # width of the heatmap
    value_type=value_type, # type of value in the heatmap
    node_type=node_type, # type of node in single cell phylogeny
    timepoint_title=as.character(timepoint_title), # legend title for timepoints
    clone_title=as.character(clone_title), # legend title for timepoints
    root=root, # name of root
    display_node_ids=display_node_ids, # whether or not to display the node id labels on each node
    scs_missing_from_hm=scs_missing_from_hm, # single cells in tree but not heatmap
    continuous_cnv=continuous_cnv, # whether copy number data should be continuous or discrete
    timescape_wanted=timescape_wanted # type of time/space view provided (NULL, "time", or "space")
  )
  x = append(cellscape_userParams, timescape_userParams)

  # create widget
  htmlwidgets::createWidget(
    name = 'cellscape',
    x,
    width = width,
    height = height,
    package = 'cellscape'
  )
}

#' Get depth first search of a tree
#' @param edges -- edges of tree
#' @param cur_root -- current root of the tree
#' @param dfs_arr -- array of depth first search results to be filled
#' @export
#' @rdname helpers
#' @examples
#' dfs_tree(data.frame(source = c("1","1","2","2","5","6"), target=c("2","5","3","4","6","7")), "1", c())
dfs_tree <- function(edges, cur_root, dfs_arr) {
  if (!is.null(cur_root)) {
    # add this root to the dfs list of nodes
    dfs_arr <- append(dfs_arr, cur_root)

    # get children of this root
    cur_children <- edges[which(edges$source == cur_root),"target"]
    for (cur_child in cur_children) {
      cur_root <- cur_child
      dfs_arr <- dfs_tree(edges, cur_root, dfs_arr)
    }
  }
  return(dfs_arr)
}

#' Widget output function for use in Shiny
#'
#' @param outputId -- id of output
#' @param width -- width of output
#' @param height -- height of output
#' @examples
#' cellscapeOutput(1, '100%', '300px')
#' cellscapeOutput(1, '80%', '300px')
#' @export
#' @rdname helpers
cellscapeOutput <- function(outputId, width = '100%', height = '400px'){
  shinyWidgetOutput(outputId, 'cellscape', width, height, package = 'cellscape')
}

#' Widget render function for use in Shiny
#'
#' @param expr -- expression for Shiny
#' @param env -- environment for Shiny
#' @param quoted -- default is FALSE 
#' @export
#' @rdname helpers
renderCnvTree <- function(expr, env = parent.frame(), quoted = FALSE) {
  if (!quoted) { expr <- substitute(expr) } # force quoted
  shinyRenderWidget(expr, cellscapeOutput, env, quoted = TRUE)
}

# CNVTREE HELPERS

#' Function to get data frame of pixels
#' @param hm_sc_ids_ordered -- array of single cell ids in order
#' @param ncols -- number of columns in heatmap/grid
#' @rdname helpers
getEmptyGrid <- function(hm_sc_ids_ordered, ncols) {
  sc_ids <- rep(hm_sc_ids_ordered, each=ncols)
  cols <- rep(seq(0:(ncols-1)), length(hm_sc_ids_ordered))

  return(data.frame(col=cols, sc_id=sc_ids, stringsAsFactors=FALSE))
}

#' function to get min and max values for each chromosome
#' @param chroms -- vector of chromosome names
#' @param cnv_data -- copy number data
#' @rdname helpers
getChromBounds <- function(chroms, cnv_data) {

  # get min & max for each chromosome
  chrom_bounds = sapply(chroms, function(chrom) {
    chrom_cnv_data <- cnv_data[which(cnv_data$chr == chrom),]
    start <- min(chrom_cnv_data$start)
    end <- max(chrom_cnv_data$end)
    return(c(chrom=chrom, bp_start=start, bp_end=end))
  })

  # flip data frame
  chrom_bounds_t <- as.data.frame(t(as.matrix(chrom_bounds)))

  # ensure correct data types
  chrom_bounds_t$bp_start <- as.numeric(as.character(chrom_bounds_t$bp_start))
  chrom_bounds_t$bp_end <- as.numeric(as.character(chrom_bounds_t$bp_end))

  # get the ADDITIVE chromosome start and end bps
  chrom_bounds_t$chrom_start <- NA
  chrom_bounds_t$chrom_end <- NA
  next_chr_start_bp <- 0 # base pair BEFORE the next chromosome
  for (i in 1:nrow(chrom_bounds_t)) {
    chrom_bounds_t$chrom_start[i] <- next_chr_start_bp
    this_chr_end_bp <- next_chr_start_bp + (chrom_bounds_t$bp_end[i] - chrom_bounds_t$bp_start[i])
    chrom_bounds_t$chrom_end[i] <- this_chr_end_bp
    next_chr_start_bp <- this_chr_end_bp + 1
  }

  # get the chromosome index
  chrom_bounds_t$chrom_index <- seq(1:nrow(chrom_bounds_t))

  return (chrom_bounds_t)
}

#' function to get chromosome box pixel info
#' @param chrom_bounds -- data frame of chromosome boundaries
#' @param n_bp_per_pixel -- integer of number of base pairs per pixel
#' @rdname helpers
getChromBoxInfo <- function(chrom_bounds, n_bp_per_pixel) {
  chrom_boxes <- data.frame(chr=chrom_bounds$chrom, 
                            # x coordinate (start of chromosome)
                            x=floor(chrom_bounds$chrom_start/n_bp_per_pixel) + 2*(chrom_bounds$chrom_index - 1), 
                            width=rep(-1, nrow(chrom_bounds)))
  # width = end - start + 1
  chrom_boxes$width <- (floor(chrom_bounds$chrom_end/n_bp_per_pixel) + 2*(chrom_bounds$chrom_index - 1)) - chrom_boxes$x + 1

  return(chrom_boxes)
}

#' function to get the genome length
#' @param chrom_bounds -- data frame of chromosome boundaries
#' @rdname helpers
getGenomeLength <- function(chrom_bounds) {

  tmp_chrom_bounds <- chrom_bounds
  tmp_chrom_bounds$n_bps <- tmp_chrom_bounds$bp_end - tmp_chrom_bounds$bp_start + 1
  genome_length <- sum(tmp_chrom_bounds$n_bps)

  return(genome_length)
}

#' function to get the number of base pairs per pixel
#' @param ncols --  integer of number of columns (pixels) to fill
#' @param chrom_bounds --  data frame of chromosome boundaries
#' @param genome_length --  integer of length of the genome
#' @rdname helpers
getNBPPerPixel <- function(ncols, chrom_bounds, genome_length) {
  n_data_pixels <- ncols - 2*(nrow(chrom_bounds) + 1) # number of pixels filled with data 
                                                      # (subtract number chromosome separators:
                                                      # - 1 for each separator
                                                      # - 1 for the end of each chromosome 
                                                      # (we don't want chromosomes to share pixels))
  n_bp_per_pixel <- ceiling(genome_length/n_data_pixels) # number of bps per pixel
  return(n_bp_per_pixel)
}

#' function to get information (chr, start, end, mode_cnv) for each pixel
#' @param cnv_data -- data frame of copy number variant segments data
#' @param chrom_bounds -- data frame of chromosome boundaries
#' @param n_bp_per_pixel -- integer of number of base pairs per pixel
#' @rdname helpers
getCNVHeatmapForEachSC <- function(cnv_data, chrom_bounds, n_bp_per_pixel) {

  # get the pixel start and end for each segment (account for chromosome separators in pixel info)
  heatmap_info <- cnv_data
  heatmap_info <- merge(cnv_data, chrom_bounds, by.x="chr", by.y="chrom")
  heatmap_info$start_px <- floor((heatmap_info$chrom_start + heatmap_info$start) / n_bp_per_pixel) + 2*(heatmap_info$chrom_index-1)
  heatmap_info$end_px <- floor((heatmap_info$chrom_start + heatmap_info$end) / n_bp_per_pixel) + 2*(heatmap_info$chrom_index-1)
  heatmap_info$px_width <- heatmap_info$end_px - heatmap_info$start_px + 1

  # note any segments whose start_px != end_px --> these will be the separated end pixels
  segment_ends_info <- heatmap_info[which(heatmap_info$start_px != heatmap_info$end_px),]

  # save their left ends (starts)
  starts <- segment_ends_info
  colnames(starts)[which(colnames(starts) == "start_px")] <- "px"
  starts <- starts[ , !(names(starts) %in% c("end_px"))] # drop end_px column
  starts$px_width <- 1 # set pixel width to 1

  # save their right ends (ends)
  ends <- segment_ends_info
  colnames(ends)[which(colnames(ends) == "end_px")] <- "px"
  ends <- ends[ , !(names(ends) %in% c("start_px"))] # drop start_px column
  ends$px_width <- 1 # set pixel width to 1

  # save their middles (for segments with length greater than 2)
  segs_gt_2 <- segment_ends_info[which(segment_ends_info$px_width > 2),]
  segs_gt_2$px_width <- segs_gt_2$px_width - 2 # subtract 2 (for 2 ends) from pixel width
  middles <- segs_gt_2[,c("single_cell_id", "start_px", "px_width", "copy_number", "chr", "chrom_index")]
  colnames(middles)[which(colnames(middles) == "start_px")] <- "px"
  colnames(middles)[which(colnames(middles) == "copy_number")] <- "mode_cnv"
  middles$px <- middles$px + 1 # first pixel will be a start pixel, so we shift 1

  # note any segments that occupy one pixel only
  singles <- heatmap_info[which(heatmap_info$start_px == heatmap_info$end_px),]
  colnames(singles)[which(colnames(singles) == "end_px")] <- "px"
  singles <- singles[ , !(names(singles) %in% c("start_px"))] # drop start_px column

  # bind starts, ends, and singles
  starts_ends_singles <- rbind(starts, ends, singles)

  # find the mode cnv of all starts, ends, and singles
  starts_ends_singles_grouped <- dplyr::group_by(starts_ends_singles, single_cell_id, px, px_width, chr, chrom_index)
  starts_ends_singles_w_mode <- dplyr::summarise(starts_ends_singles_grouped, mode_cnv=findMode(copy_number)[["mode"]])
  starts_ends_singles_w_mode <- as.data.frame(starts_ends_singles_w_mode)

  # bind the starts, ends, singles and middles
  all_pixels <- rbind(starts_ends_singles_w_mode, middles)
  all_pixels <- all_pixels[with(all_pixels, order(single_cell_id, px)), ]
  colnames(all_pixels)[which(colnames(all_pixels) == "single_cell_id")] <- "sc_id"

  # merge consecutive pixels with the same mode_cnv

  # remove NAs & Infs from mode cnv for cumsum calculation
  all_pixels$mode_cnv_no_NA <- all_pixels$mode_cnv
  all_pixels$mode_cnv_no_NA[which(is.na(all_pixels$mode_cnv_no_NA))] <- 1 
  all_pixels$mode_cnv_no_NA[which(all_pixels$mode_cnv_no_NA == Inf)] <- 1 
  # compute the lengths and values of runs of equal values vector of cnvs
  cnv_rle <- rle(all_pixels$mode_cnv_no_NA)
  # add 1 so zero copy numbers have an additive effect in the cumulative sum calculation
  all_pixels$cumsum_values <- rep.int(cumsum(cnv_rle$values + 1), cnv_rle$length)
  # add chromosome index to cumsum values to ensure ends of chromosomes are not merged if same cnv
  all_pixels$cumsum_values <- all_pixels$cumsum_values + all_pixels$chrom_index 
  all_pixels_select <- dplyr::select(all_pixels, sc_id, px, px_width, chr, mode_cnv, cumsum_values)
  all_pixels_grouped <- dplyr::group_by(all_pixels_select, sc_id, cumsum_values)
  consecutive_px_merged <- dplyr::summarise(all_pixels_grouped, sum_px_width=sum(px_width),
                                                                chr=chr[1],
                                                                px_min=min(px),
                                                                mode_cnv=mode_cnv[1])
  consecutive_px_merged <- as.data.frame(consecutive_px_merged)
  colnames(consecutive_px_merged) <- c("sc_id", "cumsum_values", "px_width", "chr", "x", "gridCell_value")
  # rearrange columns
  consecutive_px_merged <- consecutive_px_merged[,c("sc_id","x","px_width","chr","gridCell_value","cumsum_values")]

  # separate pixels by single cell id
  consecutive_px_merged_split <- split(consecutive_px_merged , f = consecutive_px_merged$sc_id)

  return (consecutive_px_merged_split)
}

#' function to get mutation order for targeted data
#' @param mut_data -- data frame of mutations data
#' @rdname helpers
getMutOrder <- function(mut_data) {
  separator <- ":"

  cur_data <- mut_data

  # group data by mutation site
  cur_data$VAF_rounded <- cur_data$VAF
  cur_data$VAF_rounded[which(cur_data$VAF_rounded < 0.05)] <- -10
  cur_data$VAF_rounded[which(cur_data$VAF_rounded >= 0.95)] <- 1
  cur_data$VAF_rounded[which(cur_data$VAF_rounded >= 0.05 & cur_data$VAF_rounded < 0.95)] <- 0.5
  cur_data$VAF_rounded[which(is.na(cur_data$VAF_rounded))] <- 0
  cur_data$VAF_rounded[which(is.infinite(cur_data$VAF_rounded))] <- 0

  # group data by mutation site -- get only site, rounded VAF and single cell id
  cur_data_for_mat <- cur_data[,c("site", "single_cell_id", "VAF_rounded")]

  # get a data frame of sites X single cell ID, containing rounded VAF
  mat <- reshape2::dcast(cur_data_for_mat, site ~ single_cell_id, value.var="VAF_rounded")
  rownames(mat) <- mat$site # set rownames to site names
  mat <- mat[, -which(colnames(mat) == "site")] # remove site column

  # hierarchically cluster mutations
  mut_dists <- dist(mat, method="euclidean")
  mut_clust <- hclust(mut_dists, method='complete')

  # get the order of hierarchically clustered mutations
  mut_order <- rownames(mat)[mut_clust$order]

  # get average VAF for each site
  cur_data_grouped_by_site <- dplyr::group_by(cur_data, site)
  mut_site_avg <- dplyr::summarise(cur_data_grouped_by_site, avg_VAF=sum(VAF, na.rm=TRUE)/length(VAF))

  # order average mutation VAFs by the order of mutations calculated by hierarchical clustering
  mut_site_avg_ordered <- mut_site_avg[match(mut_order, mut_site_avg$site),]

  # find slope of the ordered sites' VAFs
  model <- lm(formula = seq(1, nrow(mut_site_avg_ordered)) ~ mut_site_avg_ordered$avg_VAF, x=TRUE, y=TRUE, na.action=na.omit)
  slope <- coef(model)["mut_site_avg_ordered$avg_VAF"]

  # if the mutation VAFs are generally increasing, reverse order
  if (slope > 0) {
    return(rev(mut_order))
  }
  else {
    return(mut_order)
  }
}

#' function to get targeted heatmap information 
#' @param mut_data -- data frame of mutations data
#' @param mut_order -- array of order of mutations for heatmap (chromosome:coordinate)
#' @param heatmapWidth -- number for width of the heatmap (in pixels)
#' @rdname helpers
getTargetedHeatmapForEachSC <- function(mut_data, mut_order, heatmapWidth) {

  # sort mutations by single cell, genomic position
  heatmap_info <- mut_data
  heatmap_info$chr <- as.character(heatmap_info$chr)
  heatmap_info <- dplyr::arrange(heatmap_info, single_cell_id, chr, coord)

  # get mutation site as one string
  heatmap_info$site <- paste(trimws(heatmap_info$chr), trimws(heatmap_info$coord), sep=":")

  # get the number of sites
  n_sites <- length(unique(mut_order))

  # check that the number of mutation sites does not exceed 1 pixel per mutation site
  if (heatmapWidth < n_sites) {
    stop("The number of mutation sites (",n_sites,") cannot exceed the width of the plot (",
      heatmapWidth,"). Either reduce the number of mutation sites, or increase the plot width.")
  }

  # heatmap cell width
  mut_width <- heatmapWidth/n_sites

  # attach heatmap cell x information and width information to each mutation
  heatmap_info$x <- sapply(heatmap_info$site, function(site) {
      # index of this site in the list of mutation sites
      index_of_site <- which(mut_order == site) - 1
      return(index_of_site * mut_width);
    })
  heatmap_info$px_width <- mut_width

  # colname sc_id not single_cell_id
  colnames(heatmap_info)[which(colnames(heatmap_info) == "single_cell_id")] <- "sc_id"
  colnames(heatmap_info)[which(colnames(heatmap_info) == "VAF")] <- "gridCell_value"

  # separate pixels by single cell id
  heatmap_info_split <- split(heatmap_info , f = heatmap_info$sc_id)

  return (heatmap_info_split)
}

#' function to find the mode of a vector
#' @param x -- vector of numbers
#' @examples
#' findMode(c(1,1,19,1))
#' @export
#' @rdname helpers
findMode <- function(x) {
  ux <- unique(x) # each unique value
  n_appearances <- tabulate(match(x, ux)) # number of appearances for each unique value
  return(list(mode=ux[which.max(n_appearances)], n_with_max=max(n_appearances)))
}


# TIMESCAPE HELPERS

#' Function to process the user data
#' @param clonal_prev -- data frame of Clonal prevalence. Note: timepoints will be alphanumerically sorted in the view.
#'   Format: columns are (1) character() "timepoint" - time point
#'                       (2) character() "clone_id" - clone id
#'                       (3) numeric() "clonal_prev" - clonal prevalence.
#' @param tree_edges -- data frame of Tree edges of a rooted tree.
#'   Format: columns are (1) character() "source" - source node id
#'                       (2) character() "target" - target node id.
#' @param mutations -- data frame (Optional) of Mutations occurring at each clone. Any additional field will be shown in the mutation table.
#'   Format: columns are (1) character() "chrom" - chromosome number
#'                       (2) numeric() "coord" - coordinate of mutation on chromosome
#'                       (3) character() "clone_id" - clone id
#'                       (4) character() "timepoint" - time point
#'                       (5) numeric() "VAF" - variant allele frequency of the mutation in the corresponding timepoint. 
#' @param clone_colours -- data frame (Optional) of Clone ids and their corresponding colours 
#'   Format: columns are (1) character() "clone_id" - the clone ids
#'                       (2) character() "colour" - the corresponding Hex colour for each clone id.
#' @param xaxis_title -- String (Optional) of x-axis title. Default is "Time Point".
#' @param yaxis_title -- String (Optional) of y-axis title. Default is "Clonal Prevalence".
#' @param phylogeny_title -- String (Optional) of Legend phylogeny title. Default is "Clonal Phylogeny".
#' @param alpha -- Number (Optional) of Alpha value for sweeps, range [0, 100].
#' @param genotype_position -- String (Optional) of How to position the genotypes from ["centre", "stack", "space"] 
#'   "centre" -- genotypes are centred with respect to their ancestors
#'   "stack" -- genotypes are stacked such that no genotype is split at any time point
#'   "space" -- genotypes are stacked but with a bit of spacing at the bottom
#' @param perturbations -- data frame (Optional) of any perturbations that occurred between two time points.
#'   Format: columns are (1) character() "pert_name" - the perturbation name
#'                       (2) character() "prev_tp" - the time point (as labelled in clonal prevalence data) 
#'                                                BEFORE perturbation.
#' @param sort -- Boolean (Optional) of whether (TRUE) or not (FALSE) to vertically sort the genotypes by their emergence values (descending). 
#'                       Default is FALSE. 
#'                       Note that genotype sorting will always retain the phylogenetic hierarchy, and this parameter will only affect the ordering of siblings.
#' @param show_warnings -- Boolean (Optional) of  Whether or not to show any warnings. Default is TRUE.
#' @param width -- Number (Optional) of width of the plot. Minimum width is 450.
#' @param height -- Number (Optional) of height of the plot. Minimum height with and without mutations is 500 and 260, respectively. 
#' @rdname helpers
processUserData <- function(clonal_prev, 
                      tree_edges, 
                      mutations,
                      clone_colours, 
                      xaxis_title, 
                      yaxis_title, 
                      phylogeny_title,
                      alpha, 
                      genotype_position, 
                      perturbations, 
                      sort, 
                      show_warnings,
                      width, 
                      height) {

  # ENSURE MINIMUM DIMENSIONS SATISFIED
  checkMinDims(mutations, height, width)

  # CHECK REQUIRED INPUTS ARE PRESENT
  checkRequiredInputs(clonal_prev, tree_edges)

  # ALPHA VALUE
  checkAlpha(alpha)

  # SORTED GENOTYPES
  if (!is.logical(sort)) {
    stop("Sort parameter must be a boolean.")
  }

  # CLONAL PREVALENCE DATA
  clonal_prev <- checkClonalPrev(clonal_prev)

  # TREE EDGES DATA
  tree_edges <- checkTreeEdges(tree_edges)

  # GENOTYPE POSITIONING
  checkGtypePositioning(genotype_position)

  # CHECK CLONE COLOURS
  checkCloneColours(clone_colours) 

  # CHECK PERTURBATIONS
  perturbations <- checkPerts(perturbations)

  # MUTATIONS DATA
  mut_data <- getMutationsData(mutations, tree_edges, clonal_prev)
  mutation_info <- mut_data$mutation_info
  mutation_prevalences <- mut_data$mutation_prevalences
  if (is.data.frame(mutations)) {
    mutations_provided <- TRUE
  }
  else {
    mutations_provided <- FALSE
  }

  # REPLACE SPACES WITH UNDERSCORES
  spaces_replaced <- replaceSpaces(clonal_prev, tree_edges, clone_colours, mutation_info, mutations, mutation_prevalences)
  timepoint_map <- spaces_replaced$timepoint_map 
  clone_id_map <- spaces_replaced$clone_id_map 
  clonal_prev <- spaces_replaced$clonal_prev 
  tree_edges <- spaces_replaced$tree_edges
  mutation_info <- spaces_replaced$mutation_info
  clone_colours <- spaces_replaced$clone_colours
  mutation_prevalences <- spaces_replaced$mutation_prevalences

  # forward options using x
  return(list(
    clonal_prev = jsonlite::toJSON(clonal_prev),
    gtype_tree_edges = jsonlite::toJSON(tree_edges),
    clone_cols = jsonlite::toJSON(clone_colours),
    mutations = jsonlite::toJSON(mutation_info),
    mutation_prevalences = jsonlite::toJSON(mutation_prevalences),
    mutations_provided=mutations_provided, # whether or not mutations are provided
    xaxis_title = as.character(xaxis_title),
    yaxis_title = as.character(yaxis_title),
    phylogeny_title = as.character(phylogeny_title),
    alpha = alpha,
    genotype_position = genotype_position,
    perturbations = jsonlite::toJSON(perturbations),
    sort_gtypes = sort,
    timepoint_map = jsonlite::toJSON(timepoint_map),
    clone_id_map = jsonlite::toJSON(clone_id_map)
  ))
}

#' Function to check minimum dimensions
#' 
#' @param mutations -- mutations provided by user
#' @param height -- height provided by user
#' @param width -- width provided by user
#' @examples
#' checkMinDims(data.frame(chr = c("11"), coord = c(104043), VAF = c(0.1)), "700px", "700px")
#' @export
#' @rdname helpers
checkMinDims <- function(mutations, height, width) {

  # set height if not set by user
  if (is.null(height)) {
    if (!is.data.frame(mutations)) { # no mutations
      height = 260
    }
    else { # mutations
      height = 500
    }
  }

  # check height is big enough 
  min_width = 450
  if (!is.data.frame(mutations)) { # no mutations
    min_height = 260
  }
  else { # mutations
    min_height = 500
  }

  if (height < min_height) {
    stop("Height must be greater than or equal to ", min_height, "px.")
  }
  if (width < min_width) {
    stop("Width must be greater than or equal to ", min_width, "px.")
  }
}


#' Function to check required inputs are present
#' 
#' @param clonal_prev -- clonal_prev provided by user
#' @param tree_edges -- tree_edges provided by user
#' @examples
#' checkRequiredInputs(data.frame(timepoint = c(rep("Diagnosis", 6), rep("Relapse", 1)), clone_id = c("1","2","3","4","5","6","7"), clonal_prev = c("0.1","0.22","0.08","0.53","0.009","0.061","1")), 
#' data.frame(source = c("1","1","2","2","5","6"), target=c("2","5","3","4","6","7")))
#' checkRequiredInputs(data.frame(timepoint = c(rep("Diagnosis", 6), rep("Relapse", 1)), clone_id = c("1","2","3","4","5","6","7"), clonal_prev = c("0.12","0.12","0.18","0.13","0.009","0.061","1")), 
#' data.frame(source = c("1","1","2","2","5","6"), target=c("2","5","3","4","6","7")))
#' @export
#' @rdname helpers
checkRequiredInputs <- function(clonal_prev, tree_edges) {

  if (missing(clonal_prev)) {
    stop("Clonal prevalence data frame must be provided.")
  }
  if (missing(tree_edges)) {
    stop("Tree edge data frame must be provided.")
  }
}

#' check alpha value input is correct
#' 
#' @param alpha -- alpha provided by user
#' @examples
#' checkAlpha(4)
#' checkAlpha(100)
#' @export
#' @rdname helpers
checkAlpha <- function(alpha) {
  if (!is.numeric(alpha)) {
    stop("Alpha value must be numeric.")
  }

  if (alpha < 0 || alpha > 100) {
    stop("Alpha value must be between 0 and 100.")
  }
}

#' check clonal_prev parameter data
#'
#' @param clonal_prev -- clonal prevalence provided by user
#' @examples
#' checkClonalPrev(data.frame(timepoint=c(1), clone_id=c(2), clonal_prev=c(0.1)))
#' @export
#' @rdname helpers
checkClonalPrev <- function(clonal_prev) {

  # ensure column names are correct
  if (!("timepoint" %in% colnames(clonal_prev)) ||
      !("clone_id" %in% colnames(clonal_prev)) ||
      !("clonal_prev" %in% colnames(clonal_prev))) {
    stop("Clonal prevalence data frame must have the following column names: ", 
        "\"timepoint\", \"clone_id\", \"clonal_prev\"")
  }

  # ensure data is of the correct type
  clonal_prev$timepoint <- as.character(clonal_prev$timepoint)
  clonal_prev$clone_id <- as.character(clonal_prev$clone_id)
  clonal_prev$clonal_prev <- as.numeric(as.character(clonal_prev$clonal_prev))

  return(clonal_prev)
}

#' check tree_edges parameter data
#'
#' @param tree_edges -- tree edges provided by user
#' @examples
#' checkTreeEdges(data.frame(source = c("1","1","2","2","5","6"), target=c("2","5","3","4","6","7")))
#' @export
#' @rdname helpers
checkTreeEdges <- function(tree_edges) {

  # ensure column names are correct
  if (!("source" %in% colnames(tree_edges)) ||
      !("target" %in% colnames(tree_edges))) {
    stop("Tree edges data frame must have the following column names: ", 
        "\"source\", \"target\"")
  }

  # ensure data is of the correct type
  tree_edges$source <- as.character(tree_edges$source)
  tree_edges$target <- as.character(tree_edges$target)

  # check for tree rootedness
  sources <- unique(tree_edges$source)
  targets <- unique(tree_edges$target)
  sources_for_iteration <- sources # because we will be changing the sources array over time
  for (i in 1:length(sources_for_iteration)) {
    cur_source <- sources_for_iteration[i]

    # if the source is a target, remove it from the sources list
    if (cur_source %in% targets) {
      sources <- sources[sources != cur_source]
    }
  }

  # if multiple roots are detected, throw error
  if (length(sources) > 1) {
    stop("Multiple roots detected in tree (",paste(sources,collapse=", "),
      ") - tree must have only one root.")
  }

  # if an edge is found whose source and target are equal, throw an error
  if (length(which(as.character(tree_edges$source) == as.character(tree_edges$target))) > 0) {
    stop("One of the tree edges has a source as its own target. Remove this edge.")
  }

  return(tree_edges)
}


#' check genotype_position parameter
#'
#' @param genotype_position -- genotype_position provided by user
#' @examples
#' checkGtypePositioning("centre")
#' @export
#' @rdname helpers
checkGtypePositioning <- function(genotype_position) {
  if (!(genotype_position %in% c("stack", "centre", "space"))) {
    stop("Genotype position must be one of c(\"stack\", \"centre\", \"space\")")
  }
}

#' check clone_colours parameter
#'
#' @param clone_colours -- clone_colours provided by user
#' @examples
#' checkCloneColours(data.frame(clone_id = c("1","2","3", "4"), colour = c("#beaed4", "#fdc086", "#beaed4", "#beaed4")))
#' @export
#' @rdname helpers
checkCloneColours <- function(clone_colours) {
  if (is.data.frame(clone_colours)) {

    # ensure column names are correct
    if (!("clone_id" %in% colnames(clone_colours)) ||
        !("colour" %in% colnames(clone_colours))) {
      stop("Node colour data frame must have the following column names: ", 
          "\"clone_id\", \"colour\"")
    }
  }
}

#' check perturbations parameter
#'
#' @param perturbations -- perturbations provided by user
#' @examples
#' checkPerts(data.frame(pert_name = c("New Drug"), prev_tp = c("Diagnosis")))
#' @export
#' @rdname helpers
checkPerts <- function(perturbations) {

  if (is.data.frame(perturbations)) {

    # ensure column names are correct
    if (!("pert_name" %in% colnames(perturbations)) ||
        !("prev_tp" %in% colnames(perturbations))) {
      stop("Perturbations data frame must have the following column names: ", 
          "\"pert_name\", \"prev_tp\"")
    }

    # check that columns are of the correct type
    perturbations$pert_name <- as.character(perturbations$pert_name)
    perturbations$prev_tp <- as.character(perturbations$prev_tp)
  }

  return(perturbations)
}

#' get mutation data
#'
#' @param mutations -- mutations data from user
#' @param tree_edges -- tree edges data from user
#' @param clonal_prev -- clonal prevalence data from user
#' @examples
#' getMutationsData(data.frame(chrom = c("11"), coord = c(104043), VAF = c(0.1), clone_id=c(1), timepoint=c("Relapse")), 
#' data.frame(source = c("1","1","2","2","5","6"), target=c("2","5","3","4","6","7")), 
#' data.frame(timepoint = c(rep("Diagnosis", 6), rep("Relapse", 1)), clone_id = c("1","2","3","4","5","6","7"), clonal_prev = c("0.12","0.12","0.18","0.13","0.009","0.061","1")))
#' @export
#' @rdname helpers
getMutationsData <- function(mutations, tree_edges, clonal_prev) {

  if (is.data.frame(mutations)) {

    # ensure column names are correct
    if (!("chrom" %in% colnames(mutations)) ||
        !("coord" %in% colnames(mutations)) ||
        !("clone_id" %in% colnames(mutations)) ||
        !("timepoint" %in% colnames(mutations)) ||
        !("VAF" %in% colnames(mutations))) {
      stop("Mutations data frame must have the following column names: ", 
          "\"chrom\", \"coord\", \"clone_id\", \"timepoint\", \"VAF\".")
    }

    # ensure data is of the correct type
    mutations$chrom <- toupper(as.character(mutations$chrom)) # upper case X & Y
    mutations$coord <- as.character(mutations$coord)
    mutations$timepoint <- as.character(mutations$timepoint)
    mutations$clone_id <- as.character(mutations$clone_id)
    mutations$VAF <- as.numeric(as.character(mutations$VAF))

    # check for optional info, and ensure data of correct type
    extra_columns <- colnames(mutations)[which(!(colnames(mutations) %in% c("chrom", "coord", "clone_id", "timepoint", "VAF")))]
    mutations <- data.frame(lapply(mutations, as.character), stringsAsFactors=FALSE)

    # check that all CLONE IDS in the mutations data are present in the tree data
    mutations_clone_ids <- unique(mutations$clone_id)
    tree_edges_clone_ids <- c(unique(tree_edges$source), unique(tree_edges$target))
    clone_ids_missing_from_tree_edges_data <- setdiff(mutations_clone_ids, tree_edges_clone_ids)
    if (length(clone_ids_missing_from_tree_edges_data) > 0) {
      stop("The following clone ID(s) are present in the mutations data but ",
        "are missing from the tree edges data: ",
        paste(clone_ids_missing_from_tree_edges_data, collapse=", "), ".")
    }

    # check that all TIMEPOINTS in the mutations data are present in the clonal prev data
    mutations_tps <- unique(mutations$timepoint)
    clonal_prev_tps <- unique(clonal_prev$timepoint)
    tps_missing_from_clonal_prev_data <- setdiff(mutations_tps, clonal_prev_tps)
    if (length(tps_missing_from_clonal_prev_data) > 0) {
      stop("The following timepoint(s) are present in the mutations data but ",
        "are missing from the clonal prevalence data: ",
        paste(tps_missing_from_clonal_prev_data, collapse=", "), ".")
    }

    # create a location column, combining the chromosome and the coordinate
    mutations$location <- apply(mutations[, c("chrom","coord")], 1 , paste, collapse = ":")

    # coordinate is now a number
    mutations$coord <- as.numeric(as.character(mutations$coord))

    # check X & Y chromosomes are labelled "X" and "Y", not "23", "24"
    num_23 <- mutations[which(mutations$chrom == "23"),]
    if (nrow(num_23) > 0) {
      stop("Chromosome numbered \"23\" was detected in mutations data frame - X and Y chromosomes ",
        "must be labelled \"X\" and \"Y\".")
    }


    # get list of clones in the phylogeny
    clones_in_phylo <- unique(c(tree_edges$source, tree_edges$target))

    # keep only those mutations whose clone ids are present in the phylogeny
    mutations <- mutations[which(mutations$clone_id %in% clones_in_phylo),]

    # MUTATION PREVALENCES DATA

    mutation_prevalences <- mutations

    # keep only those mutations whose clone ids are present in the phylogeny
    mutation_prevalences <- mutation_prevalences[which(mutation_prevalences$clone_id %in% clones_in_phylo),]

    # warn if more than 10,000 rows in data that the visualization may be slow
    if (nrow(mutation_prevalences) > 10000 && show_warnings) {
      print(paste("[WARNING] Number of rows in mutations data exceeds 10,000. ",
        "Resultantly, visualization may be slow. ",
        "It is recommended to filter the data to a smaller set of mutations.", sep=""))
    }

    # compress results
    prevs_split <- split(mutation_prevalences, f = mutation_prevalences$location)

    # reduce the size of the data frame in each list
    prevs_split_small <- lapply(prevs_split, function(prevs) {
      return(prevs[,c("timepoint", "VAF")])
    })


    # MUTATION INFO 
    mutation_info <- unique(mutations[,c("chrom","coord","clone_id",extra_columns)])
  }
  else {
    prevs_split_small <- "NA"
    mutation_info <- "NA"
  }

  return(list("mutation_info"=mutation_info, "mutation_prevalences"=prevs_split_small))
}

#' function to replace spaces with underscores in all data frames & keep maps of original names to space-replaced names
#' @param clonal_prev -- clonal_prev data from user
#' @param tree_edges -- tree edges data from user
#' @param clone_colours -- clone_colours data from user
#' @param mutation_info -- processed mutation_info 
#' @param mutations -- mutations data from user
#' @param mutation_prevalences -- mutation_prevalences data from user
#' @export
#' @rdname helpers
#' @examples
#' replaceSpaces(mutations = data.frame(chrom = c("11"), coord = c(104043), VAF = c(0.1), clone_id=c(1), timepoint=c("Relapse")), 
#' tree_edges = data.frame(source = c("1","1","2","2","5","6"), target=c("2","5","3","4","6","7")), 
#' clonal_prev = data.frame(timepoint = c(rep("Diagnosis", 6), rep("Relapse", 1)), clone_id = c("1","2","3","4","5","6","7"), clonal_prev = c("0.12","0.12","0.18","0.13","0.009","0.061","1")),
#' mutation_prevalences = list("X:6154028" = data.frame(timepoint = c("Diagnosis"), VAF = c(0.5557))), mutation_info=data.frame(clone_id=c(1)),
#' clone_colours = data.frame(clone_id = c("1","2","3", "4"), colour = c("#beaed4", "#fdc086", "#beaed4", "#beaed4")))
replaceSpaces <- function(clonal_prev, tree_edges, clone_colours, mutation_info, mutations, mutation_prevalences) {

  # create map of original sample ids to space-replaced sample ids
  timepoint_map <- data.frame(original_timepoint = unique(clonal_prev$timepoint), stringsAsFactors=FALSE)
  timepoint_map$space_replaced_timepoint <- stringr::str_replace_all(timepoint_map$original_timepoint,"\\s+","_")

  # create map of original clone ids to space-replaced clone ids
  clone_id_map <- data.frame(original_clone_id = unique(c(tree_edges$source, tree_edges$target)), stringsAsFactors=FALSE)
  clone_id_map$space_replaced_clone_id <- stringr::str_replace_all(clone_id_map$original_clone_id,"\\s+","_")

  # replace spaces with underscores
  # --> timepoints
  clonal_prev$timepoint <- stringr::str_replace_all(clonal_prev$timepoint,"\\s+","_")
  if (is.data.frame(mutations)) {
    mutation_prevalences <- lapply(mutation_prevalences, function(prevs) {
      prevs$timepoint <- stringr::str_replace_all(prevs$timepoint,"\\s+","_")
      return(prevs)
    })
  }
  # --> clone ids
  clonal_prev$clone_id <- stringr::str_replace_all(clonal_prev$clone_id,"\\s+","_")
  tree_edges$source <- stringr::str_replace_all(tree_edges$source,"\\s+","_")
  tree_edges$target <- stringr::str_replace_all(tree_edges$target,"\\s+","_")
  if (is.data.frame(clone_colours)) {
    clone_colours$clone_id <- stringr::str_replace_all(clone_colours$clone_id,"\\s+","_")
  }
  if (is.data.frame(mutations)) {
    mutation_info$clone_id <- stringr::str_replace_all(mutation_info$clone_id,"\\s+","_")
  }

  return(list("timepoint_map"=timepoint_map, 
              "clone_id_map"=clone_id_map, 
              "clonal_prev"=clonal_prev, 
              "tree_edges"=tree_edges,
              "mutation_info"=mutation_info,
              "clone_colours"=clone_colours,
              "mutation_prevalences"=mutation_prevalences))
}
Any scripts or data that you put into this service are public.
cellscape documentation built on Nov. 8, 2020, 5:02 p.m.
rdrr.io home R language documentation Run R code online
CRAN packages Bioconductor packages R-Forge packages GitHub packages
Note that we can't provide technical support on individual packages. You should contact the package authors for that.
cellscape
Explores single cell copy number profiles in the context of a single cell tree

R/cellscape.R
In cellscape: Explores single cell copy number profiles in the context of a single cell tree

Try the cellscape package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

cellscape Explores single cell copy number profiles in the context of a single cell tree

R/cellscape.R In cellscape: Explores single cell copy number profiles in the context of a single cell tree

Try the cellscape package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

cellscape
Explores single cell copy number profiles in the context of a single cell tree

R/cellscape.R
In cellscape: Explores single cell copy number profiles in the context of a single cell tree