get_evofreq: Collect information to plot frequency dynamics

View source: R/EvoFreq_funcs.R

Description

get_evofreq

Collect information to plot frequency dynamics

Usage

get_evofreq(size_df, clones, parents, fill_value = NULL,
  fill_range = NULL, time_pts = NULL, clone_cmap = NULL,
  threshold = 0.01, scale_by_sizes_at_time = FALSE,
  data_type = "size", interpolation_steps = 20,
  interp_method = "monoH.FC", fill_gaps_in_size = FALSE,
  test_links = TRUE, add_origin = FALSE, tm_frac = 0.6,
  rescale_after_thresholding = FALSE, shuffle_colors = FALSE)

Arguments

size_df

Dataframe in wide format, where each row corresponds to a single clone and the columns are the sizes of that clone at each timepoint

clones

Array containing the clone ids. The index of each clone must correspond to the same index of the row in size_df that contains the sizes of that clone over time

parents

Array containing the ids of the parent of each clone in the clones array.
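To make the layout of size_df, clones, and parents concrete, here is a minimal sketch with made-up data. The clone ids, sizes, and timepoint names are hypothetical, and using 0 as the parent id of the founder clone is an assumption rather than something stated on this page.

```r
# Hypothetical toy inputs: 3 clones observed at 4 timepoints.
clones  <- c(1, 2, 3)
# Assumption: 0 marks the founder's parent (an id not present in `clones`).
parents <- c(0, 1, 1)

# Wide format: one row per clone, one column per timepoint.
size_df <- data.frame(
  "0"  = c(10, 0, 0),
  "10" = c(50, 5, 0),
  "20" = c(80, 20, 10),
  "30" = c(90, 40, 30),
  check.names = FALSE  # keep the numeric-looking column names unchanged
)

# Row i of size_df holds the sizes of clones[i], whose parent is parents[i].
stopifnot(nrow(size_df) == length(clones),
          length(parents) == length(clones))
```

Keeping the three objects index-aligned like this is what lets them be passed directly as the first three arguments.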

fill_value

Array containing information that can be used to color each clone. If NULL (the default), each clone is assigned a color. If the values are a clone attribute, e.g. fitness, the colors are assigned according to those values. The user can also provide custom colors in 3 ways: 1) hexcode; 2) RGB values as a string of comma-separated color channel intensities, e.g. "255, 10, 128"; 3) any of the named colors in R, which can be found with colors()
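The three custom color encodings describe the same kind of information. The sketch below uses only base R (grDevices) to show how an RGB string relates to a hexcode and how to check a color name; it illustrates the formats and is not the parser EvoFreq uses internally. The variable names are hypothetical.

```r
# An RGB string in the "R, G, B" form described above.
rgb_string <- "255, 10, 128"

# Split on commas, trim whitespace, and convert each channel to a number.
channels <- as.numeric(trimws(strsplit(rgb_string, ",")[[1]]))

# Convert the 0-255 channel values to a hexcode.
hex_color <- rgb(channels[1], channels[2], channels[3], maxColorValue = 255)

# Check that a name is one of R's built-in named colors.
is_named_color <- "steelblue" %in% colors()
```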

fill_range

Array containing the minimum and maximum values to set the range of colors. If NULL (the default), the range is determined directly from fill_value.

time_pts

Array containing the names of the timepoints. If NULL, the timepoint names will be a sequence from 1 to the number of columns in size_df.

clone_cmap

Colormap to use for the clones. For a list of available colormaps, see https://github.com/bhaskarvk/colormap.

threshold

The minimum frequency of clones to be plotted. Clones with a frequency below this value will not be plotted

scale_by_sizes_at_time

Boolean defining whether or not the plot should represent the size or frequency of each clone at each timepoint. If TRUE, the sizes are scaled by the maximum size at each timepoint, and the plot thus represents the clonal frequencies at each timepoint. If FALSE, the sizes are scaled using the maximum size in size_df, thus reflecting relative population sizes

data_type

String defining what kind of information is in size_df. If "size", then the values in size_df are the population sizes. If "mutation", the values are the frequencies, between 0 and 1, of each mutation in the population over time

interpolation_steps

Integer defining the number of knots to use in the spline interpolation that fills in the gaps between observed population sizes. For sparse data, this smooths out the curves in the plot. Not recommended if the data is dense, as this is slow and may not have noticeable effects

interp_method

String identifying the interpolation method to use. Either "bezier" or a method used by splinefun
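For readers unfamiliar with splinefun, the following sketch shows what the default "monoH.FC" method does on a single hypothetical size trajectory using base R only. The data are made up, and evaluating at 20 points merely echoes the default interpolation_steps; how get_evofreq applies these settings internally is not shown here.

```r
# Hypothetical sparse observations of one clone's size over time.
time_obs <- c(0, 10, 20, 30)
size_obs <- c(0, 5, 40, 45)

# Build a monotone cubic Hermite interpolant (Fritsch-Carlson).
interp_fun <- splinefun(time_obs, size_obs, method = "monoH.FC")

# Evaluate on a finer grid to obtain a smooth curve between observations.
time_fine <- seq(min(time_obs), max(time_obs), length.out = 20)
size_fine <- interp_fun(time_fine)
```

Because the observed sizes never decrease, the "monoH.FC" interpolant does not overshoot between them, which is one reason a monotone method suits size data.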

fill_gaps_in_size

Boolean defining whether or not missing sizes should be filled in

test_links

Boolean defining whether to check that no clone has the same id as its parent. A clone that is listed as its own parent can cause infinite recursion.
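The kind of problem test_links guards against can be checked by hand before calling the function. This is a minimal sketch with hypothetical ids, not the check the package performs internally.

```r
# Hypothetical ids in which clone 4 is mistakenly listed as its own parent.
clones  <- c(1, 2, 3, 4)
parents <- c(0, 1, 2, 4)

# Clones whose parent id equals their own id.
self_parented <- clones[clones == parents]

if (length(self_parented) > 0) {
  message("Clones listed as their own parent: ",
          paste(self_parented, collapse = ", "))
}
```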

add_origin

Boolean defining whether or not to add origin positions to founder clones, even if not present in the data. Best for sparse observed data

tm_frac

Value between 0 and 1 that determines where the maximum growth rate is in the inferred origin sizes. Lower values result in earlier maximum growth

rescale_after_thresholding

Boolean determining if frequencies should be rescaled after thresholding, so that frequencies are based on what was above the threshold.
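One plausible reading of how threshold and rescale_after_thresholding interact is "drop what falls below the cutoff, then renormalize what remains". The sketch below illustrates that reading on made-up frequencies at a single timepoint; it is an assumption about the behavior, not the package's implementation.

```r
# Hypothetical clone frequencies at a single timepoint (they sum to 1).
freqs <- c(A = 0.60, B = 0.30, C = 0.095, D = 0.005)
threshold <- 0.01

# Keep only clones at or above the threshold; clone D is dropped.
kept <- freqs[freqs >= threshold]

# Rescale so the remaining frequencies sum to 1 again.
rescaled <- kept / sum(kept)
```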

shuffle_colors

Boolean determining if colors should be shuffled before being assigned to each clone. Only applies when fill_value = NULL

Value

Formatted dataframe called a "freq_frame" containing the information needed to plot the frequency dynamics over time.

Examples

data("example.easy.wide")
### Split dataframe into clone info and size info, using the fact that timepoint column names can be converted to numeric values
time_col_idx <- suppressWarnings(which(! is.na(as.numeric(colnames(example.easy.wide)))))
size_df <- example.easy.wide[, time_col_idx]
parents <- example.easy.wide$parents
clones <- example.easy.wide$clones

### Default is to plot size
freq_frame <- get_evofreq(size_df, clones, parents)
evo_p_by_size <- plot_evofreq(freq_frame)

### Can also plot frequency by setting scale_by_sizes_at_time = TRUE.
freq_frame <- get_evofreq(size_df, clones, parents, scale_by_sizes_at_time = TRUE)
evo_p_by_freq <- plot_evofreq(freq_frame)

### Default is to mildly smooth corners, but this can be turned off by setting interpolation_steps = 0
freq_frame <- get_evofreq(size_df, clones, parents, interpolation_steps = 0)
raw_evo_p <- plot_evofreq(freq_frame)

### Several other methods can smooth corners, including Bezier curves. However, Bezier curves don't represent the data as accurately as the methods that use splinefun, i.e. c("fmm", "periodic", "natural", "monoH.FC", "hyman")
freq_frame <- get_evofreq(size_df, clones, parents, interp_method = "bezier")
bez_evo_p <- plot_evofreq(freq_frame)

### Data can also be provided as mutation frequencies by setting data_type = "mutation"
mutation_count_df <- get_mutation_df(size_df, clones, parents)
freq_frame <- get_evofreq(mutation_count_df, clones, parents, data_type = "mutation")
evo_p_from_mutation <- plot_evofreq(freq_frame)

### Input needs to be in wide format, but long format data can be converted to wide format using long_to_wide_freqframe
wide_df_info <- long_to_wide_freqframe(
  long_pop_sizes_df = example.easy.long.sizes,
  time_col_name = "Time", clone_col_name = "clone",
  parent_col_name = "parent", size_col_name = "Size",
  edges_df = example.easy.long.edges
)
clones_from_long <- wide_df_info$clones
parents_from_long <- wide_df_info$parents
size_df_from_long <- wide_df_info$wide_size_df
freq_frame <- get_evofreq(size_df_from_long, clones_from_long, parents_from_long)
evo_p_from_long <- plot_evofreq(freq_frame)

### Colors can be set when getting the freq_frame, or updated later using update_colors. For a list of available colormaps, see https://github.com/bhaskarvk/colormap.
### Default colormap is rainbow_soft, but this can be changed using the clone_cmap argument.
jet_freq_frame <- get_evofreq(size_df, clones, parents, clone_cmap = "jet")
jet_evo_p <- plot_evofreq(jet_freq_frame)

### Can color each clone by an attribute by providing a fill_value. Default colormap is viridis, but this can be changed using the clone_cmap argument
fitness <- runif(length(clones))
fitness_freq_frame <- get_evofreq(size_df, clones, parents, fill_value = fitness)
fitness_evo_p <- plot_evofreq(fitness_freq_frame)

### The user can also provide custom colors for each clone, which need to be passed to the fill_value argument
### Custom colors can be defined using RGB values. Each color should be a string specifying the color channel values, separated by commas.
rgb_clone_colors <- sapply(seq_along(clones), function(x) {
  paste(sample(0:255, size = 3, replace = TRUE), collapse = ",")
})
rgb_freq_frame <- get_evofreq(size_df, clones, parents, fill_value = rgb_clone_colors)
rgb_evo_p <- plot_evofreq(rgb_freq_frame)

### Custom colors can also be any of the named colors in R. A list of the colors can be found with colors()
named_clone_colors <- sample(colors(), length(clones), replace = FALSE)
named_freq_frame <- update_colors(rgb_freq_frame, clones = clones, fill_value = named_clone_colors)
named_evo_p <- plot_evofreq(named_freq_frame)

### Custom colors can also be specified using hexcode
hex_clone_colors <- c("#614099ff", "#1d347eff", "#94558aff", "#c96872ff", "#f1884dff", "#e8fa5bff", "#042333ff","#f9bb41ff")
hex_freq_frame <- update_colors(rgb_freq_frame, clones = clones, fill_value = hex_clone_colors)
hex_evo_p <- plot_evofreq(hex_freq_frame)

### Can revert back to original colors
freq_frame_default_color <- update_colors(fitness_freq_frame, clones=clones)
default_cmap_evo_p <- plot_evofreq(freq_frame_default_color)

MathOnco/EvoFreq documentation built on Jan. 26, 2022, 7:31 p.m.