knitr::opts_chunk$set(echo = TRUE) library(qfm)
In this example, we demonstrate quantitative fate mapping on lineage barcode data measured with cell types. First, we apply Phylotime to reconstruct time-scaled phylogeny from lineage barcodes. Second, ICE-FASE is applied to extract parameters of progenitor population dynamics, including commitment times, progenitor population sizes and commitment biases.
# read mutant allele matrix cell_allele_table = readr::read_csv("../data/example/mutant_allele_matrix.csv") cell_allele_table[1:5]
The input to Phylotime is a character matrix, with rows corresponding to cells and columns corresponding to barcoding sites, tibble / data.frame are convert to a character matrix. The matrix needs to have rownames, which will be used as the names of the cells.
Within each barcoding site (column), the same character denotes a unique mutant allele. The unmutated allele needs to be coded as the character "0". (zero) The user also needs to input the total amount of active barcoding time, which is the duration for which mutations are being actively generated.
Phylotime estimates mutation rate from data provided and assumes a uniform allele emergence probability by default. More on how to set user-defined mutation parameters when prior information is available.
By default, Phylotime uses future to parallel pairwise distance estimations, use plan to set up parallel method and number of workers. Alternatively, set 'parallel=F'.
In this example, barcoding activates at 0.6 days, and samples are collected at 15 days.
chr_mat = as.matrix(cell_allele_table[-c(1:2)]) rownames(chr_mat) = cell_allele_table$cell library(furrr) plan(multisession, workers = 8) tr = phylotime(chr_mat[, 1:50], t_total = 15. - 0.6)
Now to visualize the reconstructed tree with the lineage barcodes:
plot_barcodes(chr_mat[, 1:50], tr, show_column_names = F)
To plot the reconstrcuted tree only:
plot_tr(tr, end_alpha_terminal = 0.05, edge_width_terminal = 0.02)
Next, to apply ICE-FASE, we need, in additional, the terminal cell state classifications. This is provided as a named vector, where the name correspond to cell names. We can visualzie the tree with terminal cell states:
sc_celltypes = cell_allele_table$type names(sc_celltypes) = cell_allele_table$cell print(sc_celltypes[1:10]) plot_barcodes(chr_mat[, 1:50], tr, tip_celltype = sc_celltypes, show_column_names = F)
To run ICE-FASE: ('root_time' is the time until barcode activation.)
res = ice_fase(tr, sc_celltypes, total_time = 15 - 0.6, root_time = 0.6)
With ICE-FASE fitted, a number of progenitor states are identified as nodes in the fate map topology. For future visualizations, we set the color scheme for the progenitor and terminal states. A default color palette is generated. If you wish to set you own color scheme, use the 'palette' argument. The vector of colors corresponds to the vector c(res$gr$node.label, res$gr$tip.label)
. For example, with 16 terminal states, 15 progenitor states were identified and 15 + 16 total colors need to be provided.
res = set_color_palette(res)
We can plot the reconstructed fate map topology with commitment times now. Note that the inferred progenitor states (iP) are denoted as "Node-x" below.
plot_gr(gr = res$gr, total_time = 15, gr_node_time = res$gr_trans_time, type_col = res$col_pal)
To visualize the internal node state assignment in reconstructed phylogeny:
plot_tr(res$tr, node_types = res$tr_node_assign, type_col = res$col_pal, end_alpha_terminal = 0.05, edge_width_terminal = 0.02)
To visualize distribution of Inferred Commitment Event (ICE) times:
plot_ice_times(res)
To visualize progenitor population sizes:
plot_node_sizes(res)
Finally, we can get a summary table for the progenitor state estimates:
output_estimates(res)
Most single cell lineage barcode data contains missing alleles for some cells, we use 'xgboost' predictions to impute the missing characters. Barcoding sites were imputed one by one going from the one with fewest cells missing to the most cells missing. The imputed character matrix can be used for downstream pipelines. The example below randomly dropped 10% of all the alleles.
set.seed(73) chr_mat_missing = rlang::duplicate(chr_mat[, 1:50]) missing_indices = sample(c(T, F), size = length(chr_mat[, 1:50]), prob = c(0.1, 0.9), replace = T) chr_mat_missing[missing_indices] = NA chr_mat_imputed = impute_characters(chr_mat_missing, nrounds = 20)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.