README.md
In DCLEAR: Distance Based Cell Lineage Reconstruction

Distance based Cell LinEAge Reconstruction (DCLEAR)

Il-Youp Kwak (ikwak2@cau.ac.kr) and Wuming Gong (gongx030@umn.edu)

R/DCLEAR is an R package for Distance based Cell LinEAge Reconstruction(DCLEAR). These codes are created during the participation of Cell Lineage Reconstruction DREAM challenge.

Figure 1. Overview of DCLEAR modeling architecture. Our model is divided into two parts, 1) estimating distance between cells and 2) constructing tree using distance matrix. drawing

Naive approach would be the hamming distance that simply calculate the edit distance.

drawing

However, the previous approach assume every base difference have same weights. For example, two sequences, '00AB0' and '0-CB0', are different at second and third positions. The second position, we have '0' and '-', and the third position, we have 'A' and 'C'.

For '0' and '-', '-' is point missing and it is possibly '0'. Thus it should have lower weight. For 'A' and 'C', During the cell propagation, '0' differentiated to 'A' and '0' differentiated to 'C'. Thus it should have larger weight. We can assign weights as below equation.

drawing

And we can approximate unknown weights using training data.

drawing DCLEAR also implements a k-mer replacement distance (KRD), which does not require training data. KRD method first looks at mutations in the character arrays to estimate the parameters of the generative process associated with the tree to be reconstructed. With these parameters, we repetitively simulated trees with a size and mutation distribution similar to the target tree. The k-mer replacement distances were estimated from the simulated lineage trees and used to compute the distances between input sequences in the character arrays of internal nodes and tips. Specifically, by examining the simulated lineage trees, KRD estimated the expected 1-mer replacement distance between characters in the array (including ground state “0” and deletion state “-“) in the lineage trees and the probability for a given nodal distance of replacing a character in a cell array. To extend the 1-mer replacement distance to the k-mer replacement distance, the posterior probability distributions of k-mer replacement distance were estimated by using a conditional model considering a dependance for the concurrence of mutations. We found that by considering the neighboring characters, the conditional model can more accurately estimate the nodal distance than an independent model. The cell distance can then be readily computed as the mean expected k-mer replacement distance.

With the previously proposed distance matric, we can construct distance matrix among cells. We can apply tree construction algorithms such as Neighbor-Joining(NJ), FastME.

How to use weighted hamming : Link
How to use kmer_replacement : Colab Link
Preparation for subchallenge 2 submission : Colab Link
Preparation for subchallenge 3 submission : link

With 'devtools':

devtools::install_github("ikwak2/DCLEAR")

The R/DCLEAR package is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, version 3, as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.

A copy of the GNU General Public License, version 3, is available at https://www.r-project.org/Licenses/GPL-3

Our talk on the special DREAM session in RECOMB 2020 meeting (https://www.recomb2020.org/) can be found here.

Any scripts or data that you put into this service are public.

DCLEAR documentation built on Sept. 14, 2023, 9:09 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

DCLEAR
Distance Based Cell Lineage Reconstruction

README.md
In DCLEAR: Distance Based Cell Lineage Reconstruction

Distance based Cell LinEAge Reconstruction (DCLEAR)

DCLEAR Overview

Estimating distance between cells

K-mer replacement distance

Constructing tree from the distance matrix

Usage

installation

License

Presentation

Try the DCLEAR package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

DCLEAR Distance Based Cell Lineage Reconstruction

README.md In DCLEAR: Distance Based Cell Lineage Reconstruction

Distance based Cell LinEAge Reconstruction (DCLEAR)

DCLEAR Overview

Estimating distance between cells

K-mer replacement distance

Constructing tree from the distance matrix

Usage

installation

License

Presentation

Try the DCLEAR package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

DCLEAR
Distance Based Cell Lineage Reconstruction

README.md
In DCLEAR: Distance Based Cell Lineage Reconstruction