scelestial: Infer the single-cell phylogenetic tree

scelestial R Documentation

Infer the single-cell phylogenetic tree

Description

Runs the Scelestial algorithm, which reconstructs a phylogenetic tree using an approximation algorithm for the Steiner tree problem.

Usage

scelestial(
  seq,
  mink = 3,
  maxk = 3,
  root.assign.method = c("none", "balance", "fix"),
  root = NULL,
  return.graph = FALSE
)

Arguments

seq

The sequence matrix. Rows represent loci and columns represent samples. Each element is either a 10-state genome sequencing result or a missing value. That is, each element has the format "X/Y", where X and Y are from the set {A, T, C, G}. The special value "./." represents a missing value.
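
As an illustration of the element format described above, the following base R sketch builds a small toy matrix and checks every entry against the "X/Y" or "./." pattern. The matrix contents, the dimnames, and the names seq_toy and is_valid are hypothetical and chosen only for this example; whether scelestial requires row or column names is not stated here.

```r
# Hypothetical toy input: 3 loci (rows) x 4 samples (columns).
seq_toy <- matrix(
  c("A/A", "A/C", "./.", "A/A",
    "C/C", "C/C", "C/T", "./.",
    "G/G", "./.", "G/G", "G/A"),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("locus", 1:3), paste0("cell", 1:4))
)

# An element is valid if it is "X/Y" with X, Y in {A, T, C, G},
# or the missing-value marker "./.".
is_valid <- grepl("^([ATCG]/[ATCG]|\\./\\.)$", seq_toy)
dim(is_valid) <- dim(seq_toy)  # grepl returns a flat vector; restore the shape

all_valid <- all(is_valid)          # TRUE when every element is well formed
n_missing <- sum(seq_toy == "./.")  # number of missing entries
```

A check like this can be run before calling scelestial to catch malformed genotype strings early.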

mink

The minimum k used in the calculation of k-restricted Steiner trees. It is expected to be 3.

maxk

The maximum k used in the calculation of k-restricted Steiner trees. When maxk=3, the approximation algorithm produces an 11/6-approximation result. Increasing maxk increases the running time and improves the approximation ratio of the algorithm. maxk must not be less than mink.

root.assign.method, root

root.assign.method is the method for choosing the root of the tree. root is the node used as the root when root.assign.method is "fix".

  • "none" for an undirected tree.

  • "fix" for a tree rooted at the node given in root.

  • "balance" to choose the root that produces the most balanced tree.
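
The Usage section lists the choices for root.assign.method as a character vector default. A common R idiom for such defaults is match.arg, under which omitting the argument selects the first choice. The sketch below shows that idiom with a hypothetical stand-in function, choose_root_method. It is an assumption that scelestial resolves the argument this way, and the check that root is supplied for "fix", as well as the root value "0", are also assumptions made for illustration.

```r
# Hypothetical stand-in that mirrors only the argument signature of scelestial();
# it does not call the package.
choose_root_method <- function(root.assign.method = c("none", "balance", "fix"),
                               root = NULL) {
  # match.arg returns the first choice when the argument is left at its default.
  method <- match.arg(root.assign.method)
  # Assumed check: a fixed root only makes sense if a root node is given.
  if (method == "fix" && is.null(root)) {
    stop("root must be supplied when root.assign.method = \"fix\"")
  }
  method
}

default_method <- choose_root_method()               # resolves to "none"
fixed_method <- choose_root_method("fix", root = "0")  # "0" is a hypothetical node name
```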

return.graph

If TRUE, a graph object from the igraph library is also generated and returned.

Value

Returns a list containing the following elements:

  • tree: A data frame representing edges of the tree. tree$src is the source of the edge, tree$dest represents the destination of the edge, and tree$len represents its weight (evolutionary distance).

  • input: input sequences.

  • sequence: inferred or imputed sequences for the tree nodes. For a node that is in the input, sequence is the input sequence with any missing values imputed. For a node that is not in the input, sequence is the sequence inferred for that tree node.

  • graph: present only if return.graph is TRUE. It represents the tree as a graph object from the igraph library.
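
The tree element is an ordinary edge-list data frame, so it can be summarized with base R. The sketch below is a minimal illustration using the edge values copied from the $tree element of the expected output in the Examples section. Storing node identifiers as numbers and taking the input nodes to be 0 to 9 follow that output and are assumptions about the general case.

```r
# Edge list copied from the $tree element of the expected output.
tree <- data.frame(
  src  = c(9, 8, 7, 0, 6, 1, 3, 0, 16, 0, 4, 5),
  dest = c(10, 10, 10, 10, 16, 16, 18, 18, 18, 2, 6, 6),
  len  = c(4.00006, 3.00006, 2.50005, 1.50003, 3.00002, 2.50005,
           2.50003, 1.50003, 1.00000, 3.50008, 4.00007, 4.50010)
)
input_nodes <- 0:9  # node ids listed under $input in the expected output

# Total evolutionary distance over all edges of the tree.
total_length <- sum(tree$len)

# Every node that appears as an endpoint of some edge.
all_nodes <- sort(unique(c(tree$src, tree$dest)))

# Nodes in the tree that are not input samples, i.e. inferred nodes.
inferred_nodes <- setdiff(all_nodes, input_nodes)

# Number of edges touching each node.
node_degree <- table(c(tree$src, tree$dest))
```

Here the inferred nodes are 10, 16, and 18, matching the extra rows under $sequence in the expected output, and the edge count is one less than the node count, as expected for a tree.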

Examples

## Simulate tumor evolution
S = synthesis(10, 10, 2, seed=7)
## Convert to a 10-state matrix
seq = as.ten.state.matrix(S$sequence)
## Run scelestial with k-restricted Steiner trees for k up to 4,
## returning both the tree and the igraph graph
SP = scelestial(seq, mink=3, maxk=4, return.graph = TRUE)
SP
## Expected output: 
# $input
#    node   sequence
# 1     0 AAXACAAXXA
# 2     1 AXXXAXAAXA
# 3     2 AXAXCAXXAX
# 4     3 AXCCCAXAAX
# 5     4 AXCXAXXCAX
# 6     5 XXCAXXXXXX
# 7     6 XACXACAAAC
# 8     7 AXAXXAXAXA
# 9     8 AXAAXXAXXX
# 10    9 AAXXXXCXCX
#
# $sequence
#    node   sequence
# 1     0 AAAACAAACA
# 2     1 AACAAAAAAA
# 3     2 AAAACAAAAA
# 4     3 AACCCAAAAA
# 5     4 AACAACACAC
# 6     5 AACAACAAAC
# 7     6 AACAACAAAC
# 8     7 AAAACAAACA
# 9     8 AAAACAAACA
# 10    9 AAAACACACA
# 11   10 AAAACAAACA
# 12   16 AACAAAAAAA
# 13   18 AACACAAAAA
#
# $tree
#    src dest     len
# 1    9   10 4.00006
# 2    8   10 3.00006
# 3    7   10 2.50005
# 4    0   10 1.50003
# 5    6   16 3.00002
# 6    1   16 2.50005
# 7    3   18 2.50003
# 8    0   18 1.50003
# 9   16   18 1.00000
# 10   0    2 3.50008
# 11   4    6 4.00007
# 12   5    6 4.50010
#
# $graph
# IGRAPH 6ba60f3 DNW- 13 12 --
# + attr: name (v/c), weight (e/n)
# + edges from 6ba60f3 (vertex names):
#  [1] 9 ->10 8 ->10 7 ->10 0 ->10 6 ->16 1 ->16 3 ->18 0 ->18 16->18 0 ->2
# [11] 4 ->6  5 ->6
#

RScelestial documentation built on May 29, 2024, 9:41 a.m.