cdhit_ccdb: Use 'cdhit()' to cluster a 'ContigCellDB()'


View source: R/cdhit-methods.R

Description

Use cdhit() to cluster a ContigCellDB()

Usage

cdhit_ccdb(
  ccdb,
  sequence_key,
  type = c("DNA", "AA"),
  cluster_pk = "cluster_idx",
  ...
)

Arguments

ccdb

An object of class ContigCellDB()

sequence_key

character naming the column in the contig_tbl containing the sequence to be clustered

type

one of 'DNA' or 'AA'

cluster_pk

character specifying the key for, and the name of, the clustering.

...

Arguments passed on to cdhit

identity

minimum proportion identity

kmerSize

word size. If NULL, it will be chosen automatically based on the identity. You may need to lower it below 5 for amino acid sequences with identity less than 0.7.

min_length

Minimum length for sequences to be clustered. An error is raised if a shorter sequence is passed.

s

fraction of shorter sequence covered by alignment.

showProgress

show a status bar

Value

ContigCellDB()

See Also

cdhit()

Examples

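library(CellaRepertorium)
data(ccdb_ex)
# Cluster the CDR3 nucleotide sequences; G = 1 is passed on to cdhit() via ...
res <- cdhit_ccdb(ccdb_ex, 'cdr3_nt', type = 'DNA',
                  cluster_pk = 'DNA97', identity = .965, min_length = 12, G = 1)
res$cluster_tbl
res$contig_tbl
res$cluster_pk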
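
The same call can be pointed at amino acid sequences by setting type = 'AA'. The sketch below is illustrative rather than part of the package's own examples: it assumes that the contig_tbl of ccdb_ex has a column named 'cdr3' holding the amino acid CDR3 sequence, and the key name 'AA80', the identity of 0.8 and min_length = 5 are arbitrary choices made for the illustration. kmerSize is left at its default so that the word size is chosen automatically from the identity.

```r
library(CellaRepertorium)
data(ccdb_ex)

# Assumption: 'cdr3' is the amino acid CDR3 column of the contig_tbl.
# 'AA80', identity = 0.8 and min_length = 5 are illustrative values only.
res_aa <- cdhit_ccdb(ccdb_ex, 'cdr3', type = 'AA',
                     cluster_pk = 'AA80', identity = 0.8,
                     min_length = 5, G = 1)

# The clustering is stored under the key given in cluster_pk
res_aa$cluster_pk
head(res_aa$cluster_tbl)
```

If a lower identity is needed for amino acid sequences, kmerSize may have to be passed explicitly, as noted under Arguments.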

CellaRepertorium documentation built on Nov. 8, 2020, 7:48 p.m.