For all sequences in one or more clusters, calculate the frequency of individual words in either the sequence definition lines or the reported feature names.
calc_wrdfrq(phylota, cid, min_frq = 0.1, min_nchar = 1, type = c("dfln",
  "nm"), ignr_pttrn = "[^a-z0-9]")
phylota | Phylota object
cid | Cluster ID(s)
min_frq | Minimum frequency
min_nchar | Minimum number of characters for a word
type | Definitions (dfln) or features (nm)
ignr_pttrn | Ignore pattern, REGEX for text to ignore.
By default, anything that is not alphanumeric is ignored. 'dfln' and 'nm' match the slot names in a SeqRec, see list_seqrec_slots().
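The filtering described above can be illustrated with a minimal base R sketch. This is not the phylotaR implementation: `word_freq_sketch` is a hypothetical helper, the definition lines are invented for illustration, and it assumes that "frequency" means a word's share of all retained words, which the source does not state.

```r
# Hypothetical helper, NOT part of phylotaR: word frequencies for a character vector.
# Assumption: frequency = count of a word / total number of retained words.
word_freq_sketch <- function(txts, min_frq = 0.1, min_nchar = 1,
                             ignr_pttrn = "[^a-z0-9]") {
  # Split each text on whitespace, then lowercase the words
  wrds <- tolower(unlist(strsplit(txts, "\\s+")))
  # Delete every character matching the ignore pattern
  wrds <- gsub(ignr_pttrn, "", wrds)
  # Drop empty strings and words shorter than min_nchar
  wrds <- wrds[nchar(wrds) >= max(min_nchar, 1)]
  # Share of all retained words, most frequent first
  frqs <- sort(table(wrds) / length(wrds), decreasing = TRUE)
  # Keep only words at or above the minimum frequency
  frqs[frqs >= min_frq]
}

# Invented definition lines, for illustration only
dflns <- c("Anax imperator cytochrome oxidase subunit I (COI) gene",
           "Aeshna cyanea COI gene, partial cds",
           "Libellula depressa 16S ribosomal RNA gene")
frqs <- word_freq_sketch(dflns, min_frq = 0.1)
print(frqs)
```

In this toy input, "gene" is the most frequent word (3 of the 20 retained words), and words that occur only once fall below `min_frq` and are dropped.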
list
Other tools-public: calc_mad, drop_by_rank, drop_clstrs, drop_sqs, get_clstr_slot, get_nsqs, get_ntaxa, get_sq_slot, get_stage_times, get_tx_slot, get_txids, is_txid_in_clstr, is_txid_in_sq, list_clstrrec_slots, list_ncbi_ranks, list_seqrec_slots, list_taxrec_slots, plot_phylota_pa, plot_phylota_treemap, read_phylota, write_sqs
data('dragonflies')
# work out what gene region the cluster is likely representing with word freqs.
random_cids <- sample(dragonflies@cids, 10)
# most frequent words in definition line
(calc_wrdfrq(phylota = dragonflies, cid = random_cids, type = 'dfln'))
# most frequent words in feature name
(calc_wrdfrq(phylota = dragonflies, cid = random_cids, type = 'nm'))