kcount: Count kgrams

kcount R Documentation

Count kgrams

Description

kcount counts the occurrences of the specified k-grams in the given sequences. The k-grams can be specified as a character vector (grams), by length (k), or in a file (kfile).

Usage

kcount(
  seq = NULL,
  fafile = NULL,
  grams = NULL,
  k = NULL,
  kfile = NULL,
  mergeFile = FALSE,
  from = NA,
  to = NA,
  sort = TRUE,
  topn = 50,
  perc = TRUE
)

Arguments

seq

a DNAStringSet object.

fafile

FASTA file name(s). Only one of fafile and seq can be set. If several files are given, e.g. fafile=c(A='fA1.fa',B='fB2.fa'), the output depends on mergeFile=TRUE/FALSE. The column names are taken from names(fafile): A for the count and A_perc for the percentage. If names(fafile) is not provided, the output columns will be A1_perc, B2_perc.

grams

the k-grams to count. The value can be a character vector such as c('aataa',...), 'V1' (AATAAA and its 1-nt variants), or mouse/mm/mm10 (the poly(A) signals (PAS) of mouse).
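To make the 'V1' shorthand concrete, the following base R sketch builds AATAAA plus every sequence that differs from it at exactly one position. The helper one_nt_variants() is hypothetical and written only for this illustration; it is not the package's implementation, and the exact set that kcount uses for 'V1' may differ.

```r
# Hypothetical helper (not part of movAPA): all sequences that differ
# from `gram` at exactly one position.
one_nt_variants <- function(gram, bases = c("A", "C", "G", "T")) {
  chars <- strsplit(gram, "")[[1]]
  variants <- character(0)
  for (i in seq_along(chars)) {
    # Substitute each alternative base at position i
    for (b in setdiff(bases, chars[i])) {
      v <- chars
      v[i] <- b
      variants <- c(variants, paste(v, collapse = ""))
    }
  }
  unique(variants)
}

# AATAAA plus its 6 * 3 = 18 single-nucleotide variants
v1 <- c("AATAAA", one_nt_variants("AATAAA"))
```

A vector built this way could then be passed as grams, assuming seq is an existing DNAStringSet: kcount(seq=seq, grams=v1).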

k

the k-gram length. E.g., k=6 counts all hexamers.
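As an illustration of what counting "all k-grams" involves, here is a minimal base R sketch that enumerates the 4^k DNA k-grams and tallies them with a sliding window. The helpers all_kgrams() and count_windows() are hypothetical, operate on plain character strings rather than a DNAStringSet, and assume that overlapping occurrences are counted; kcount itself may work differently.

```r
# Hypothetical helper: all 4^k DNA k-grams in alphabetical order.
all_kgrams <- function(k) {
  bases <- c("A", "C", "G", "T")
  grid <- expand.grid(rep(list(bases), k), stringsAsFactors = FALSE)
  sort(do.call(paste0, unname(as.list(grid))))
}

# Hypothetical helper: count every length-k window in a character
# vector of sequences (overlapping windows are assumed to count).
count_windows <- function(seqs, k) {
  windows <- unlist(lapply(seqs, function(s) {
    n <- nchar(s)
    if (n < k) return(character(0))
    substring(s, 1:(n - k + 1), k:n)
  }))
  # Using a factor keeps k-grams with zero occurrences in the table
  table(factor(windows, levels = all_kgrams(k)))
}

counts <- count_windows(c("AATAAA", "TTAATA"), k = 3)
```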

kfile

a k-gram file with one k-gram per line and no header line. Only one of grams, k, and kfile can be provided.

mergeFile

if TRUE and multiple file names are given in fafile, the files are combined before counting.
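The difference between merged and per-file output can be sketched in base R as follows. This assumes that merging is equivalent to pooling the sequences of all files before counting; the list `files` stands in for the contents of two FASTA files, and count_gram() is a hypothetical helper, not a movAPA function.

```r
# Hypothetical helper: overlapping occurrences of one k-gram in a
# character vector of sequences.
count_gram <- function(seqs, gram) {
  k <- nchar(gram)
  sum(vapply(seqs, function(s) {
    n <- nchar(s)
    if (n < k) return(0L)
    sum(substring(s, 1:(n - k + 1), k:n) == gram)
  }, integer(1)))
}

# Stand-in for the sequences read from two FASTA files named A and B
files <- list(A = c("AATAAAGG", "CCAATAAA"), B = c("TTTTAATAAA"))
grams <- c("AATAAA", "ATTAAA")

# Like mergeFile=FALSE: one count column per file
per_file <- sapply(files, function(seqs) {
  sapply(grams, function(g) count_gram(seqs, g))
})

# Like mergeFile=TRUE (under the pooling assumption): one count per k-gram
merged <- sapply(grams, function(g) count_gram(unlist(files), g))
```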

from

from and to define the range of positions in the pA sequences within which k-grams are searched.

to

from and to define the range of positions in the pA sequences within which k-grams are searched.
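One way to picture the effect of from and to is as a substring taken from each sequence before counting. The sketch below assumes 1-based, inclusive positions and treats NA as "no restriction"; both are assumptions made for illustration, and restrict_range() is a hypothetical helper rather than part of the package.

```r
# Hypothetical helper: keep only positions from..to of each sequence.
# NA means no restriction on that side (an assumption of this sketch).
restrict_range <- function(seqs, from = NA, to = NA) {
  if (is.na(from)) from <- 1L
  if (is.na(to)) to <- max(nchar(seqs))
  substr(seqs, from, to)
}

region <- restrict_range("ACGTACGTAC", from = 3, to = 6)
whole <- restrict_range("ACGT")
```

Under these assumptions, from=260 and to=290 would limit the search to a 31-nt window of each sequence.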

sort

TRUE/FALSE, whether to sort kgrams by their counts.

topn

the number of top k-grams to output (default 50).

perc

whether to also output percentages.

Value

A data frame with columns [grams, count, <perc>] for a single input, or [grams, file1, ..., fileN, file1_perc, ..., fileN_perc] for multiple files.
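The single-input layout can be sketched in base R as below. The helper summarise_counts() is hypothetical, and the definition of perc used here (each count divided by the sum of all counts in the table) is an assumption; the package may compute it differently or scale it to 0-100.

```r
# Hypothetical helper: turn a named vector of counts into a data frame
# shaped like [grams, count, <perc>], optionally sorted and truncated.
summarise_counts <- function(counts, sort = TRUE, topn = 50, perc = TRUE) {
  df <- data.frame(grams = names(counts),
                   count = as.integer(counts),
                   stringsAsFactors = FALSE)
  # Assumed definition: fraction of all counted occurrences
  if (perc) df$perc <- df$count / sum(df$count)
  if (sort) df <- df[order(df$count, decreasing = TRUE), ]
  if (!is.null(topn)) df <- head(df, topn)
  rownames(df) <- NULL
  df
}

res <- summarise_counts(c(AATAAA = 30, ATTAAA = 10, AGTAAA = 60), topn = 2)
```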

See Also

Other APA signal functions: annotateByPAS(), faFromPACds(), getVarGrams(), plotATCGforFAfile(), plotSeqLogo()

Examples

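# `seq` is assumed to be an existing DNAStringSet, and the FASTA files
# 'updn.fa' and 'updn2.fa' are assumed to exist in the working directory.

# Count specific k-grams, all 3-grams, or all hexamers in a position range
kcount(seq=seq, grams=c('AATAAA','ATTAAA'))
kcount(seq=seq, k=3)
kcount(seq=seq, k=6, from=260, to=290)

# Read the sequences from a FASTA file instead
kcount(fafile='updn.fa', k=6, from=260, to=290)
kcount(fafile='updn.fa', grams='v1', from=260, to=290)
kcount(fafile='updn.fa', grams='mouse', from=240, to=310)

# Multiple files: separate columns per file (unnamed or named), or merged
kcount(fafile=c('updn.fa', 'updn2.fa'), grams='mouse', from=240, to=310, mergeFile=FALSE)
kcount(fafile=c(A='updn.fa', B='updn2.fa'), grams='mouse', from=240, to=310, mergeFile=FALSE)
kcount(fafile=c('updn.fa', 'updn2.fa'), grams='mouse', from=240, to=310, mergeFile=TRUE)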

BMILAB/movAPA documentation built on Jan. 3, 2024, 11:09 p.m.