find_motifs: Finding motifs in sequences

View source: R/find_motifs.R

find_motifs    R Documentation

Description

This function searches for, and simultaneously counts, continuous and discontinuous motifs in a vector of sequences. It is used by the turbo_gliph function to identify local similarities.

Usage

find_motifs(seqs, q = 2:4, kmer_mindepth = NULL, discontinuous = FALSE)

Arguments

seqs

character vector. This vector must contain the sequences whose motifs are to be identified and quantified.

q

numeric vector. The motif lengths to search for. By default 2:4, i.e. motifs of length 2, 3 and 4.

kmer_mindepth

numeric. Documented default 3, although the signature under Usage declares NULL. Minimum number of times a k-mer must be observed in the sample set for it to be considered for return.

discontinuous

logical. By default FALSE. Determines whether discontinuous motifs are to be considered.
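
To make the roles of q and kmer_mindepth concrete, the following is a minimal base R sketch of contiguous k-mer counting with a minimum-depth filter. It is not the implementation used by find_motifs: the helper name count_kmers_sketch, its min_depth argument, the output column names and the toy sequences are all hypothetical, it counts every occurrence of a k-mer (the package may count differently), and it does not cover discontinuous motifs.

```r
# Hypothetical helper, NOT part of turboGliph: an illustration only.
# Counts every occurrence of each contiguous k-mer across all sequences.
count_kmers_sketch <- function(seqs, q = 2:4, min_depth = 1) {
  kmers <- unlist(lapply(q, function(k) {
    unlist(lapply(seqs, function(s) {
      n <- nchar(s)
      if (n < k) return(character(0))      # sequence too short for this k
      starts <- seq_len(n - k + 1)         # every possible start position
      substring(s, starts, starts + k - 1) # all windows of width k
    }))
  }))
  counts <- table(kmers)
  counts <- counts[counts >= min_depth]    # drop k-mers seen too rarely
  data.frame(motif = as.character(names(counts)),
             freq = as.integer(counts),
             stringsAsFactors = FALSE)
}

# Toy sequences (made up for this sketch)
toy <- c("CASSLG", "CASSPG", "CATSLG")
out <- count_kmers_sketch(toy, q = 3, min_depth = 2)
print(out)
```

With these toy sequences, only the 3-mers that occur at least twice (ASS, CAS and SLG) pass the filter; raising min_depth to 3 would leave an empty result.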

Value

find_motifs returns a data frame with two columns. The first column contains the motifs and the second column the frequency of the motifs.
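
Because only the column order is documented here, a cautious way to work with the result is to address columns by position. The sketch below uses a made-up stand-in data frame (res_toy, with hypothetical column names and values) in place of a real find_motifs result, so it runs without the package.

```r
# Stand-in for a find_motifs() result; names and values are made up.
res_toy <- data.frame(motif = c("ASS", "SLG", "SSP"),
                      freq = c(12L, 30L, 4L),
                      stringsAsFactors = FALSE)

# Sort by the second column (frequency), most frequent first.
ranked <- res_toy[order(res_toy[[2]], decreasing = TRUE), ]

# Keep only motifs observed at least 10 times.
frequent <- ranked[ranked[[2]] >= 10, ]
print(frequent)
```

The same two steps should apply to an actual result by substituting it for res_toy, assuming the two-column layout described above.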

Examples

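library(turboGliph)  # attach the package so the data set and find_motifs() resolve
utils::data("gliph_input_data")
sample_seqs <- base::as.character(gliph_input_data$CDR3b)
# search with the default motif lengths (q = 2:4)
res <- find_motifs(seqs = sample_seqs)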


HetzDra/turboGliph documentation built on Oct. 2, 2022, 2:22 a.m.