biogram: N-Gram Analysis of Biological Sequences

Tools for extraction and analysis of various n-grams (k-mers) derived from biological sequences (proteins or nucleic acids). Contains QuiPT (quick permutation test) for fast feature-filtering of the n-gram data.
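
To make the term concrete, here is a minimal sketch of what counting contiguous n-grams (k-mers) in a sequence means. It is written in plain Python and is not biogram's own API: the function name `count_kmers` and the example sequence are assumptions made for illustration only. The package's R functions, such as count_ngrams, also cover gapped and position-specific n-grams, which this sketch does not attempt.

```python
from collections import Counter


def count_kmers(sequence: str, n: int) -> Counter:
    """Count every contiguous n-gram (k-mer) in a sequence.

    Hypothetical helper for illustration; not part of biogram.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Slide a window of width n over the sequence, one position at a time.
    # A sequence shorter than n yields an empty Counter.
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


if __name__ == "__main__":
    # Made-up nucleotide sequence, used only to show the call.
    for kmer, count in sorted(count_kmers("ACGTACGA", 2).items()):
        print(kmer, count)
```

A sequence of length L contains L - n + 1 overlapping n-grams, so the counts for the eight-letter example above sum to seven.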

Author: Michal Burdukiewicz [cre, aut], Piotr Sobczyk [aut], Chris Lauber [aut]
Date of publication: 2017-01-06 01:18:55
Maintainer: Michal Burdukiewicz <michalburdukiewicz@gmail.com>
License: GPL-3
Version: 1.4
https://github.com/michbur/biogram

Man pages

aaprop: Normalized amino acids properties

add_1grams: Add 1-grams

as.data.frame.feature_test: Coerce feature_test object to a data frame

binarize: Binarize

biogram-package: biogram - analysis of biological sequences using n-grams

calc_criterion: Calculate value of criterion

calc_cs: Calculate Chi-squared-based measure

calc_ed: Calculate encoding distance

calc_ig: Calculate IG for single feature

calc_kl: Calculate KL divergence of features

calc_pi: Calculate partition index

calc_si: Compute similarity index

cluster_reg_exp: Clustering of sequences based on regular expression

code_ngrams: Code n-grams

construct_ngrams: Construct and filter n-grams

count_multigrams: Detect and count multiple n-grams in sequences

count_ngrams: Count n-grams in sequences

count_specified: Count specified n-grams

count_total: Count total number of n-grams

create_encoding: Create encoding

create_feature_target: Create feature according to given contingency matrix

create_ngrams: Get all possible n-Grams

criterion_distribution: criterion_distribution class

cut.feature_test: Categorize tested features

decode_ngrams: Decode n-grams

degenerate: Degenerate protein sequence

distr_crit: Compute criterion distribution

encoding2df: Convert encoding to data frame

fast_crosstable: Very fast 2d cross-tabulation

feature_test: feature_test class

gap_ngrams: Gap n-grams

get_ngrams_ind: Get indices of n-grams

human_cleave: Human signal peptides cleavage sites

is_ngram: Validate n-gram

l2n: Convert letters to numbers

list2matrix: Convert list of sequences to matrix

n2l: Convert numbers to letters

ngrams2df: n-grams to data frame

plot.criterion_distribution: Plot criterion distribution

position_ngrams: Position n-grams

print.feature_test: Print tested features

seq2ngrams: Extract n-grams from sequence

summary.feature_test: Summarize tested features

table_ngrams: Tabulate n-grams

test_features: Permutation test for feature selection

validate_encoding: Validate encoding

Files in this package

biogram
biogram/inst
biogram/inst/CITATION
biogram/inst/doc
biogram/inst/doc/overview.R
biogram/inst/doc/overview.html
biogram/inst/doc/overview.Rmd
biogram/tests
biogram/tests/testthat
biogram/tests/testthat/test_create_ngrams.R
biogram/tests/testthat/test_table_ngrams.R
biogram/tests/testthat/test_crosstable.R
biogram/tests/testthat/test_position_ngrams.R
biogram/tests/testthat/test_seq2grams.R
biogram/tests/testthat/test_is_ngram.R
biogram/tests/testthat/test_quipt_consistency.R
biogram/tests/testthat/test_count_ngrams.R
biogram/tests/testthat/test_calc_ed.R
biogram/tests/testthat/test_get_ngrams_pos.R
biogram/tests/test-all.R
biogram/NAMESPACE
biogram/CHANGELOG
biogram/data
biogram/data/human_cleave.rda
biogram/data/aaprop.rda
biogram/R
biogram/R/count_ngrams.R
biogram/R/create_encoding.R
biogram/R/indices_and_positions.R
biogram/R/position_ngrams.R
biogram/R/test_features.R
biogram/R/calc_ed.R
biogram/R/human_cleave.R
biogram/R/table_ngrams.R
biogram/R/information_gain.R
biogram/R/seq2matrix.R
biogram/R/count_specified.R
biogram/R/utilities.R
biogram/R/aaprop.R
biogram/R/ngram_coding.R
biogram/R/criterion_distribution.R
biogram/R/kl_divergence.R
biogram/R/feature_test_class.R
biogram/R/construct_ngrams.R
biogram/R/criterions.R
biogram/R/biogram.R
biogram/R/distr_crit.R
biogram/R/add_remove_ngrams.R
biogram/R/calc_si.R
biogram/R/cluster_reg_exp.R
biogram/R/count_multigrams.R
biogram/R/data_manipulation.R
biogram/R/is_ngram.R
biogram/R/chi_square.R
biogram/R/degenerate.R
biogram/R/ngrams.R
biogram/vignettes
biogram/vignettes/biogram_pub.bib
biogram/vignettes/overview.Rmd
biogram/README.md
biogram/MD5
biogram/build
biogram/build/vignette.rds
biogram/DESCRIPTION
biogram/man
biogram/man/calc_ed.Rd
biogram/man/create_feature_target.Rd
biogram/man/create_encoding.Rd
biogram/man/list2matrix.Rd
biogram/man/table_ngrams.Rd
biogram/man/get_ngrams_ind.Rd
biogram/man/seq2ngrams.Rd
biogram/man/calc_ig.Rd
biogram/man/cluster_reg_exp.Rd
biogram/man/calc_cs.Rd
biogram/man/degenerate.Rd
biogram/man/count_total.Rd
biogram/man/count_multigrams.Rd
biogram/man/calc_pi.Rd
biogram/man/plot.criterion_distribution.Rd
biogram/man/as.data.frame.feature_test.Rd
biogram/man/calc_criterion.Rd
biogram/man/binarize.Rd
biogram/man/position_ngrams.Rd
biogram/man/summary.feature_test.Rd
biogram/man/is_ngram.Rd
biogram/man/calc_si.Rd
biogram/man/construct_ngrams.Rd
biogram/man/fast_crosstable.Rd
biogram/man/cut.feature_test.Rd
biogram/man/count_specified.Rd
biogram/man/human_cleave.Rd
biogram/man/create_ngrams.Rd
biogram/man/calc_kl.Rd
biogram/man/gap_ngrams.Rd
biogram/man/code_ngrams.Rd
biogram/man/criterion_distribution.Rd
biogram/man/validate_encoding.Rd
biogram/man/count_ngrams.Rd
biogram/man/l2n.Rd
biogram/man/print.feature_test.Rd
biogram/man/encoding2df.Rd
biogram/man/test_features.Rd
biogram/man/decode_ngrams.Rd
biogram/man/feature_test.Rd
biogram/man/biogram-package.Rd
biogram/man/add_1grams.Rd
biogram/man/ngrams2df.Rd
biogram/man/n2l.Rd
biogram/man/aaprop.Rd
biogram/man/distr_crit.Rd