biogram: N-Gram Analysis of Biological Sequences

Tools for extraction and analysis of various n-grams (k-mers) derived from biological sequences (proteins or nucleic acids). Contains QuiPT (quick permutation test) for fast feature-filtering of the n-gram data.
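
To make the idea of n-gram (k-mer) extraction concrete, here is a minimal sketch in Python. It is not biogram's R API: the names `kmer_counts` and `presence` are hypothetical and exist only for this illustration, and only contiguous n-grams are handled (no gaps, no positional information).

```python
from collections import Counter


def kmer_counts(sequences, n):
    """Count contiguous n-grams (k-mers) in each sequence.

    Hypothetical helper for illustration; returns one Counter per sequence.
    """
    counts = []
    for seq in sequences:
        # Slide a window of width n along the sequence.
        grams = (seq[i:i + n] for i in range(len(seq) - n + 1))
        counts.append(Counter(grams))
    return counts


def presence(counts):
    """Reduce counts to presence/absence: 1 for every n-gram that occurs."""
    return [{gram: 1 for gram in c} for c in counts]


seqs = ["ACGTAC", "TTACGA"]
per_seq = kmer_counts(seqs, 2)
binary = presence(per_seq)
```

Presence/absence features of this kind are what a feature-filtering step would typically consume; whether and how the package binarizes its data is documented in its own manual.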

Author
Michal Burdukiewicz [cre, aut], Piotr Sobczyk [aut], Chris Lauber [aut]
Date of publication
2016-09-21 12:21:03
Maintainer
Michal Burdukiewicz <michalburdukiewicz@gmail.com>
License
GPL-3
Version
1.3

Man pages

aaprop
Normalized amino acids properties
add_1grams
Add 1-grams
as.data.frame.feature_test
Coerce feature_test object to a data frame
binarize
Binarize
biogram-package
biogram - analysis of biological sequences using n-grams
calc_criterion
Calculate value of criterion
calc_cs
Calculate Chi-squared-based measure
calc_ed
Calculate encoding distance
calc_ig
Calculate IG for single feature
calc_kl
Calculate KL divergence of features
calc_si
Compute similarity index
cluster_reg_exp
Clustering of sequences based on regular expression
code_ngrams
Code n-grams
construct_ngrams
Construct and filter n-grams
count_multigrams
Detect and count multiple n-grams in sequences
count_ngrams
Count n-grams in sequences
count_specified
Count specified n-grams
count_total
Count total number of n-grams
create_feature_target
Create feature according to given contingency matrix
create_ngrams
Get all possible n-Grams
criterion_distribution
criterion_distribution class
cut.feature_test
Categorize tested features
decode_ngrams
Decode n-grams
degenerate
Degenerate protein sequence
distr_crit
Compute criterion distribution
encoding2df
Convert encoding to data frame
fast_crosstable
Very fast 2d cross-tabulation
feature_test
feature_test class
gap_ngrams
Gap n-grams
get_ngrams_ind
Get indices of n-grams
human_cleave
Human signal peptides cleavage sites
is_ngram
Validate n-gram
l2n
Convert letters to numbers
list2matrix
Convert list of sequences to matrix
n2l
Convert numbers to letters
ngrams2df
n-grams to data frame
plot.criterion_distribution
Plot criterion distribution
position_ngrams
Position n-grams
print.feature_test
Print tested features
seq2ngrams
Extract n-grams from sequence
summary.feature_test
Summarize tested features
table_ngrams
Tabulate n-grams
test_features
Permutation test for feature selection
validate_encoding
Validate encoding
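
The entries for calc_ig and test_features refer to information gain and to a permutation test for feature selection. The sketch below shows the general idea for one binary feature and a binary target. It is a plain Monte Carlo permutation test written in Python, not the package's QuiPT algorithm or its R interface; the function names, the `times` and `seed` parameters, and the toy data are assumptions made for this illustration.

```python
import math
import random


def entropy(labels):
    """Shannon entropy (in bits) of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)


def information_gain(target, feature):
    """Entropy of the target minus its entropy conditional on the feature."""
    n = len(target)
    gain = entropy(target)
    for value in (0, 1):
        subset = [t for t, f in zip(target, feature) if f == value]
        gain -= len(subset) / n * entropy(subset)
    return gain


def permutation_p_value(target, feature, times=1000, seed=0):
    """Estimate a p-value by shuffling the target labels `times` times."""
    rng = random.Random(seed)
    observed = information_gain(target, feature)
    shuffled = list(target)
    hits = 0
    for _ in range(times):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point ties.
        if information_gain(shuffled, feature) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (times + 1)


# Toy data: the feature matches the target exactly.
target = [1] * 10 + [0] * 10
feature = list(target)
p_value = permutation_p_value(target, feature)
```

A feature would be kept when its estimated p-value falls below a chosen threshold. With many n-gram features, some multiple-testing adjustment would usually be applied as well.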

Files in this package

biogram
biogram/inst
biogram/inst/CITATION
biogram/inst/doc
biogram/inst/doc/overview.R
biogram/inst/doc/overview.html
biogram/inst/doc/overview.Rmd
biogram/tests
biogram/tests/testthat
biogram/tests/testthat/test_create_ngrams.R
biogram/tests/testthat/test_table_ngrams.R
biogram/tests/testthat/test_crosstable.R
biogram/tests/testthat/test_position_ngrams.R
biogram/tests/testthat/test_seq2grams.R
biogram/tests/testthat/test_is_ngram.R
biogram/tests/testthat/test_quipt_consistency.R
biogram/tests/testthat/test_count_ngrams.R
biogram/tests/testthat/test_calc_ed.R
biogram/tests/testthat/test_get_ngrams_pos.R
biogram/tests/test-all.R
biogram/NAMESPACE
biogram/CHANGELOG
biogram/data
biogram/data/human_cleave.rda
biogram/data/aaprop.rda
biogram/R
biogram/R/count_ngrams.R
biogram/R/indices_and_positions.R
biogram/R/position_ngrams.R
biogram/R/test_features.R
biogram/R/calc_ed.R
biogram/R/human_cleave.R
biogram/R/table_ngrams.R
biogram/R/information_gain.R
biogram/R/seq2matrix.R
biogram/R/count_specified.R
biogram/R/utilities.R
biogram/R/aaprop.R
biogram/R/ngram_coding.R
biogram/R/criterion_distribution.R
biogram/R/kl_divergence.R
biogram/R/feature_test_class.R
biogram/R/construct_ngrams.R
biogram/R/criterions.R
biogram/R/biogram.R
biogram/R/distr_crit.R
biogram/R/add_remove_ngrams.R
biogram/R/calc_si.R
biogram/R/cluster_reg_exp.R
biogram/R/count_multigrams.R
biogram/R/data_manipulation.R
biogram/R/is_ngram.R
biogram/R/chi_square.R
biogram/R/degenerate.R
biogram/R/ngrams.R
biogram/vignettes
biogram/vignettes/overview.Rmd
biogram/README.md
biogram/MD5
biogram/build
biogram/build/vignette.rds
biogram/DESCRIPTION
biogram/man
biogram/man/calc_ed.Rd
biogram/man/create_feature_target.Rd
biogram/man/list2matrix.Rd
biogram/man/table_ngrams.Rd
biogram/man/get_ngrams_ind.Rd
biogram/man/seq2ngrams.Rd
biogram/man/calc_ig.Rd
biogram/man/cluster_reg_exp.Rd
biogram/man/calc_cs.Rd
biogram/man/degenerate.Rd
biogram/man/count_total.Rd
biogram/man/count_multigrams.Rd
biogram/man/plot.criterion_distribution.Rd
biogram/man/as.data.frame.feature_test.Rd
biogram/man/calc_criterion.Rd
biogram/man/binarize.Rd
biogram/man/position_ngrams.Rd
biogram/man/summary.feature_test.Rd
biogram/man/is_ngram.Rd
biogram/man/calc_si.Rd
biogram/man/construct_ngrams.Rd
biogram/man/fast_crosstable.Rd
biogram/man/cut.feature_test.Rd
biogram/man/count_specified.Rd
biogram/man/human_cleave.Rd
biogram/man/create_ngrams.Rd
biogram/man/calc_kl.Rd
biogram/man/gap_ngrams.Rd
biogram/man/code_ngrams.Rd
biogram/man/criterion_distribution.Rd
biogram/man/validate_encoding.Rd
biogram/man/count_ngrams.Rd
biogram/man/l2n.Rd
biogram/man/print.feature_test.Rd
biogram/man/encoding2df.Rd
biogram/man/test_features.Rd
biogram/man/decode_ngrams.Rd
biogram/man/feature_test.Rd
biogram/man/biogram-package.Rd
biogram/man/add_1grams.Rd
biogram/man/ngrams2df.Rd
biogram/man/n2l.Rd
biogram/man/aaprop.Rd
biogram/man/distr_crit.Rd