ngraMatrix: Compute n-Gram Frequencies Dataframe


View source: R/ngram.R

Description

Computes the n-gram frequencies dataframe for the proteins and species provided.

Usage

ngraMatrix(data, k = 4, silent = FALSE)

Arguments

data

a dataframe with as many columns as species and one row per orthologous protein. The rows and columns must be named accordingly.
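As a rough illustration of the expected layout, the sketch below builds a toy input by hand. The species names, protein names and sequences are invented, and it assumes that each cell holds an amino acid sequence as a character string; in practice the input would come from the package's own helpers rather than being typed in.

```r
# Toy input: 2 orthologous proteins (rows) x 3 species (columns).
# All names and sequences are invented for illustration only.
toy <- data.frame(
  sp_A = c("MKTAYIAKQR", "MSLLTEVETY"),
  sp_B = c("MKTAYIAKQK", "MSLLTEVETP"),
  sp_C = c("MKSAYIAKQR", "MSLLTEVDTY"),
  row.names = c("prot_1", "prot_2"),
  stringsAsFactors = FALSE
)

dim(toy)       # rows = proteins, columns = species
rownames(toy)  # protein names
colnames(toy)  # species names
```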

k

a positive integer, between 1 and 5, giving the length of the words (k-mers) to be counted.

silent

logical; set to TRUE to suppress the messages printed while the function runs.

Details

The argument data can be obtained using orth() and orth.seq().

Value

A list with two dataframes. The first has nsp * npr columns (nsp: number of species; npr: number of proteins per species) and npe rows (npe: number of possible peptides of length k, i.e. 20^k: 20 for k = 1, 400 for k = 2, 8000 for k = 3 and 160000 for k = 4). Each entry is the number of times the indicated peptide was counted in the given protein. Orthologous proteins occupy consecutive columns, so the first nsp columns are the orthologs of protein 1, and so on. The second dataframe contains the Species Vector Sums (each vector describes one species).
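To make the row dimension concrete, here is a minimal base-R sketch of how the counts for a single protein could be computed. The helper count_kmers() is hypothetical and not part of EnvNJ, and it assumes overlapping windows over the 20 standard amino acids; the package's actual implementation may differ.

```r
# Hypothetical helper (not part of EnvNJ): count overlapping k-mers
# in one amino acid sequence, over all 20^k possible words.
count_kmers <- function(sequence, k = 2) {
  aa <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]  # 20 standard amino acids
  # Enumerate every possible word of length k in a fixed order
  grid <- expand.grid(rep(list(aa), k), stringsAsFactors = FALSE)
  words <- do.call(paste0, grid)
  n <- nchar(sequence)
  if (n < k) {
    observed <- character(0)
  } else {
    starts <- seq_len(n - k + 1)
    observed <- substring(sequence, starts, starts + k - 1)
  }
  # Words with non-standard letters fall outside the levels and are dropped
  counts <- table(factor(observed, levels = words))
  setNames(as.integer(counts), words)
}

counts <- count_kmers("MKTAYIAKQR", k = 2)
length(counts)  # 400 possible dipeptides (20^2)
sum(counts)     # 9 overlapping windows in a 10-residue sequence
```

Each such vector would correspond to one column of the first dataframe, with one row per possible peptide.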

References

Stuart et al. Bioinformatics 2002; 18:100-108.

See Also

ngram(), svdgram()

Examples

ngraMatrix(bovids[,1:3], k = 2)

EnvNJ documentation built on Sept. 27, 2021, 5:07 p.m.
