dca: 'dca' Direct Coupling Analysis (DCA) for amino acid and codon...

View source: R/DCA.R

Description

dca performs Direct Coupling Analysis (DCA) on amino acid and codon sequences.

Usage

dca(
  inputfile,
  pseudocount_weight = 0.5,
  theta = 0.2,
  seqid_or_excluded_indices = 1,
  nuc = F,
  fileType = "fasta",
  outputfile = NULL
)

Arguments

inputfile

file containing the alignment in FASTA format

pseudocount_weight

relative weight of pseudo count

theta

sequence identity threshold used in reweighting
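The role of theta can be illustrated with a minimal sketch of the sequence reweighting commonly used in DCA. This assumes the usual convention (each sequence is down-weighted by the number of sequences that differ from it in less than a fraction theta of the columns); the source does not confirm that this package implements it exactly this way. The function name sequence_weights and the toy alignment are hypothetical.

```r
# Toy alignment (hypothetical data): M = 4 sequences, N = 5 columns,
# residues coded as integers.
align <- rbind(
  c(1, 2, 3, 4, 5),
  c(1, 2, 3, 4, 5),    # identical to sequence 1
  c(1, 2, 3, 9, 6),    # differs from sequence 1 in 2 of 5 columns
  c(7, 8, 9, 10, 11)   # differs from all others in every column
)

# Hypothetical helper, not part of the package: weight of each sequence is
# 1 / (number of sequences, itself included, closer than theta).
sequence_weights <- function(align, theta = 0.2) {
  M <- nrow(align)
  N <- ncol(align)
  neighbours <- vapply(seq_len(M), function(m) {
    # fraction of mismatching columns between sequence m and every sequence
    ref <- matrix(align[m, ], nrow = M, ncol = N, byrow = TRUE)
    frac_diff <- rowSums(align != ref) / N
    sum(frac_diff < theta)
  }, integer(1))
  1 / neighbours
}

w <- sequence_weights(align, theta = 0.2)
Meff <- sum(w)   # effective number of sequences after reweighting
```

In this toy input the two identical sequences share their weight, so Meff is smaller than M.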

seqid_or_excluded_indices

Three options: 1) A single numeric value gives the index of the sequence in the MSA to use as a reference (1 is the default). An MSA column is either all dot + lowercase or all dash + uppercase, by the very definition of HMMer output. 2) A string gives the name or ID of the sequence in the MSA to use as a reference. 3) A vector of integers gives the columns to exclude from the analysis (no reference sequence is used in this case).
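The three accepted forms can be told apart by type and length. The sketch below only illustrates that distinction; describe_reference_arg is a hypothetical helper and the sequence id "seq_A" is made up. It is not the package's own dispatch logic.

```r
# Hypothetical helper, not part of the package: classify the value that
# would be passed as seqid_or_excluded_indices.
describe_reference_arg <- function(x) {
  if (is.character(x) && length(x) == 1) {
    "reference sequence by name"
  } else if (is.numeric(x) && length(x) == 1) {
    "reference sequence by index"
  } else if (is.numeric(x) && length(x) > 1) {
    "columns to exclude"
  } else {
    stop("unsupported value for seqid_or_excluded_indices")
  }
}

by_index   <- describe_reference_arg(1)               # option 1
by_name    <- describe_reference_arg("seq_A")         # option 2, made-up id
by_columns <- describe_reference_arg(c(3L, 7L, 8L))   # option 3
```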

nuc

TRUE for codon based analysis and FALSE (default) for amino acids based analysis.

fileType

input file type

outputfile

output file name to write results

Details

SOME RELEVANT VARIABLES RETURNED:

N: number of residues in each sequence (no insert)
M: number of sequences in the alignment
Meff: effective number of sequences after reweighting
q: equal to 21 (20 amino acids + 1 gap)
align: M x N matrix containing the alignment
Pij_true: N x N x q x q matrix containing the reweighted frequency counts
Pij: N x N x q x q matrix containing the reweighted frequency counts with pseudo counts
C: N(q-1) x N(q-1) matrix containing the covariance matrix
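The relation between these quantities can be written out as follows. This assumes the standard mean-field DCA formulation, which the source does not state explicitly. Here \lambda stands for pseudocount_weight, f_{ij}(a,b) for an entry of Pij_true, f_i(a) for the corresponding single-site frequency, and P_{ij}(a,b) for an entry of Pij; the symbols \lambda, f and P_i are notation introduced for this sketch.

```latex
\begin{align}
P_i(a)      &= (1-\lambda)\, f_i(a) + \frac{\lambda}{q} \\
P_{ij}(a,b) &= (1-\lambda)\, f_{ij}(a,b) + \frac{\lambda}{q^{2}}, \qquad i \neq j \\
C_{ij}(a,b) &= P_{ij}(a,b) - P_i(a)\, P_j(b), \qquad a, b = 1, \dots, q-1
\end{align}
```

Restricting a and b to q-1 states is what gives C its N(q-1) x N(q-1) size.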

Value

A list containing the data produced by this function and the DCA results. The results consist of N(N-1)/2 rows (N = length of the sequences) and 4 columns: residue i (column 1), residue j (column 2), MI(i,j) (mutual information between i and j), and DI(i,j) (direct information between i and j). Note: all insert columns are removed from the alignment.
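A results table of this shape is typically ranked by DI to find the most strongly coupled pairs. The sketch below builds a toy table with random values, since the source does not give the name of the list element holding the results or its column names; the names i, j, MI, DI and the min_sep filter are assumptions made for illustration.

```r
# Toy stand-in for the results table (random values, hypothetical column names).
N <- 5
pairs <- t(combn(N, 2))   # one row per pair i < j, N(N-1)/2 rows in total
set.seed(1)
toy <- data.frame(
  i  = pairs[, 1],
  j  = pairs[, 2],
  MI = runif(nrow(pairs)),
  DI = runif(nrow(pairs))
)

# Drop pairs that are adjacent in sequence, then rank the rest by DI.
min_sep <- 1
top <- toy[abs(toy$i - toy$j) > min_sep, ]
top <- top[order(top$DI, decreasing = TRUE), ]
head(top, 3)
```

Filtering out near-neighbours is a common choice in contact prediction, because adjacent residues tend to score highly regardless of any long-range coupling.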

See Also

https://github.com/etaijacob/AA2CODON if you want to generate the input files

Examples

# codon_file_name: path to a codon alignment file in FASTA format
cma.res <- dca(codon_file_name, seqid_or_excluded_indices = 1, nuc = TRUE, fileType = "fasta")

etaijacob/CMA documentation built on Dec. 27, 2019, 4:17 p.m.