CreatePssm: Compute profile scoring matrices for training data


Description

Computes new profile scoring matrices so that users can retrain SCORER 2.0 with their own training data.

Usage

CreatePssm(training.data, var)

Arguments

training.data

A data frame or matrix with three columns describing n coiled-coil sequences. The three columns must be named "sequence", "register" and "type"; their order does not matter.

  1. column "type": contains the known oligomeric state of the coiled-coil sequences in the training data. Acceptable oligomeric states are "DIMER" and "TRIMER" only.

  2. column "sequence": contains the amino-acid sequences of the coiled coils in the training data. Valid characters are all uppercase letters except ‘B’, ‘J’, ‘O’, ‘U’, ‘X’, and ‘Z’; any other character causes the function to fail.

  3. column "register": contains the register assignment of each coiled-coil sequence in the training data. Each entry must therefore have the same length as the matching amino-acid sequence in the "sequence" column. Valid characters are the lowercase letters ‘a’ to ‘g’ only. A register assignment may start with any of the seven letters and need not follow the canonical order.
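The constraints above can be checked before calling CreatePssm. The following is a minimal sketch in base R; the two sequences and their registers are made up for illustration, and the checks only mirror the documented rules, they are not the package's own validation code.

```r
# Hypothetical toy data: sequences and registers are invented for illustration.
toy <- data.frame(
  sequence = c("LKELEDKLEELLSK", "IEALKQKIAALKQE"),
  register = c("defgabcdefgabc", "abcdefgabcdefg"),
  type     = c("DIMER", "TRIMER"),
  stringsAsFactors = FALSE
)

# Valid amino-acid characters as documented: uppercase letters minus B, J, O, U, X, Z.
valid.amino <- setdiff(LETTERS, c("B", "J", "O", "U", "X", "Z"))
amino.pattern <- paste0("^[", paste(valid.amino, collapse = ""), "]+$")

seq.ok  <- grepl(amino.pattern, toy$sequence)            # only valid residues
reg.ok  <- grepl("^[a-g]+$", toy$register)               # only letters a to g
len.ok  <- nchar(toy$sequence) == nchar(toy$register)    # matching lengths
type.ok <- toy$type %in% c("DIMER", "TRIMER")            # accepted states

# Stop early with an error if any row violates a documented constraint.
stopifnot(all(seq.ok), all(reg.ok), all(len.ok), all(type.ok))
```

Running these checks first makes it easier to locate an offending row than waiting for the function itself to fail.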

var

A list of two elements containing all valid amino-acid and register characters (named "amino" and "register" in the example under Examples).

Value

Returns a profile scoring matrix derived from the supplied training data.

Author(s)

Thomas L. Vincent

References

Craig T. Armstrong, Thomas L. Vincent, Peter J. Green and Dek N. Woolfson. (2011) SCORER 2.0: an algorithm for distinguishing parallel dimeric and trimeric coiled-coil sequences. Bioinformatics. DOI: 10.1093/bioinformatics/btr299

Examples

# load training data
data(training)

# define allowed amino and register characters
var <- list(
    amino = c("A","C","D","E","F","G","H","I","K","L",
    "M","N","P","Q","R","S","T","V","W","Y","X"),
    register = letters[1:7])

# create profile scoring matrix
pssm <- CreatePssm(training, var)
      

SCORER2 documentation built on May 2, 2019, 4:06 a.m.