View source: R/generate_pwms.r
Generates position weight matrices (PWMs) from a table of centered substrate peptide sequences for a list of kinases. The output of this function is used to score PWM matches to peptides via scoreSequences().
kinase_table | A data.frame of substrate sequences and kinase names. The data must be formatted as follows: column 1 - kinase/kinase family name/GeneID, column 2 - centered peptide sequence.
wild_card | Letter used to describe positions that fall outside of the protein after centering on the phosphosite (e.g. ___MERSTRELCLNF). Default: "_"
substrate_length | Full length of the substrate sequence. Sequences are trimmed automatically, or an error is reported if sequences in kinase_table are not long enough. Default: 15
substrates_n | Number of sequences used to build a PWM model. Low sequence counts will produce poorly representative PWM models. Default: 10
pseudo | Small number added to values before the PWM log transformation, to prevent taking the log of zero. Default: 0.01
remove_center | Remove all peptide sequences whose central amino acid matches a given character (e.g. "y"). Default: FALSE
verbose | Print progress to screen. Default: FALSE
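
To make the roles of wild_card, substrate_length and pseudo more concrete, the sketch below builds a log-transformed position matrix in base R from three hypothetical centered 15-mers. It is only an illustration of the general idea (per-position frequencies plus a pseudocount, then a log transform); the peptide sequences, the variable names and the choice of log2 are assumptions, and buildPWM() may compute its matrices differently.

```r
# Hypothetical centered peptides for a single kinase (not real data).
# "_" marks positions that fall outside the protein after centering.
seqs <- c("___MERSTRELCLNF",
          "AKRRRLSSLRASTSK",
          "GSRPRSCSMPLGRVA")

substrate_length <- 15   # documented default
pseudo <- 0.01           # documented default
wild_card <- "_"         # documented default

# Every sequence must have the full substrate length.
stopifnot(all(nchar(seqs) == substrate_length))

# One row per sequence, one column per position.
chars <- do.call(rbind, strsplit(seqs, ""))

# Assumed alphabet: the 20 standard amino acids plus the wild card.
alphabet <- c(strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]], wild_card)

# Count each symbol at each position (rows = symbols, columns = positions).
counts <- vapply(
  seq_len(substrate_length),
  function(p) tabulate(match(chars[, p], alphabet), nbins = length(alphabet)),
  integer(length(alphabet))
)
rownames(counts) <- alphabet

# Per-position frequencies; each column sums to 1.
freqs <- counts / nrow(chars)

# Adding the pseudocount keeps residues that were never observed finite
# after the log transform (log2 is an assumption made for this sketch).
log_pwm <- log2(freqs + pseudo)

# Frequencies at the central position (position 8 of 15).
print(freqs[c("S", "T", "Y"), 8])
```

Without the pseudocount, every residue that never occurs at a position would give log2(0) = -Inf, which is why a small positive value is added first.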
Output is a list containing two tables, "pwm" and "kinase". Access the PWMs with pwms$pwm and the table of kinases and sequence counts with pwms$kinase.
## Build PWM models from phosphositeplus data with default of minimum
## of 10 substrate sequences for building a PWM model.
data(phosphositeplus_human)
## Randomly sample 1000 substrates for demonstration.
set.seed(1)
sample_pwm <- phosphositeplus_human[sample(nrow(phosphositeplus_human),
                                           1000), ]
pwms <- buildPWM(sample_pwm)
## Data frame of models built and number of sequences used to build each
## PWM model:
head(pwms$kinase)
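
Because substrates_n sets the number of sequences needed for a model, it can be useful to tally substrates per kinase before calling buildPWM(). The base R sketch below does this on a hypothetical toy table in the documented two-column layout; the kinase names, the sequences and the lowered threshold are assumptions for illustration, and the counts reported in pwms$kinase may differ if the function applies its own filtering.

```r
# Hypothetical toy table: column 1 = kinase name, column 2 = centered peptide.
kinase_table <- data.frame(
  kinase = c("KIN_A", "KIN_A", "KIN_A", "KIN_B"),
  substrate = c("___MERSTRELCLNF", "AKRRRLSSLRASTSK",
                "GSRPRSCSMPLGRVA", "EDNEYTAYEGAKFPI"),
  stringsAsFactors = FALSE
)

# Lowered from the documented default of 10 so the toy data has a match.
substrates_n <- 3

# Number of substrate sequences listed for each kinase.
n_per_kinase <- table(kinase_table[[1]])

# Kinases that meet the minimum sequence count.
eligible <- names(n_per_kinase)[as.vector(n_per_kinase) >= substrates_n]

print(n_per_kinase)
print(eligible)
```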