buildPWM: Generate Position Weight Matrices (PWMs)


View source: R/generate_pwms.r

Description

Generate Position Weight Matrices (PWMs) for a list of kinases from a table of centered substrate peptide sequences. The output of this function is used for scoring PWM matches to peptides via scoreSequences().

Usage

buildPWM(kinase_table = NULL, wild_card = "_", substrate_length = 15,
  substrates_n = 10, pseudo = 0.01, remove_center = FALSE,
  verbose = FALSE)

Arguments

kinase_table

A data.frame of substrate sequences and kinase names. The data must be formatted as follows: column 1 - kinase/kinase family name/GeneID, column 2 - centered peptide sequence.
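
As an illustration of the two-column layout described above, the following minimal sketch builds a small input table in base R. The kinase names, the peptide sequences, and the column names (kinase, substrate) are made up for demonstration; only the column order is taken from the description.

```r
# Hypothetical input table: column 1 holds the kinase name, column 2 holds a
# centered 15-residue peptide. All values are invented for illustration.
kinase_table <- data.frame(
  kinase    = c("KINASE_A", "KINASE_A", "KINASE_B"),
  substrate = c("___MERSTRELCLNF", "AAAKRPSSPEKLMNQ", "GGLKRAEYTPDDEEQ"),
  stringsAsFactors = FALSE
)

# For a centered 15-mer the phosphosite sits at position 8.
center <- substr(kinase_table$substrate, 8, 8)
print(center)
```

The first sequence uses the wild card "_" to pad positions that fall before the start of the protein.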

wild_card

Character used to fill positions that fall outside of the protein after centering on the phosphosite (e.g. ___MERSTRELCLNF). Default: "_".

substrate_length

Full length of the substrate sequence. Sequences in kinase_table are trimmed automatically to this length; an error is reported if they are not long enough. Default: 15.
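
The trimming behaviour described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: trim_centered is a hypothetical helper, and it assumes that every input sequence has an odd length with the phosphosite at its middle position.

```r
# Hypothetical helper: trim centered sequences to substrate_length residues
# around the central position, or stop if any sequence is too short.
trim_centered <- function(seqs, substrate_length = 15) {
  lens <- nchar(seqs)
  if (any(lens < substrate_length)) {
    stop("sequences are shorter than substrate_length")
  }
  center <- (lens + 1) %/% 2              # middle position (odd lengths assumed)
  flank  <- (substrate_length - 1) %/% 2  # residues kept on each side
  substr(seqs, center - flank, center + flank)
}

trimmed <- trim_centered("AAAKRPSSPEKLMNQ", substrate_length = 7)
print(trimmed)
```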

substrates_n

Minimum number of sequences required to build a PWM model. Low sequence counts produce PWM models that represent the kinase poorly. Default: 10.

pseudo

Small number added to PWM values before log transformation, to prevent taking the log of zero. Default: 0.01.
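
To show why a pseudo count is needed, here is a minimal base R sketch that builds per-position residue frequencies from a few made-up 3-residue sequences and log-transforms them. The sequences, the variable names, and the use of the natural log on frequency plus pseudo count are assumptions for illustration; the exact transformation used by buildPWM is not described on this page.

```r
# Toy sequences, invented for illustration only.
seqs     <- c("ARS", "ARS", "GRT", "AKS")
chars    <- do.call(rbind, strsplit(seqs, ""))  # rows: sequences, cols: positions
residues <- sort(unique(as.vector(chars)))

# Count each residue at each position (rows: residues, cols: positions).
counts <- vapply(
  seq_len(ncol(chars)),
  function(j) as.integer(table(factor(chars[, j], levels = residues))),
  integer(length(residues))
)
rownames(counts) <- residues

freq   <- counts / length(seqs)
pseudo <- 0.01

# Without the pseudo count, residues never seen at a position give log(0) = -Inf.
log_pwm <- log(freq + pseudo)
print(round(log_pwm, 2))
```

Any residue absent from a position has a frequency of zero, so adding the pseudo count keeps every entry of the log-transformed matrix finite.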

remove_center

Remove all peptide sequences whose central amino acid matches a given character (e.g. "y"). Default: FALSE.
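
The filtering described above might look like the following base R sketch. It is an illustration only: the sequences are invented, the variable names are hypothetical, and matching the central residue case-insensitively is an assumption that may differ from what buildPWM does.

```r
# Toy centered 15-mers, invented for illustration only.
seqs <- c("___MERSTRELCLNF", "AAAKRPSSPEKLMNQ", "GGLKRAEYTPDDEEQ")

remove_center <- "y"

# Central residue of each sequence (odd lengths assumed).
mid    <- (nchar(seqs) + 1) %/% 2
center <- substr(seqs, mid, mid)

# Assumption: compare case-insensitively, then drop the matching sequences.
kept <- seqs[toupper(center) != toupper(remove_center)]
print(kept)
```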

verbose

Print progress to screen. Default: FALSE.

Value

A list containing two tables, "pwm" and "kinase". For a result stored as pwms, the PWMs are accessed with pwms$pwm, and the table of kinases and sequence counts with pwms$kinase.

Examples

## Build PWM models from phosphositeplus data with default of minimum
## of 10 substrate sequences for building a PWM model.

data(phosphositeplus_human)

## Randomly sample 1000 substrates for demonstration.
set.seed(1)
sample_pwm <- phosphositeplus_human[sample(nrow(phosphositeplus_human), 1000), ]
pwms <- buildPWM(sample_pwm)

## Data frame of models built and number of sequences used to build each
## PWM model:
head(pwms$kinase)

KinSwingR documentation built on Nov. 8, 2020, 6:30 p.m.