scoreSequences: Score substrate sequences for matches to kinase Position...
In awaardenberg/KinSwingR: KinSwingR: network-based kinase activity prediction


Scores each input sequence for a match against all PWMs provided from buildPWM() and generates p-values for scores. The output of this function is to be used for building the swing metric, the predicted activity of kinases.
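As a rough illustration of the idea (not KinSwingR's internal code), a peptide can be scored against a PWM by summing the matrix entry for the residue observed at each position, and an empirical p-value can be taken as the fraction of background peptides scoring at least as high. The sketch below uses base R only. The reduced alphabet, `toy_pwm`, `score_peptide` and the uniformly sampled background are all assumptions made for this example; the package's own scoring and background generation may differ.

```r
# Toy illustration only: every name here is hypothetical, not part of KinSwingR.
set.seed(1)
aa <- c("A", "R", "S", "T", "P")   # reduced alphabet to keep the example small
width <- 5                         # centered window of 5 residues

# Made-up log-odds style PWM: rows are residues, columns are positions
toy_pwm <- matrix(rnorm(length(aa) * width), nrow = length(aa),
                  dimnames = list(aa, NULL))

# Score = sum of the PWM entries for the residue seen at each position
score_peptide <- function(peptide, pwm) {
  residues <- strsplit(peptide, "")[[1]]
  stopifnot(length(residues) == ncol(pwm))
  sum(pwm[cbind(match(residues, rownames(pwm)), seq_along(residues))])
}

# Assumed background: n peptides with uniformly sampled residues
n <- 1000
background <- replicate(n, paste(sample(aa, width, replace = TRUE),
                                 collapse = ""))
bg_scores <- vapply(background, score_peptide, numeric(1), pwm = toy_pwm)

# Empirical p-value with a +1 correction so it can never be exactly zero
observed <- score_peptide("RRSTP", toy_pwm)
p_value <- (sum(bg_scores >= observed) + 1) / (n + 1)
```

A larger `n` gives a finer-grained p-value at the cost of more scoring work, which is the same trade-off the `n` argument of `scoreSequences()` controls.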


scoreSequences(input_data = NULL, pwm_in = NULL,
  background = "random", n = 1000, force_trim = FALSE,
  verbose = FALSE)

`input_data`	A data.frame of phosphopeptide data. Must contain 4 columns in the following order: Column 1 - Annotation, Column 2 - centered peptide sequence, Column 3 - Fold Change [-ve to +ve], Column 4 - p-value [0-1]
`pwm_in`	List of PWMs created using buildPWM()
`background`	Either a data.frame of peptides to use as background, or "random" to generate a random background for PWM scoring from the input list. A background table must contain two columns: Column 1 - Annotation, Column 2 - centered peptide sequence. The sequences must be centered. Default: "random"
`n`	Number of permutations to perform for generating the background. Default: 1000
`force_trim`	If TRUE, the function detects peptide sequences whose length differs from the PWM models (provided in pwm_in) and trims the input sequences to the same length as the PWM models. If a background is provided, it is also trimmed to the same width as the PWM models. Options are TRUE or FALSE. Default: FALSE
`verbose`	Turn verbosity on or off. To turn on, set verbose = TRUE. Options are TRUE or FALSE. Default: FALSE
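The table layouts described above can be sketched with a few made-up rows. Everything below is hypothetical: the annotation strings, peptide sequences, column names and the `trim_centered` helper are assumptions for illustration, and the helper only shows what trimming a centered sequence to a narrower width could look like, not how `force_trim` is implemented in the package.

```r
# Hypothetical phosphopeptide table in the four-column order described above
input_data <- data.frame(
  annotation = c("GENE1|S10", "GENE2|T55", "GENE3|Y101"),       # made-up labels
  peptide    = c("AAARRSPTLLL", "GGKRTTPEDVV", "LLDEEYVNPQQ"),  # centered, width 11
  fc         = c(1.8, -0.7, 0.2),      # fold change, negative to positive
  pval       = c(0.001, 0.04, 0.6),    # p-value between 0 and 1
  stringsAsFactors = FALSE
)

# Hypothetical two-column background table: annotation, centered peptide
background <- data.frame(
  annotation = c("BG1", "BG2"),
  peptide    = c("PPLKKSAEEGG", "TTVNDTQRRAA"),
  stringsAsFactors = FALSE
)

# Assumed helper: keep the central `width` residues of each centered sequence
trim_centered <- function(seqs, width) {
  len <- nchar(seqs)
  # sequences must be long enough and leave equal flanks on both sides
  stopifnot(all(len >= width), all((len - width) %% 2 == 0))
  start <- (len - width) %/% 2 + 1
  substr(seqs, start, start + width - 1)
}

# Trim width-11 peptides to width 7; the central residue stays in the middle
trimmed <- trim_centered(input_data$peptide, 7)
```

Trimming symmetrically around the center keeps the phosphorylated residue aligned with the central column of the PWM, which is why the sequences are required to be centered.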

A list with the following elements: 1) PWM-substrate scores: substrate_scores$peptide_scores; 2) PWM-substrate p-values: substrate_scores$peptide_p; 3) the background used, returned for reproducibility: substrate_scores$background; 4) input_data, returned in the case that it was trimmed.

## import data
data(example_phosphoproteome)
data(phosphositeplus_human)

## clean up the annotations
## sample 100 data points for demonstration
sample_data <- head(example_phosphoproteome, 100)
annotated_data <- cleanAnnotation(input_data = sample_data)

## build the PWM models:
set.seed(1234)
sample_pwm <- phosphositeplus_human[sample(nrow(phosphositeplus_human),
                                           1000), ]
pwms <- buildPWM(sample_pwm)

## score the PWM - substrate matches
## Using a "random" background, to calculate the p-value of the matches
## Using n=10 for demonstration
## set.seed for reproducibility
set.seed(1234)
substrate_scores <- scoreSequences(input_data = annotated_data,
                                   pwm_in = pwms,
                                   background = "random",
                                   n = 10)