import_substrates: Import and annotate enzyme substrate preference files

View source: R/data_import.R

Description

This function imports enzymatic preference data sheets and creates an annotated preference data.table by parsing both the file names and the column data.

The phosphoproteomics file name should follow this format: 'KINASE_CONDITION_REPLICATE_*.csv'.

1. 'KINASE': The abbreviation for the kinase used in the enzymatic reaction.
2. 'CONDITION': Accepts two values, 'PLUS' indicates an enzyme-treated sample and 'MINUS' indicates an untreated (i.e. negative control) sample.
3. 'REPLICATE': Should follow the format of 'R*' where '*' is a wildcard representing the replicate number (numeric).

Any additional annotation can be added with an '_' following the replicate number and will be ignored for file processing.
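The naming convention above can be illustrated with a minimal base R sketch. The helper name parse_substrate_filename is hypothetical and is not part of KINATESTID; it only shows how the three fields could be recovered from a file name, assuming the condition and replicate tokens are matched exactly as written above.

```r
# Hypothetical helper (not part of KINATESTID): split a file name of the form
# KINASE_CONDITION_REPLICATE_*.csv into its three annotated fields.
parse_substrate_filename <- function(file_name) {
  # Drop any directory component and the .csv extension.
  stem <- sub("\\.csv$", "", basename(file_name), ignore.case = TRUE)
  fields <- strsplit(stem, "_", fixed = TRUE)[[1]]

  if (length(fields) < 3) {
    stop("Expected KINASE_CONDITION_REPLICATE in: ", file_name)
  }
  # Assumption: CONDITION is matched case-sensitively.
  if (!fields[2] %in% c("PLUS", "MINUS")) {
    stop("CONDITION must be 'PLUS' or 'MINUS' in: ", file_name)
  }
  if (!grepl("^R[0-9]+$", fields[3])) {
    stop("REPLICATE must look like 'R1', 'R2', ... in: ", file_name)
  }

  # Anything after the third field is extra annotation and is ignored.
  list(
    kinase    = fields[1],
    condition = fields[2],
    replicate = as.integer(sub("^R", "", fields[3]))
  )
}

parse_substrate_filename("ABL_PLUS_R1_run2.csv")
```

Here 'ABL' is only a placeholder kinase abbreviation, and the trailing 'run2' stands in for the optional annotation that is discarded.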

**Important**: Legacy file names must *also* contain 'Substrate' for substrate files and 'SBF' or 'FREQ' for substrate background frequency files.
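As a rough illustration of the legacy naming rule, the sketch below sorts a file name into one of the two legacy file kinds. The helper classify_legacy_file is hypothetical and not part of KINATESTID; case-sensitive matching and checking the background frequency tags first are assumptions made for this example only.

```r
# Hypothetical helper (not part of KINATESTID): decide which kind of legacy
# file a name refers to, based on the tags described above.
classify_legacy_file <- function(file_name) {
  name <- basename(file_name)
  if (grepl("SBF|FREQ", name)) {
    # Assumption: background frequency tags take precedence.
    "background_frequency"
  } else if (grepl("Substrate", name, fixed = TRUE)) {
    "substrate"
  } else {
    # Neither tag found: not recognisable as a legacy file.
    NA_character_
  }
}

classify_legacy_file("ABL_PLUS_R1_Substrate.csv")
classify_legacy_file("ABL_PLUS_R1_SBF.csv")
```

The file names used here are made up for illustration.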

Usage

import_substrates(
  path,
  ptm_type = c("Oxidation (M)-M", "Carbamidomethylation-C", "Deamidation (NQ)-N",
    "Deamidation (NQ)-Q", "Phosphorylation (STY)-Y", "Phosphorylation (STY)-T",
    "Phosphorylation (STY)-S", "Acetylation (Protein N-term)-E",
    "Acetylation (Protein N-term)-D", "Acetylation (Protein N-term)-M",
    "Acetylation (Protein N-term)-S", "Acetylation (Protein N-term)-A",
    "Acetylation (Protein N-term)-T", "Acetylation (Protein N-term)-G",
    "Acetylation (Protein N-term)-C", "Acetylation (Protein N-term)-V",
    "Pyro-glu from Q-Q"),
  legacy = FALSE,
  freq = FALSE,
  ref_col = NULL
)

Arguments

path

The directory containing the analyzed phosphoproteomics data

ptm_type

The post-translational modification (PTM) to be analyzed and its targeted amino acid. Currently, only one PTM type can be analyzed at a time. A list of PTM options can be accessed by typing 'ptm_key'.

legacy

A logical parameter where TRUE indicates files generated using the Galaxy-P KinaMine workflow. Defaults to FALSE.

freq

A logical parameter where TRUE indicates that substrate frequency across technical replicates should be recorded. Defaults to FALSE.

ref_col

Only applicable to legacy files. A string with the name of the column containing the UniProt identifier information. If none is specified, the function searches for a Reference column automatically.

Examples

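    # Directory of the files installed under 'extdata' in the package
    path <- system.file("extdata", package = "KINATESTID")
    # Import tyrosine phosphorylation substrates from non-legacy files,
    # recording substrate frequency across technical replicates
    substrates <- import_substrates(path, ptm_type = "Phosphorylation (STY)-Y",
                                    legacy = FALSE, freq = TRUE)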

edpratt1/KINATESTID documentation built on Feb. 5, 2022, 1:21 p.m.