build_fastas: Create a FASTA file with sequence fragments
In casblaauw/phosphocie: Create Continuous Phosphosite Colour Representations

build_fastas

R Documentation

Create a FASTA file with sequence fragments

Description

From (potential) sites and their surrounding amino acids, create a FASTA-conforming file. Requires a column of unique header values for each site.
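Since headers must be unique per site, it can help to check the intended name column before building the file. The following is a minimal base R sketch on made-up data; the `sites` data frame and its values are hypothetical, and only the column name `unique_id` is borrowed from the examples on this page.

```r
# Toy data (made up); `unique_id` mirrors the column name used in the examples.
sites <- data.frame(
  unique_id = c("siteA", "siteB", "siteA"),
  stringsAsFactors = FALSE
)

# Collect the values that occur more than once.
dup_ids <- unique(sites$unique_id[duplicated(sites$unique_id)])

# TRUE only if every site has its own header value.
ids_are_unique <- length(dup_ids) == 0
```

If `ids_are_unique` is FALSE, a custom `header_pattern` that combines several columns is one way to obtain distinct headers.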

Usage

build_fastas(
  data,
  path,
  name_col,
  seq_col,
  header_pattern = "{.data[[name_col]]}|{get_middle_fragment(.data[[seq_col]], 7)}"
)

Arguments

`data`	Data frame in long format, containing unique names and sequence windows.
`path`	Path to write the file to, including file name and extension.
`name_col`	Name of the column that contains metadata about the sequence. Used as part of `header_pattern` by default. If missing, `header_pattern` alone is used to construct unique IDs.
`seq_col`	Name of the column that contains the sequence windows.
`header_pattern`	Pattern passed to `glue::glue()` to construct a new unique header column from existing columns. The default combines `name_col` with a short amino-acid window around the site, taken with `get_middle_fragment()`, so that `read_netphorest()` can identify each site uniquely. Any whitespace is replaced by underscores. If set to NULL, the header is just the value of `name_col`.
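To illustrate how a header of the form name, pipe, fragment ends up in a FASTA file, here is a rough base R sketch. It is not the package's implementation: `middle_chars()` is a hypothetical stand-in for `get_middle_fragment()`, whose exact behaviour is assumed here to be "keep the central n characters", and the toy sequences are made up.

```r
# Hypothetical stand-in for get_middle_fragment(): keeps the central `n`
# characters of each string (assumed behaviour, not taken from the package).
middle_chars <- function(x, n) {
  start <- (nchar(x) - n) %/% 2 + 1
  substr(x, start, start + n - 1)
}

# Toy data; column names mirror the documented example, values are made up.
sites <- data.frame(
  unique_id   = c("P12345 S10", "Q67890 T25"),
  fragment_15 = c("AAAAAAASAAAAAAA", "GGGGGGGTGGGGGGG"),
  stringsAsFactors = FALSE
)

# Build "name|fragment" headers, then replace whitespace with underscores.
headers <- paste0(sites$unique_id, "|", middle_chars(sites$fragment_15, 7))
headers <- gsub("\\s+", "_", headers)

# A FASTA record is a ">" header line followed by the sequence line.
# rbind() interleaves headers and sequences when flattened column by column.
fasta_lines <- as.vector(rbind(paste0(">", headers), sites$fragment_15))

out <- tempfile(fileext = ".fasta")
writeLines(fasta_lines, out)
```

In `build_fastas()` itself the header is built with a `glue::glue()` pattern rather than `paste0()`; the sketch only shows the shape of the resulting records.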

Value

Invisibly returns the input data, with the new headers added as `fasta_id`.

Examples

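# Load the example kinase-substrate dataset shipped with the package
kinsub_path <- system.file('extdata', 'kinase_substrate_dataset_head', package = 'phosphocie')
kinsub <- read_kinsub(kinsub_path)
tmp <- tempfile()

# Default header pattern: name_col plus the middle fragment of seq_col
build_fastas(kinsub, tmp, name_col = 'unique_id', seq_col = 'fragment_15')

# Custom header pattern built from several columns; name_col is omitted
build_fastas(kinsub, tmp, seq_col = 'fragment_15', header_pattern = '{acc_id}|{gene}|{substrate}|{residue}{position}')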

casblaauw/phosphocie documentation built on March 30, 2022, 8:28 p.m.