run_phobius: Create a data frame with the membrane locations of each amino...

View source: R/run_phobius.R

run_phobius    R Documentation

Create a data frame with the membrane locations of each amino acid in a protein using Phobius

Description

This function creates a data frame whose columns contain transcript IDs and the corresponding Phobius output. The Phobius output assigns a location to each amino acid: O for extracellular, M for transmembrane, S for signal peptide, and i for intracellular.
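To make the per-residue encoding concrete, the sketch below collapses a topology string into contiguous regions using base R only. The string `topo_string` is a made-up example, not real Phobius output, and the column names of `regions` are illustrative assumptions rather than anything produced by run_phobius.

```r
# Hypothetical topology string: one character per amino acid,
# using the letter codes described above (S, O, M, i).
topo_string <- "SSSSOOOOMMMMMiiii"

# Split the string into single characters.
residues <- strsplit(topo_string, "")[[1]]

# Run-length encode to find stretches of identical locations.
runs <- rle(residues)

# Convert run lengths into 1-based start and end positions.
ends <- cumsum(runs$lengths)
regions <- data.frame(
  location = runs$values,
  start = ends - runs$lengths + 1,
  end = ends,
  stringsAsFactors = FALSE
)

print(regions)
```

Each row of `regions` describes one stretch of residues sharing a location code, which is often easier to inspect than the raw string.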

Usage

run_phobius(AA_seq, fasta_file_name)

Arguments

AA_seq

A data frame returned by the get_pairs function containing the gene names, transcript IDs, APPRIS annotations, and protein sequences for each transcript.

fasta_file_name

Path to the FASTA file containing the amino acid sequences.

Value

A data frame containing each transcript ID and, formatted as a string, the corresponding membrane location of each amino acid in its sequence.

Note

For this function to work, a FASTA file named "AA.fasta" containing the amino acid sequences for each transcript must be saved in the working directory. Additionally, the function saves a copy of the returned data frame in CSV format to the working directory.
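Because the function depends on "AA.fasta" being present, it can help to check for the file before calling it. The following is a minimal sketch; `has_aa_fasta` is a hypothetical helper written for illustration and is not part of surfaltr.

```r
# Hypothetical helper (not part of surfaltr): returns TRUE when a file
# named "AA.fasta" exists in the given directory.
has_aa_fasta <- function(dir = getwd()) {
  file.exists(file.path(dir, "AA.fasta"))
}

# Report whether the current working directory is ready for run_phobius.
if (has_aa_fasta()) {
  message("AA.fasta found in ", getwd())
} else {
  message("AA.fasta is missing from ", getwd())
}
```

A check like this gives a clear message up front when the input file is absent.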

Examples

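# Folder containing the TMHMM installation; adjust to your system.
tmhmm_folder_name <- "~/TMHMM2.0c"
# Run the example only if the TMHMM installation check passes.
if (check_tmhmm_install(tmhmm_folder_name)) {
    # Remember the working directory so it can be restored afterwards.
    currwd <- getwd()
    AA_seq <- get_pairs(system.file("extdata", "crb1_example.csv",
        package = "surfaltr"
    ), TRUE, "mouse", TRUE)
    # Build the path to AA.fasta in the working directory.
    topo <- run_phobius(AA_seq, file.path(getwd(), "AA.fasta"))
    setwd(currwd)
}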

EliLillyCo/surfaltr documentation built on May 3, 2022, 10:12 a.m.