predictDomains: Predict protein domain families from coding transcripts

View source: R/predictDomains.R

predictDomains    R Documentation

Predict protein domain families from coding transcripts

Description

Predict protein domain families from coding transcripts

Usage

predictDomains(x, fasta, ..., plot = FALSE, progress_bar = FALSE, ncores = 4)

Arguments

x

Either a GRanges object containing 'CDS' features in GTF format, or a GRangesList object containing CDS ranges for each transcript

fasta

BSgenome or Biostrings object containing genomic sequence

...

Logical conditions to pass to dplyr::filter to subset transcripts for analysis. Variables are metadata columns found in 'x'. Multiple conditions can be provided, separated by commas. Example: transcript_id == "transcript1"

plot

Logical; whether to plot the predicted protein domains (Default: FALSE). Note: only the first 20 proteins will be plotted

progress_bar

Logical; whether to show a progress bar (Default: FALSE). Useful for tracking progress when predicting domains for a long list of proteins.

ncores

Number of cores to use for prediction (Default: 4)
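
The '...' argument is described above as a set of conditions passed to dplyr::filter. As a rough illustration of how comma-separated conditions behave there (a row is kept only if every condition holds), the sketch below filters a toy metadata table. The table, its values, and the object names are made up for illustration. It does not call predictDomains and is not meant to reflect how the function handles 'x' internally.

```r
library(dplyr)

# Hypothetical metadata table standing in for the metadata columns of 'x'
meta <- data.frame(
  transcript_id = c("transcript1", "transcript2", "transcript3"),
  gene_name     = c("Ptbp1", "Ptbp1", "OtherGene"),
  stringsAsFactors = FALSE
)

# Comma-separated conditions in dplyr::filter are combined with logical AND,
# so only rows satisfying both conditions are kept
subset_meta <- filter(meta, gene_name == "Ptbp1", transcript_id != "transcript2")

print(subset_meta)
```

By the same logic, supplying gene_name == "Ptbp1" together with a transcript_id condition to predictDomains would be expected to restrict the analysis to transcripts matching both.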

Value

Data frame containing protein features for each CDS entry

Author(s)

Fursham Hamid

Examples

## ---------------------------------------------------------------------
## EXAMPLE USING SAMPLE DATASET
## ---------------------------------------------------------------------
# Load factR and the mouse genome sequence
library(factR)
library(BSgenome.Mmusculus.UCSC.mm10)

# Load dataset
data(new_query_gtf)

# Predict domains of all CDSs in query GTF
predictDomains(new_query_gtf, Mmusculus, ncores = 1)

# Predict domains of CDSs from Ptbp1 gene
predictDomains(new_query_gtf, Mmusculus, gene_name == "Ptbp1", ncores = 1)

# Predict domains of CDSs from Ptbp1 gene and plot the domain architecture
predictDomains(new_query_gtf, Mmusculus, gene_name == "Ptbp1", plot = TRUE, ncores = 1)

fursham-h/factR documentation built on Aug. 20, 2023, 1:58 p.m.