View source: R/predictDomains.R
predictDomains | R Documentation |
Predict protein domain families from coding transcripts
predictDomains(x, fasta, ..., plot = FALSE, progress_bar = FALSE, ncores = 4)
x |
Either a GRanges object containing 'CDS' features in GTF format, or a GRangesList object containing CDS ranges for each transcript |
fasta |
BSgenome or Biostrings object containing genomic sequence |
... |
Logical conditions to pass to dplyr::filter to subset transcripts for analysis. Variables are metadata columns found in 'x'. Multiple conditions can be provided, separated by commas. Example: transcript_id == "transcript1" |
plot |
Whether to plot the protein domains (Default: FALSE). Note: only the first 20 proteins will be plotted |
progress_bar |
Whether to show a progress bar (Default: FALSE). Useful for tracking progress when predicting domains for a long list of proteins. |
ncores |
Number of cores to use for prediction (Default: 4) |
Data frame containing protein features for each CDS entry
Fursham Hamid
## ---------------------------------------------------------------------
## EXAMPLE USING SAMPLE DATASET
## ---------------------------------------------------------------------
# Load Mouse genome sequence
library(BSgenome.Mmusculus.UCSC.mm10)
# Load dataset
data(new_query_gtf)
# predict domains of all CDSs in query GTF
predictDomains(new_query_gtf, Mmusculus, ncores = 1)
# predict domains of CDSs from Ptbp1 gene
predictDomains(new_query_gtf, Mmusculus, gene_name == "Ptbp1", ncores = 1)
# predict domains of CDSs from Ptbp1 gene and plot architecture out
predictDomains(new_query_gtf, Mmusculus, gene_name == "Ptbp1", plot = TRUE, ncores = 1)
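The `...` argument is described as being passed to dplyr::filter, so comma-separated conditions should combine the way they do in a direct dplyr::filter call, where every condition must hold (logical AND). The sketch below illustrates that behaviour on a small made-up metadata table. The table `meta` and its values are hypothetical stand-ins for the metadata columns of 'x', and the code does not call predictDomains itself.

```r
library(dplyr)

# Hypothetical metadata table standing in for the metadata columns of 'x'
meta <- data.frame(
  transcript_id = c("transcript1", "transcript2", "transcript3"),
  gene_name     = c("Ptbp1", "Ptbp1", "Actb"),
  stringsAsFactors = FALSE
)

# A single condition keeps every row that satisfies it
one <- dplyr::filter(meta, gene_name == "Ptbp1")

# Comma-separated conditions are combined with AND by dplyr::filter
both <- dplyr::filter(meta, gene_name == "Ptbp1", transcript_id == "transcript1")

print(one)
print(both)
```

Assuming predictDomains forwards its conditions unchanged, the equivalent call would list the same conditions after the `fasta` argument, for example gene_name == "Ptbp1", transcript_id == "transcript1".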