gtf2longest | R Documentation |
This function extracts gene positions from a GTF input file and optionally extracts the longest isoform of each gene.
gtf2longest(gtffile, cds = NULL, removeNonCoding = TRUE, source = "NCBI")
gtffile |
|
cds |
|
removeNonCoding |
specify whether non-coding transcripts should be removed [default: TRUE] |
source |
source indicating either NCBI or ENSEMBL [default: NCBI] |
list
Kristian K Ullrich
XStringSet-class
## Not run:
## load example sequence data
## set NCBI GTF URL
NCBI <- "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/"
ARATHA.NCBI.gtf.url <- paste0(NCBI,
"GCF/000/001/735/GCF_000001735.4_TAIR10.1/",
"GCF_000001735.4_TAIR10.1_genomic.gtf.gz")
ARATHA.NCBI.gtf.file <- tempfile()
## download GTF file
download.file(ARATHA.NCBI.gtf.url, ARATHA.NCBI.gtf.file, quiet=FALSE)
## set NCBI CDS URL
ARATHA.NCBI.cds.url <- paste0(NCBI,
"GCF/000/001/735/GCF_000001735.4_TAIR10.1/",
"GCF_000001735.4_TAIR10.1_cds_from_genomic.fna.gz")
ARATHA.NCBI.cds.file <- tempfile()
## download CDS file
download.file(ARATHA.NCBI.cds.url, ARATHA.NCBI.cds.file, quiet=FALSE)
## load CDS
ARATHA.NCBI.cds <- Biostrings::readDNAStringSet(ARATHA.NCBI.cds.file)
## get genepos and longest isoform
ARATHA.NCBI.gtf.longest <- gtf2longest(gtffile=ARATHA.NCBI.gtf.file,
cds=ARATHA.NCBI.cds, source="NCBI")
ARATHA.NCBI.gtf.longest$genepos
ARATHA.NCBI.gtf.longest$cds
## set ENSEMBL GTF URL
ensembl <- "http://ftp.ensemblgenomes.org/pub/plants/release-52/"
ARATHA.ENSEMBL.gtf.url <- paste0(ensembl,
"gtf/arabidopsis_thaliana/Arabidopsis_thaliana.TAIR10.52.gtf.gz")
ARATHA.ENSEMBL.gtf.file <- tempfile()
## download GTF file
download.file(ARATHA.ENSEMBL.gtf.url, ARATHA.ENSEMBL.gtf.file, quiet=FALSE)
## set ENSEMBL CDS URL
ARATHA.ENSEMBL.cds.url <- paste0(ensembl,
"fasta/arabidopsis_thaliana/cds/",
"Arabidopsis_thaliana.TAIR10.cds.all.fa.gz")
ARATHA.ENSEMBL.cds.file <- tempfile()
## download CDS file
download.file(ARATHA.ENSEMBL.cds.url, ARATHA.ENSEMBL.cds.file, quiet=FALSE)
## load CDS
ARATHA.ENSEMBL.cds <- Biostrings::readDNAStringSet(ARATHA.ENSEMBL.cds.file)
## get genepos and longest isoform
ARATHA.ENSEMBL.gtf.longest <- gtf2longest(gtffile=ARATHA.ENSEMBL.gtf.file,
cds=ARATHA.ENSEMBL.cds, source="ENSEMBL")
ARATHA.ENSEMBL.gtf.longest$genepos
ARATHA.ENSEMBL.gtf.longest$cds
## End(Not run)
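The idea of keeping one longest isoform per gene can be illustrated without any downloads. The following is a minimal sketch in base R, not the implementation used by gtf2longest: the toy table, its column names (gene_id, transcript_id, cds_length) and the helper pick_longest are all hypothetical and exist only for this illustration. It also assumes that ties are resolved in favour of the transcript listed first, which gtf2longest may handle differently.

```r
## Toy transcript table; all identifiers and lengths are made up.
tx <- data.frame(
    gene_id = c("geneA", "geneA", "geneB", "geneC", "geneC", "geneC"),
    transcript_id = c("A.1", "A.2", "B.1", "C.1", "C.2", "C.3"),
    cds_length = c(900L, 1200L, 450L, 300L, 750L, 750L),
    stringsAsFactors = FALSE
)

## Hypothetical helper (not part of any package): keep one row per gene,
## the transcript with the largest CDS length. Ties go to the transcript
## that appears first in the input table.
pick_longest <- function(tx) {
    ## sort by gene, then by decreasing length, then by input order
    ord <- order(tx$gene_id, -tx$cds_length, seq_len(nrow(tx)))
    tx_sorted <- tx[ord, , drop = FALSE]
    ## the first row of each gene is now its longest transcript
    longest <- tx_sorted[!duplicated(tx_sorted$gene_id), , drop = FALSE]
    rownames(longest) <- NULL
    longest
}

longest <- pick_longest(tx)
print(longest)
```

In this toy table the selection keeps A.2 for geneA, B.1 for geneB and C.2 for geneC. With real data, the lengths would come from the CDS sequences (for example via Biostrings::width on an XStringSet) rather than from a hand-written column.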