locsFromgff: Get genomic gene locations from the gff file


Description

Get the genomic locations of genes.

Usage

GetLocsfgff(gffRawMat, genePrefix = character(0))

GetLocsfKEGGSpe(KEGGSpe)

GetLocsTag(annoStr)

ExtractLocs(gffRawMat)

download.SpeAnno(KEGGSpe, pattern, saveFolder)

Arguments

gffRawMat

raw gff matrix.

genePrefix

Prefix added to the locus gene names.

KEGGSpe

A KEGG species ID

annoStr

Character strings of gff annotation, in which fields are separated by ';'.

pattern

A character string, either "gff" or "feature_table".

saveFolder

A folder in which to save the gff and md5sum check files. If the folder does not exist, it is created first.
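
As an illustration of the annoStr format, a ';'-separated annotation string can be split into key/value pairs with base R. This is a minimal sketch, not the package's GetLocsTag() implementation: ParseAnnoStr is a hypothetical helper, and the annotation string and locus tags are made up.

```r
## Hypothetical helper (not part of ProGenome): split a gff annotation
## string into a named character vector of key/value pairs.
ParseAnnoStr <- function(annoStr) {
  ## fields are separated by ';'
  fields <- strsplit(annoStr, ';', fixed = TRUE)[[1]]
  fields <- fields[nzchar(fields)]
  ## split each field at its first '=' only
  keys <- sub('=.*$', '', fields)
  vals <- sub('^[^=]*=', '', fields)
  names(vals) <- keys
  vals
}

## made-up annotation string for demonstration
annoStr <- 'ID=gene0;Name=geneA;locus_tag=ABC_0001;old_locus_tag=ABC0001'
tags <- ParseAnnoStr(annoStr)
tags[c('locus_tag', 'old_locus_tag')]
```

Indexing the result by name keeps the lookup independent of the order in which the fields appear in the string.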

Details

GetLocsfgff(): get genomic gene location information from a gff matrix. The genes of the whole genome are divided by chromosome and by plasmid (if the organism has any).

GetLocsfKEGGSpe(): get the gene loci of a KEGG species from NCBI gff files. It tries the RefSeq database first; if no RefSeq record is found, it falls back to GenBank. The locus name prefix is the KEGG genome abbreviation.

ExtractLocs(): extract gene location from gff raw file.

download.SpeAnno(): download gff/feature_table and md5sum check files. If the md5sum check fails, download the files again until it passes.
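
The download-and-verify behaviour described for download.SpeAnno() can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's code: FetchWithMd5, fetchFun, and maxTry are hypothetical names, the retry count is capped here (the text above describes retrying until the check passes), and the "download" is a local file copy so that the sketch runs offline.

```r
## Hypothetical helper (not ProGenome's code): fetch a file with 'fetchFun',
## compare its md5 checksum with 'expectedMd5', and retry on mismatch.
FetchWithMd5 <- function(fetchFun, destFile, expectedMd5, maxTry = 3) {
  for (i in seq_len(maxTry)) {
    fetchFun(destFile)
    gotMd5 <- unname(tools::md5sum(destFile))
    if (!is.na(gotMd5) && identical(gotMd5, expectedMd5)) {
      ## checksum matches: report how many tries were needed
      return(invisible(i))
    }
  }
  stop('md5sum check failed after ', maxTry, ' tries')
}

## offline demonstration: the "remote" file is a local temporary file
srcFile <- tempfile(fileext = '.gff')
writeLines('##gff-version 3', srcFile)
expectedMd5 <- unname(tools::md5sum(srcFile))

## create the save folder first if it does not exist
saveFolder <- file.path(tempdir(), 'tmpAnno')
if (!dir.exists(saveFolder)) {
  dir.create(saveFolder, recursive = TRUE)
}
destFile <- file.path(saveFolder, 'demo.gff')

nTry <- FetchWithMd5(
  function(dest) file.copy(srcFile, dest, overwrite = TRUE),
  destFile, expectedMd5
)
```

In real use, fetchFun would wrap a call such as utils::download.file(), and expectedMd5 would be read from the downloaded md5sum check file.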

Value

GetLocsfgff(): a list of genomes containing gene location information. The locus_tags are used as the gene names.

GetLocsfKEGGSpe(): a list of genomes containing gene location information. The locus_tags are used as the gene names.

GetLocsTag(): a matrix. Locus_tags and old_locus_tags are also returned if provided.

ExtractLocs(): a 4- or 5-column matrix. The 1st column (or the 1st and 2nd columns) holds the locus_tags; the last three columns are the start position, end position, and DNA strand.

download.SpeAnno(): download the gff, feature_table, or md5sum files.
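
To make the return shapes above concrete, the following base R sketch builds a 4-column location matrix from a few made-up gff rows and then divides it by replicon. It does not reproduce ExtractLocs() or GetLocsfgff(); the rows, sequence IDs, and locus tags are invented, and the standard nine-column GFF3 layout is assumed.

```r
## made-up gff rows in the standard nine-column GFF3 layout
gff <- data.frame(
  seqid = c('chr1', 'chr1', 'plasmid1'),
  source = 'demo',
  type = 'gene',
  start = c(100L, 900L, 50L),
  end = c(800L, 1500L, 400L),
  score = '.',
  strand = c('+', '-', '+'),
  phase = '.',
  attributes = c('ID=gene0;locus_tag=ABC_0001',
                 'ID=gene1;locus_tag=ABC_0002',
                 'ID=gene2;locus_tag=ABC_P001'),
  stringsAsFactors = FALSE
)

## pull the locus_tag value out of the annotation column
locusTag <- sub('^(.*;)?locus_tag=([^;]*).*$', '\\2', gff$attributes)

## 4-column character matrix: locus_tag, start, end, strand
locsMat <- cbind(locusTag, gff$start, gff$end, gff$strand)
colnames(locsMat) <- c('locus_tag', 'start', 'end', 'strand')

## divide the rows by replicon (chromosome or plasmid)
locsList <- lapply(split(seq_len(nrow(gff)), gff$seqid),
                   function(idx) locsMat[idx, , drop = FALSE])
names(locsList)
```

Using drop = FALSE keeps each list element a matrix even when a replicon carries a single gene.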

Author(s)

Yulong Niu niuylscu@gmail.com


See Also

read.gff: read in gff files.

Examples

## read in the dra (Deinococcus radiodurans R1) gff gz file from local disk
gzPath <- system.file("extdata", "dra.gff.gz", package = "ProGenome")
dragff <- read.gff(gzPath, isurl = FALSE, isgz = TRUE)

## only extract whole locus without genome division
locsRawMat <- ExtractLocs(dragff)

## whole locus divided by genomes and plasmids
locusList <- GetLocsfgff(dragff, genePrefix = 'dra:')

## get dra genomic locus through FTP URL
draLocs <- GetLocsfKEGGSpe('dra')

## two locus names for aac (Alicyclobacillus acidocaldarius subsp. acidocaldarius DSM 446)
gzPath <- system.file("extdata", "aac.gff.gz", package = "ProGenome")
aacgff <- read.gff(gzPath, isurl = FALSE, isgz = TRUE)
locusList <- GetLocsfgff(aacgff, genePrefix = 'aac:')

## Not run: 
hxaLocs <- GetLocsfKEGGSpe('hxa')
csuLocs <- GetLocsfKEGGSpe('csu')
## End(Not run)
## Not run: 
download.SpeAnno('eco', 'gff', 'tmpEco')

## End(Not run)

YulongNiu/ProGenome documentation built on May 10, 2019, 1:13 a.m.