bwaIndex | R Documentation |
This function executes the docker container bwa1, in which BWA is installed. Optionally, the index can also be created for the genome fasta file from the GATK bundle data.
bwaIndex(
group = c("sudo", "docker"),
genome.folder = getwd(),
genome.url = NULL,
gtf.url = NULL,
dbsnp.file = NULL,
g1000.file = NULL,
mode = c("General", "GATK", "miRNA", "ncRNA"),
mb.url.haripin,
mb.url.mature,
mb.species = NULL,
rc.version = NULL,
rc.species = NULL,
length = NULL
)
group |
a character string. Two options: "sudo" or "docker" |
genome.folder |
a character string indicating the folder where the indexed reference genome for bwa will be located |
genome.url |
a character string indicating the URL from the download web page for the genome sequence of interest |
gtf.url |
a character string indicating the URL from the ENSEMBL ftp for the GTF of the genome of interest |
dbsnp.file |
a character string indicating the name of the dbSNP vcf located in the genome folder. The dbSNP vcf, dbsnp_138.b37.vcf.gz and dbsnp_138.hg19.vcf.idx.gz, can be downloaded from ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37 |
g1000.file |
a character string indicating the name of the 1000 Genomes vcf located in the genome folder. The 1000 Genomes vcf, Mills_and_1000G_gold_standard.indels.b37.vcf.gz and Mills_and_1000G_gold_standard.indels.hg19.sites.vcf.idx.gz, can be downloaded from ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/ |
mode |
a character string indicating the required type of analysis. Compatible analysis modes are "General", "GATK", "miRNA", and "ncRNA". In "General" mode, the URL of any online fasta file ("genome.url" argument) can be provided and indexed; only canonical chromosomes are kept (see id.fa after the end of indexing). In "GATK" analysis mode, the lists of variants from dbSNP ("dbsnp.file" argument) and 1000 Genomes ("g1000.file" argument) are required in addition to the URL of the genome fasta ("genome.url" argument). In "miRNA" analysis mode, the links to the miRBase hairpin and mature sequences ("mb.url.haripin" and "mb.url.mature" arguments) and the miRBase species prefix ("mb.species" argument) are required. In "ncRNA" analysis mode, the version ("rc.version" argument) and species ("rc.species" argument) of RNA Central are required. This mode also requires a desired maximum length for the studied RNA annotations ("length" argument). |
mb.url.haripin |
a character string indicating the link to the hairpin miRNA sequences from the miRBase database. Visit http://www.mirbase.org to select the proper version number. |
mb.url.mature |
a character string indicating the link to the mature miRNA sequences from the miRBase database. Visit http://www.mirbase.org to select the proper version number. |
mb.species |
a character string indicating the name of a species annotated in miRBase (e.g. "hsa" for human miRNAs). Please refer to http://www.mirbase.org/help/genome_summary.shtml for the proper species name. |
rc.version |
a character string indicating the required version of the RNA Central database. Visit ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/releases/ to select the proper version number. |
rc.species |
a character string indicating the name of a species annotated in RNA Central (e.g. "Homo sapiens" for human ncRNAs). Please refer to the NCBI taxonomy annotations at https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi for the proper species name. |
length |
an integer corresponding to the length threshold selected to define the ncRNA reference from RNA Central. |
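Each mode needs a different subset of the arguments listed above, so it can help to check a planned call before launching the container. The following is a minimal sketch and not part of the package: `required_args_for_mode` and `missing_args` are hypothetical helpers, and the mapping is an assumption read from the argument descriptions and the usage signature.

```r
# Hypothetical helper (not part of the package): arguments each mode appears
# to need, based on the argument descriptions and the usage signature.
required_args_for_mode <- function(mode = c("General", "GATK", "miRNA", "ncRNA")) {
  mode <- match.arg(mode)
  switch(mode,
    General = c("genome.url"),
    GATK    = c("genome.url", "dbsnp.file", "g1000.file"),
    miRNA   = c("mb.url.haripin", "mb.url.mature", "mb.species"),
    ncRNA   = c("rc.version", "rc.species", "length")
  )
}

# Return the names of required arguments that are absent or NULL in `call.args`.
missing_args <- function(mode, call.args) {
  needed <- required_args_for_mode(mode)
  is.missing <- vapply(needed, function(n) is.null(call.args[[n]]), logical(1))
  needed[is.missing]
}

# Example: an ncRNA call that forgot the length threshold.
call.args <- list(genome.folder = "/data/genomes/hg19_bwa",
                  rc.version = "9", rc.species = "Homo sapiens")
missing_args("ncRNA", call.args)
```

For the argument list above the helper returns "length", the one ncRNA requirement that was not supplied.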
The indexed bwa reference sequence
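A quick way to confirm that indexing finished is to look for the files that `bwa index` normally writes next to the fasta (.amb, .ann, .bwt, .pac and .sa). The sketch below is only illustrative: `bwa_index_files_present` is a hypothetical helper, and the fasta file name "genome.fa" is an assumption, so substitute the fasta actually present in genome.folder.

```r
# Hypothetical helper (not part of the package): report which of the standard
# `bwa index` output files exist for a given fasta path.
bwa_index_files_present <- function(fasta.path) {
  extensions <- c("amb", "ann", "bwt", "pac", "sa")
  index.files <- paste(fasta.path, extensions, sep = ".")
  present <- file.exists(index.files)
  names(present) <- extensions
  present
}

# "genome.fa" is an assumed file name, used here only for illustration.
status <- bwa_index_files_present(file.path("/data/genomes/hg19_bwa", "genome.fa"))
all(status)  # TRUE only if every expected index file is found
```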
Giulio Ferrero
## Not run:
#running generic bwa index
bwaIndex(group="docker", genome.folder="/data/genomes/mm10bwa", genome.url="ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/005/845/GCF_000005845.2_ASM584v2/GCF_000005845.2_ASM584v2_genomic.fna.gz", mode="General")
#running bwa index for gatk
bwaIndex(group="docker", genome.folder="/data/genomes", genome.url="http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz", dbsnp.file="dbsnp_138.hg19.vcf.gz", g1000.file="Mills_and_1000G_gold_standard.indels.hg19.sites.vcf.gz", mode="GATK")
#running bwa index for miRNA analysis
bwaIndex(group="docker", genome.folder="/data/genomes", mb.url.mature="https://www.mirbase.org/download/mature.fa", mb.url.haripin="https://www.mirbase.org/download/hairpin.fa", mb.species="hsa", mode="miRNA")
#running bwa index for ncRNA analysis
bwaIndex(group="docker", genome.folder="/data/genomes/hg19_bwa", rc.version="9", rc.species="Homo sapiens", length=80, mode="ncRNA")
## End(Not run)