fastqDump: Download or convert fastq data from NCBI Sequence Read...


View source: R/sratoolkit_functions.R

Description

'fastqDump()' uses the SRAtoolkit command-line function 'fastq-dump' to download fastq files for all samples returned by a 'queryMetadata' query of GEOME in which one of the entities queried was 'fastqMetadata'.

Usage

fastqDump(queryMetadata_object, sratoolkitPath = "",
  outputDirectory = ".", arguments = "-v --split-3",
  filenames = "accessions", source = "sra", cleanup = FALSE,
  fastqDumpHelp = FALSE)

Arguments

queryMetadata_object

A list object returned from 'queryMetadata' where one of the entities queried was 'fastqMetadata'.

sratoolkitPath

String. A path to a local copy of sratoolkit. Only necessary if sratoolkit is not on your $PATH. Assumes executables are inside 'bin'.

outputDirectory

String. A path to the directory where you would like the files to be stored.

arguments

A string of arguments to be passed directly to 'fastq-dump'. Defaults to "-v --split-3" to show progress and split paired-end data. Use fastqDumpHelp = TRUE to see a list of arguments.

filenames

String. How the downloaded fastq files should be named. "accessions" names files with their SRA accession numbers; "IDs" names files with their materialSampleID; "locality_IDs" names files with their locality and materialSampleID.

source

String. 'fastq-dump' can retrieve files directly from SRA, or it can convert .sra files previously downloaded with 'prefetch' that are in the current working directory. "sra" downloads from SRA; "local" converts .sra files in the current working directory.

cleanup

Logical. cleanup = TRUE deletes any intermediate .sra files.

fastqDumpHelp

Logical. fastqDumpHelp = TRUE shows the help page for 'fastq-dump' and then quits.

Details

This function works best with sratoolkit version 2.9.6 or greater. Ideally the sratoolkit executables are on your $PATH; otherwise, supply a path to them using the sratoolkitPath argument.
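As a rough illustration of the two ways of locating the executable, the sketch below resolves a path to 'fastq-dump'. The helper name resolve_fastq_dump and the example path are hypothetical and not part of geomedb; the only thing taken from this page is the assumption that executables live inside 'bin' when sratoolkitPath is supplied.

```r
# Hypothetical helper (not part of geomedb): work out which 'fastq-dump' would be used.
resolve_fastq_dump <- function(sratoolkitPath = "") {
  if (nzchar(sratoolkitPath)) {
    # A local copy was supplied; executables are assumed to be inside 'bin'
    file.path(sratoolkitPath, "bin", "fastq-dump")
  } else {
    # Otherwise look on $PATH; Sys.which() returns "" when nothing is found
    unname(Sys.which("fastq-dump"))
  }
}

# Example with an assumed install location
resolve_fastq_dump("/opt/sratoolkit")
```

If the resolved path is an empty string, sratoolkit is probably not on your $PATH and sratoolkitPath should be set explicitly.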

'fastqDump()' downloads files to the current working directory unless a different one is assigned through outputDirectory.
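This page does not say whether 'fastqDump()' creates a missing output directory, so one cautious approach is to create it beforehand. The sketch below uses only base R; the directory name is an arbitrary example, and the commented call assumes a query object named acaoli as in the Examples section.

```r
# Create a dedicated download directory before calling fastqDump().
# The location below is an arbitrary example inside the session temp directory.
out_dir <- file.path(tempdir(), "fastq_downloads")
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}

# The directory can then be passed through outputDirectory, for example:
# fastqDump(queryMetadata_object = acaoli, outputDirectory = out_dir)
```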

With the default "--split-3" argument, 'fastq-dump' splits paired-end data into up to three files: one for read 1, one for read 2, and one for any unmatched reads.
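When run with "--split-3", 'fastq-dump' writes paired reads to files ending in _1.fastq and _2.fastq and places reads without a mate in a plain .fastq file. The sketch below sorts a vector of file names into those three groups. The helper classify_fastq and the placeholder accession are hypothetical, and the suffix rule assumes the files have not been renamed in a way that drops the _1 and _2 suffixes.

```r
# Hypothetical helper: label fastq files by the read they are expected to hold,
# based on the suffixes written by 'fastq-dump --split-3'.
classify_fastq <- function(files) {
  ifelse(grepl("_1\\.fastq$", files), "read 1",
         ifelse(grepl("_2\\.fastq$", files), "read 2", "unmatched"))
}

# Placeholder file names; in practice use list.files(pattern = "\\.fastq$")
files <- c("SRR0000001_1.fastq", "SRR0000001_2.fastq", "SRR0000001.fastq")
data.frame(file = files, contents = classify_fastq(files))
```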

Value

This function does not return anything to R; it simply downloads fastq files. It prints the command-line stdout to the console and reports the start time, end time, and time elapsed during the download.
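The timing report described here can be approximated in base R as follows. This is only a sketch of the general pattern, not the package's actual implementation; the wrapper name timed_run is hypothetical.

```r
# Hypothetical wrapper: run an expression, report start, end, and elapsed time,
# and return nothing visible (mirroring a function called only for side effects).
timed_run <- function(expr) {
  start <- Sys.time()
  force(expr)  # evaluate the lazily passed expression here
  end <- Sys.time()
  message("Start time: ", format(start))
  message("End time:   ", format(end))
  message("Elapsed:    ", format(difftime(end, start, units = "secs")))
  invisible(NULL)
}

# Stand-in for a long-running download
timed_run(Sys.sleep(0.1))
```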

See Also

See https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc to download pre-compiled executables for sratoolkit, or https://github.com/ncbi/sra-tools/wiki/Building-and-Installing-from-Source to install from source.

See prefetch to download .sra files prior to converting them locally. This two-step process is generally faster than using 'fastqDump()' alone. See fasterqDump for a faster, multithreaded version of 'fastqDump()' that does not work on Windows.

Examples

## Not run: 
# Run a query of GEOME first
acaoli <- queryMetadata(
    entity = "fastqMetadata",
    query = "genus = Acanthurus AND specificEpithet = olivaceus AND _exists_:bioSample",
    select = c("Event"))

# Trim to 3 entries for expediency
acaoli$fastqMetadata <- acaoli$fastqMetadata[1:3, ]
acaoli$Event <- acaoli$Event[1:3, ]

# Download straight from SRA, naming files with their locality and materialSampleID
fastqDump(queryMetadata_object = acaoli, filenames = "locality_IDs", source = "sra")

# A generally faster option is to run prefetch first, followed by fastqDump, with
# cleanup = TRUE to remove the prefetched .sra files.
prefetch(queryMetadata_object = acaoli)
fastqDump(queryMetadata_object = acaoli, filenames = "IDs", source = "local", cleanup = TRUE)

## End(Not run)

geomedb documentation built on July 15, 2020, 5:07 p.m.