fasterqDump: Download or convert fastq data from NCBI Sequence Read...

View source: R/sratoolkit_functions.R

Description

'fasterqDump()' uses the SRAtoolkit command-line function 'fasterq-dump' to download fastq files for all samples returned by a 'queryMetadata' query of GEOME in which one of the queried entities was 'fastqMetadata'.

Usage

fasterqDump(queryMetadata_object, sratoolkitPath = "",
  outputDirectory = "./", arguments = "-p", filenames = "accessions",
  source = "sra", cleanup = FALSE, fasterqDumpHelp = FALSE)

Arguments

queryMetadata_object

A list object returned from 'queryMetadata' where one of the entities queried was 'fastqMetadata'.

sratoolkitPath

String. A path to a local copy of sratoolkit. Only necessary if sratoolkit is not on your $PATH. Assumes executables are inside 'bin'.

outputDirectory

String. A path to the directory where you would like the files to be stored.

arguments

A string variable of arguments to be passed directly to 'fasterq-dump'. Defaults to "-p" to show progress. Use fasterqDumpHelp = TRUE to see a list of arguments.

filenames

String. How the downloaded fastq files should be named: "accessions" names files with their SRA accession numbers; "IDs" names files with their materialSampleID; "locality_IDs" names files with their locality and materialSampleID.

source

String. 'fasterq-dump' can retrieve files directly from SRA, or it can convert .sra files previously downloaded with 'prefetch' that are in the current working directory: "sra" downloads from SRA; "local" converts .sra files in the current working directory.

cleanup

Logical. cleanup = TRUE will delete any intermediate .sra files.

fasterqDumpHelp

Logical. fasterqDumpHelp = TRUE will show the help page for 'fasterq-dump' and then quit.

Details

The 'fasterq-dump' tool uses temporary files and multi-threading to speed up the extraction of FASTQ from SRA accessions. This function works best with sratoolkit version 2.9.6 or greater. The SRAtoolkit executables should ideally be on your $PATH; otherwise, supply a path to them using the sratoolkitPath argument.
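As a rough illustration of the $PATH versus sratoolkitPath choice described above, the sketch below decides what value to pass as sratoolkitPath. The helper 'resolveSratoolkitPath()' is hypothetical and not part of geomedb; it only assumes, as stated under Arguments, that the executables live inside 'bin' of the local copy.

```r
# Hypothetical helper (not part of geomedb): decide what to pass as sratoolkitPath.
# Returns "" when fasterq-dump is already on the PATH; otherwise returns the
# supplied toolkit directory after checking that <localCopy>/bin/fasterq-dump exists.
resolveSratoolkitPath <- function(localCopy = "",
                                  onPath = nzchar(Sys.which("fasterq-dump"))) {
  if (isTRUE(unname(onPath))) {
    return("")
  }
  exe <- file.path(localCopy, "bin", "fasterq-dump")
  if (!nzchar(localCopy) || !file.exists(exe)) {
    stop("fasterq-dump is not on the PATH and was not found at: ", exe)
  }
  localCopy
}

# Possible usage ("~/sratoolkit" is a placeholder location):
# fasterqDump(queryMetadata_object = acaoli,
#             sratoolkitPath = resolveSratoolkitPath("~/sratoolkit"))
```

Failing early with a clear message may be easier to debug than an error raised from inside the shell call.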

'fasterqDump()' downloads files to the current working directory unless a different one is assigned through outputDirectory.

Change the number of threads by adding "-e X" to arguments where X is the number of threads.

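'fasterq-dump' will automatically split paired-end data into three files: file_1.fastq containing read 1, file_2.fastq containing read 2, and file.fastq containing unmatched reads.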

'fasterqDump()' can then rename these files based on their materialSampleID and locality.
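To make the renaming step more concrete, here is a minimal sketch of how the split output for one accession could be mapped to new file names. This is not the package's implementation: 'planRenames()' is hypothetical, the accession, sample ID and locality values are placeholders, and the underscore-separated "locality_ID" layout is an assumption inferred from the option name. The "_1", "_2" and unsuffixed file pattern follows the default paired-end split of 'fasterq-dump'.

```r
# Hypothetical helper (not part of geomedb): build a table of old and new names
# for the three files that fasterq-dump writes for one paired-end accession.
planRenames <- function(accession, sampleID, locality = NULL) {
  suffixes <- c("_1.fastq", "_2.fastq", ".fastq")
  # Assumed layout: "<locality>_<materialSampleID>" when a locality is given.
  stem <- if (is.null(locality)) sampleID else paste(locality, sampleID, sep = "_")
  data.frame(
    from = paste0(accession, suffixes),
    to = paste0(stem, suffixes),
    stringsAsFactors = FALSE
  )
}

# Placeholder values, for illustration only.
plan <- planRenames("SRR0000001", "sample01", locality = "Oahu")
print(plan)

# Applying the plan to files that actually exist would look like:
# file.rename(plan$from, plan$to)
```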

Note that 'fasterq-dump' will store temporary files in ~/ncbi/public/sra by default unless you pass "-t /path/to/temp/dir" to arguments. Make sure to periodically delete these temporary files.
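The thread and temporary-directory options mentioned above can be combined into a single string for the arguments parameter. The sketch below shows one way to assemble it; 'buildFasterqArgs()' is a hypothetical helper, and quoting the path with 'shQuote()' assumes the string ends up in a shell command, which this page does not confirm.

```r
# Hypothetical helper (not part of geomedb): assemble the 'arguments' string.
# -p shows progress, -e sets the number of threads, -t sets the temp directory.
buildFasterqArgs <- function(threads = NULL, tempDir = NULL, progress = TRUE) {
  parts <- character(0)
  if (progress) parts <- c(parts, "-p")
  if (!is.null(threads)) parts <- c(parts, "-e", as.integer(threads))
  # Assumption: the string is passed to a shell, so quote the path.
  if (!is.null(tempDir)) parts <- c(parts, "-t", shQuote(tempDir))
  paste(parts, collapse = " ")
}

fasterqArgs <- buildFasterqArgs(threads = 4)
print(fasterqArgs)

# Possible usage ("/path/to/temp/dir" is the placeholder used in the text above):
# fasterqDump(queryMetadata_object = acaoli,
#             arguments = buildFasterqArgs(threads = 4, tempDir = "/path/to/temp/dir"))
```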

Value

This function does not return anything in R; it simply downloads fastq files. It prints the command-line stdout to the console, along with the start time, end time, and time elapsed during the download.
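Since nothing is returned, one simple way to confirm what was written is to list the fastq files in the output directory afterwards. The sketch below uses only base R; 'listFastq()' is a hypothetical helper, not a geomedb function.

```r
# Hypothetical helper (not part of geomedb): list fastq files and their sizes
# in the directory that was given as outputDirectory.
listFastq <- function(outputDirectory = "./") {
  files <- list.files(outputDirectory, pattern = "\\.fastq$", full.names = TRUE)
  data.frame(
    file = basename(files),
    bytes = file.size(files),
    stringsAsFactors = FALSE
  )
}

# Possible usage after a download into the default output directory:
# listFastq("./")
```

A file that is present but has a size of zero bytes may point to an interrupted download.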

See Also

See https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc to download pre-compiled executables for sratoolkit, or https://github.com/ncbi/sra-tools/wiki/Building-and-Installing-from-Source to install from source.

This function will not work on Windows systems because fasterq-dump is not currently available for Windows. See fastqDump if you use Windows. See prefetch to download .sra files prior to converting them locally.
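For scripts that may run on more than one operating system, the platform note above could be handled with a small check like the following. 'chooseDumpFunction()' is a hypothetical helper; it only returns the name of the function this page recommends for each platform.

```r
# Hypothetical helper (not part of geomedb): pick the download function name
# based on the operating system type reported by R ("windows" or "unix").
chooseDumpFunction <- function(osType = .Platform$OS.type) {
  if (identical(osType, "windows")) "fastqDump" else "fasterqDump"
}

print(chooseDumpFunction())

# Possible usage, assuming geomedb is attached and 'acaoli' is a query result:
# dumpFunction <- match.fun(chooseDumpFunction())
# dumpFunction(queryMetadata_object = acaoli)
```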

Examples

## Not run: 
# Run a query of GEOME first
acaoli <- queryMetadata(
    entity = "fastqMetadata",
    query = "genus = Acanthurus AND specificEpithet = olivaceus AND _exists_:bioSample",
    select = c("Event"))

# Trim to 3 entries for expediency
acaoli$fastqMetadata <- acaoli$fastqMetadata[1:3, ]
acaoli$Event <- acaoli$Event[1:3, ]

# Download straight from SRA, naming files with their materialSampleID
fasterqDump(queryMetadata_object = acaoli, filenames = "IDs", source = "sra")

# A generally faster option is to run prefetch first, followed by fasterqDump,
# with cleanup = TRUE to remove the prefetched .sra files.
prefetch(queryMetadata_object = acaoli)
fasterqDump(queryMetadata_object = acaoli, filenames = "IDs", source = "local", cleanup = TRUE)

## End(Not run)

geomedb documentation built on July 15, 2020, 5:07 p.m.