Reproducibility with seeker

```r
knitr::opts_chunk$set(collapse = TRUE, comment = '#>')
```

With the `seeker` package and Docker, it's easy to make the fetching and processing of sequencing and microarray data completely reproducible. First, pull the latest version of the socker image, which has `seeker` and its dependencies already installed.

```{sh, eval = FALSE}
docker pull ghcr.io/hugheylab/socker
```

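Because "latest" can point to a different image over time, it may help to record exactly which image was pulled. The following is a minimal sketch, not part of seeker or socker: `image_digest` is a hypothetical helper name, and it assumes Docker is installed and the image has already been pulled.

```shell
# Hypothetical helper (not part of seeker or socker): print the
# content-addressed reference (repo@sha256:...) of a pulled image.
image_digest() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not found on PATH" >&2
    return 1
  fi
  # RepoDigests lists the registry digests known for the local image
  docker image inspect --format '{{index .RepoDigests 0}}' "$1"
}

# Example usage (requires Docker and a previously pulled image):
#   image_digest ghcr.io/hugheylab/socker > socker_digest.txt
#   docker pull "$(cat socker_digest.txt)"
```

Pulling by the saved digest later should retrieve the same image even if a newer one has been published since.
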
## RNA-seq data

The `seeker` package includes an example yaml file, R script, and shell script for fetching and processing a subset of an RNA-seq dataset. Here we'll download the files from GitHub to avoid having to install the package locally:

```r
urlBase = 'https://raw.githubusercontent.com/hugheylab/seeker/master/inst/extdata/'
for (filename in c('PRJNA600892.yml', 'run_seeker.R', 'run_seeker.sh')) {
  download.file(paste0(urlBase, filename), filename)
}
```

PRJNA600892.yml:


run_seeker.R:


run_seeker.sh:


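Since these files come from the repository's master branch, they may change between runs. One way to notice that is to record checksums of the downloaded files. This is a sketch under stated assumptions: `record_checksums`, `verify_checksums`, and the manifest name `seeker_inputs.sha256` are hypothetical, and `sha256sum` is assumed to be available (as in GNU coreutils).

```shell
# Hypothetical helpers: record SHA-256 checksums of input files so a later
# run can confirm it is using identical inputs. Assumes sha256sum is
# installed (on macOS, `shasum -a 256` accepts the same arguments).
record_checksums() {
  manifest=$1
  shift
  sha256sum "$@" > "$manifest"
}

# Exits non-zero if any file listed in the manifest has changed.
verify_checksums() {
  sha256sum -c "$1"
}

# Example usage with the three files downloaded above:
#   record_checksums seeker_inputs.sha256 PRJNA600892.yml run_seeker.R run_seeker.sh
#   verify_checksums seeker_inputs.sha256
```
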
Now simply run the shell script:

```{sh, eval = FALSE}
sh run_seeker.sh
```

The output will appear in your working directory. You can follow `seeker()`'s progress using the log file. To process a different dataset, modify the yaml file and shell script accordingly. Note that this example uses the `salmon_partial_sa_index` asset from refgenie to minimize computational requirements; for actual use we recommend `salmon_sa_index`.

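To follow the log while the script runs, something like the sketch below may be convenient. The helper `latest_log` is hypothetical, and the `*.log` pattern is an assumption; check `run_seeker.sh` for the actual log file name.

```shell
# Hypothetical helper: print the most recently modified file matching a
# pattern in a directory. The '*.log' default is an assumption, not a
# documented seeker convention.
latest_log() {
  dir=${1:-.}
  pattern=${2:-*.log}
  # ls -t sorts by modification time, newest first
  ls -t "$dir"/$pattern 2>/dev/null | head -n 1
}

# Example usage: stream new lines as they are written to the log.
#   tail -f "$(latest_log)"
```

The helper prints nothing when no matching file exists, so check its output before passing it to `tail`.
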
## Microarray data

The `seeker` package also includes an example yaml file, R script, and shell script for fetching and processing a microarray dataset. Download the files to your working directory:

```r
urlBase = 'https://raw.githubusercontent.com/hugheylab/seeker/master/inst/extdata/'
for (filename in c('GSE25585.yml', 'run_seeker_array.R', 'run_seeker_array.sh')) {
  download.file(paste0(urlBase, filename), filename)
}
```

GSE25585.yml:


run_seeker_array.R:


run_seeker_array.sh:


Now simply run the shell script:

```{sh, eval = FALSE}
sh run_seeker_array.sh
```

The output will appear in your working directory. You can follow `seekerArray()`'s progress using the log file. To process a different dataset, modify the yaml file and shell script accordingly.




seeker documentation built on Sept. 11, 2024, 7:54 p.m.