knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = FALSE )
library(brentlabRnaSeqTools) library(tidyverse) # only need dplyr and readr
See the create_set vignette/article for comments on the DB_USERNAME and DB_PASSWORD lines below
metadata_df = getMetadata(database_info$kn99_host, database_info$kn99_db_name, Sys.getenv("DB_USERNAME"), Sys.getenv("DB_PASSWORD"))
There are some experiment sets which have functions included to subset the metadata, for example createNinetyMinuteInductionSet()
. But, in general, expect to have to filter the metadata yourself. This is a good resource dplyr as is R for datascience
subset_metadata = metadata_df %>% filter(experimentDesign=="Timecourse", genotype1=="CNAG_00000", timePoint %in% c(0, 1440))
Note in the filter above that a better method for filtering for "wildtype" strains is by the strain number. There are clinical strains in the database which also have genotype1 == "CNAG_00000".
I do this from my local computer, so if I want to check if the files exist, I mount the /lts directory and then change the sequence_dir_prefix
below to the path to lts_sequence
from my local mount point. I then change the sequence_dir_prefix to the correct one for the cluster and print that sample_sheet out for processing.
sequence_dir_prefix = "/scratch/mblab/$USER/scratch_sequence/fastq_set" sample_sheet = createNfCorePipelineSampleSheet( subset_metadata, "fastqFileNumber", sequence_dir_prefix, check_files_flag = FALSE )
(repeated in "running the pipeline")
This is a frustrating step, and one that I haven't re-written a function for yet in the brentlabRnaSeqTools package. Right now, this is how I do it. If you are not a computational person in the lab, please just ask one of us to move the files for you
awk -F"," '{print $2}' sample_sheet.csv > fastq_lookup.txt mkdir scratch_sequence/fastq_set cat fastq_lookup.txt | while read line; do rsync -aHvR $line scratch_sequence/fastq_set/; done # and then I move the run directories back up to the level that I want them -- please just ask if this doesn't make sense. Eventually there will be a function in the brentlabRnaSeqTools. If you have a better bash method, please tell me.
Inspect and write out the sample sheet
output_path = "/path/to/sample_sheet.csv" write_csv(sample_sheet, output_path)
See the article/vignette "Running the nextflow pipeline"
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.