rawAlignment: rawAlignment allows downloading and processing the fastq...

View source: R/rawAlignment.R

rawAlignment R Documentation

rawAlignment allows downloading and processing the fastq samples in a CSV file.

Description

This function allows downloading and processing the fastq samples listed in a CSV file. The samples can also be aligned with hisat2. Finally, the function can download the required reference files: the FASTA reference genome and the GTF file.

Usage

rawAlignment(
  data,
  downloadRef = FALSE,
  downloadSamples = FALSE,
  createIndex = TRUE,
  BAMfiles = TRUE,
  SAMfiles = TRUE,
  countFiles = TRUE,
  referenceGenome = 38,
  customFA = "",
  customGTF = "",
  fromGDC = FALSE,
  tokenPath = "",
  manifestPath = "",
  hisatParameters = "-p 8 --dta-cufflinks"
)

Arguments

data

The name of the variable that contains the samples. We recommend loading this variable from a CSV file.

downloadRef

A logical parameter that indicates whether the reference files will be downloaded.

downloadSamples

A logical parameter that indicates whether the samples listed in the CSV file will be downloaded.

createIndex

A logical parameter that indicates whether the index of the aligner will be created.

BAMfiles

A logical parameter that indicates whether you want the BAM files.

SAMfiles

A logical parameter that indicates whether you want the SAM files.

countFiles

A logical parameter that indicates whether you want the count files.

referenceGenome

This parameter selects the reference genome that will be used for the alignment. The options are 37, 38 or custom. The first two are human genomes; with the third option you can use any genome stored on the computer.

customFA

The path to the custom FASTA file of the reference genome.

customGTF

The path to the custom GTF file.
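
As a minimal sketch of how the three reference arguments fit together, the block below collects them in a list and checks that both files exist before any expensive work starts. The paths, the names ref_args and refs_ready, and the assumption that the custom option is passed as the string "custom" are illustrative and not confirmed by this page.

```r
# Hypothetical paths; replace them with files that exist on your machine.
custom_fa  <- file.path("refs", "genome.fa")
custom_gtf <- file.path("refs", "annotation.gtf")

# Assumption: the "custom" option is passed as the string "custom".
ref_args <- list(
  referenceGenome = "custom",
  customFA        = custom_fa,
  customGTF       = custom_gtf
)

# TRUE only when both reference files are present on disk.
refs_ready <- all(file.exists(custom_fa, custom_gtf))

if (refs_ready) {
  message("Reference files found; ref_args can be passed to rawAlignment.")
  # With a samples data frame loaded from the CSV (hypothetical name `samples`):
  # do.call(KnowSeq::rawAlignment, c(list(data = samples), ref_args))
} else {
  message("Reference files missing; skipping the alignment.")
}
```

Checking the paths first avoids starting a long alignment run that would fail on a missing reference file.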

fromGDC

A logical parameter that allows processing BAM files from the GDC portal by using the custom reference genome from GDC.

tokenPath

The path to the GDC portal user token. It is required to download the controlled BAM files.

manifestPath

The path to the manifest with the information required to download the controlled BAM files selected in the GDC portal.

hisatParameters

A parameter that allows modifying the default configuration of the hisat2 aligner.
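
As a minimal sketch, the string passed to hisatParameters can be assembled in R before the call. The names n_threads and hisat_params are illustrative and not part of KnowSeq, and the thread count of 4 is an arbitrary assumption; -p (number of threads) and --dta-cufflinks are existing hisat2 options taken from the default shown in Usage.

```r
# Illustrative only: n_threads and hisat_params are hypothetical names.
n_threads <- 4L

# Keep the default --dta-cufflinks option and change only the thread count.
hisat_params <- paste("-p", n_threads, "--dta-cufflinks")
print(hisat_params)

# The string would then be supplied as hisatParameters = hisat_params.
```

Building the string from a variable keeps the thread count in one place when the same settings are reused across several runs.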

Value

Nothing to return.

Examples

# Due to the high computational cost, we strongly recommend consulting the official documentation and the complete example included in this package:

dir <- system.file("extdata", package="KnowSeq")

# Use read.csv for NCBI/GEO files (read.csv2 for ArrayExpress files)
GSE74251csv <- read.csv(file.path(dir, "GSE74251.csv"))

## Not run: rawAlignment(GSE74251csv, downloadRef = FALSE, downloadSamples = FALSE, createIndex = TRUE, BAMfiles = TRUE, SAMfiles = TRUE, countFiles = TRUE, referenceGenome = 38, customFA = "", customGTF = "", fromGDC = FALSE, tokenPath = "", manifestPath = "")

CasedUgr/KnowSeq documentation built on Aug. 16, 2022, 6:19 a.m.