starchipCircle: Running starchip to detect circular RNAs on paired-end...

starchipCircle    R Documentation

Running starchip to detect circular RNAs on paired-end sequences

Description

This function executes starchip on a set of folders containing the output of starChimeric. It requires a specific BED file, generated with starChipIndex, in the genome folder used by starChimeric.

Usage

starchipCircle(
  group = c("sudo", "docker"),
  scratch.folder,
  genome.folder,
  samples.folder,
  reads.cutoff,
  min.subject.limit,
  threads,
  do.splice = c(TRUE, FALSE),
  cpm.cutoff = 0,
  subjectCPM.cutoff = 0,
  annotation = c(TRUE, FALSE)
)

Arguments

group

a character string. Two options: sudo or docker, depending on which group the user belongs to

scratch.folder

a character string indicating the scratch folder where the docker container will be mounted

genome.folder

a character string indicating the folder where the indexed reference genome for STAR is located.

samples.folder

the folder containing all the sample folders processed with starChimeric

reads.cutoff

Integer. Minimum number of reads crossing the circRNA backsplice required.

min.subject.limit

Integer. Minimum number of individuals with at least reads.cutoff reads required to carry forward a circRNA for analysis.

threads

Integer. Number of threads to use

do.splice

TRUE/FALSE. Whether the splices within each circRNA should be detected and reported. Linear splices are searched within each circRNA in each individual. Any linear splice with >= 60% of the read count of the circRNA is considered a splice within the circRNA. Two files are then created: .consensus, with the most common splice pattern, and .allvariants, with all reported splice patterns.

cpm.cutoff

Float. Read counts are loaded into R and log2(CountsPerMillion) is calculated using the limma package. With cpm.cutoff > 0, circRNAs with log2(CPM) below this value are filtered out of the analysis.

subjectCPM.cutoff

Integer. See cpm.cutoff above. The minimum number of individuals in which a circRNA must be expressed at a value higher than cpm.cutoff.

annotation

TRUE/FALSE. Whether circRNAs are provided with gene annotations.
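
The way the four threshold arguments (reads.cutoff, min.subject.limit, cpm.cutoff, subjectCPM.cutoff) act on a count matrix can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code run inside the container: the helper names and the toy matrix are hypothetical, and the log2(CPM) formula is a simplified stand-in for the limma-based calculation mentioned above.

```r
# Toy count matrix (invented values): rows are circRNAs, columns are samples.
counts <- matrix(
  c(5, 0, 3,
    1, 0, 0,
    2, 4, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("circA", "circB", "circC"), c("s1", "s2", "s3"))
)

# Hypothetical helper: keep circRNAs with at least reads.cutoff reads
# in at least min.subject.limit samples.
filter_by_reads <- function(counts, reads.cutoff, min.subject.limit) {
  keep <- rowSums(counts >= reads.cutoff) >= min.subject.limit
  counts[keep, , drop = FALSE]
}

# Hypothetical helper: simplified log2(CPM). A small prior count avoids
# log2(0); this is an assumption, not the exact formula used by starchip.
log2_cpm <- function(counts, prior = 0.5) {
  lib.size <- colSums(counts)
  log2(t((t(counts) + prior) / (lib.size + 1) * 1e6))
}

# Hypothetical helper: keep circRNAs whose log2(CPM) is higher than
# cpm.cutoff in at least subjectCPM.cutoff samples. With cpm.cutoff <= 0
# no filtering is applied, mirroring the argument description.
filter_by_cpm <- function(counts, cpm.cutoff = 0, subjectCPM.cutoff = 0) {
  if (cpm.cutoff <= 0) return(counts)
  keep <- rowSums(log2_cpm(counts) > cpm.cutoff) >= subjectCPM.cutoff
  counts[keep, , drop = FALSE]
}

kept <- filter_by_reads(counts, reads.cutoff = 2, min.subject.limit = 2)
rownames(kept)
```

With the default cpm.cutoff = 0, filter_by_cpm returns its input unchanged, so only the read-based filter is active, as in the example call at the end of this page.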

Value

1. Count matrices:
   Raw circRNA backsplice counts: circRNA.cutoff[readthreshold]reads.[subjectthreshold]ind.countmatrix
   log2CPM of the above: norm_log2_counts_circRNA.[readthreshold]reads.[subjectthreshold]ind.0cpm_0samples.txt
   Maximum linear splices at circular loci: rawdata/linear.[readthreshold]reads.[subjectthreshold]ind.sjmax

2. Info about each circRNA:
   Consensus information about internal splicing: Circs[reads].[subjects].spliced.consensus
   Complete gene annotation: circRNA.[readthreshold]reads.[subjectthreshold]ind.annotated
   Concise gene annotation + splice type: circRNA.[readthreshold]reads.[subjectthreshold]ind.genes

3. Images:
   PCA plots: circRNA.[readthreshold]reads.[subjectthreshold]ind.0cpm_0samples_variance_PCA.pdf
   Heatmap: circRNA.[readthreshold]reads.[subjectthreshold]ind.heatmap.pdf
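
The 60% rule that do.splice applies when deciding which linear splices belong to a circRNA (and therefore feed the internal-splicing output) can be illustrated with a short base R sketch. The function name, argument names and read counts below are hypothetical and chosen only for illustration; the actual detection is performed by starchip itself.

```r
# Hypothetical helper: return the linear splices whose read count is at
# least `fraction` of the circRNA backsplice read count.
splices_within_circ <- function(splice.counts, backsplice.count, fraction = 0.6) {
  splice.counts[splice.counts >= fraction * backsplice.count]
}

# Invented read counts for three linear splice junctions inside one
# circRNA that has 10 backsplice reads.
linear <- c(sj1 = 12, sj2 = 7, sj3 = 3)
internal <- splices_within_circ(linear, backsplice.count = 10)
names(internal)
```

Here the threshold is 0.6 * 10 = 6 reads, so a junction supported by 3 reads is not treated as a splice within the circRNA.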

Author(s)

Raffaele Calogero, raffaele.calogero [at] unito [dot] it, Bioinformatics and Genomics unit, University of Torino Italy

Examples

## Not run: 
    #running starchipCircle on the starChimeric output folders in the current working directory
    starchipCircle(group="docker", genome.folder="/data/genomes/hg38star", scratch.folder="/data/scratch",
                       samples.folder=getwd(), reads.cutoff=1, min.subject.limit=2, threads=8,
                       do.splice = TRUE, cpm.cutoff=0, subjectCPM.cutoff=0, annotation=TRUE)

## End(Not run)


kendomaniac/docker4seq documentation built on Sept. 3, 2024, 6:42 p.m.