validate_velocity_input: Validate input to get_velocity_files

View source: R/velocity.R

validate_velocity_input    R Documentation

Validate input to get_velocity_files







Length of the biological read. For instance, 10xv1: 98 nt, 10xv2: 98 nt, 10xv3: 91 nt, Drop-seq: 50 nt. If in doubt, check the read length in a fastq file of biological reads. If the fastq file is gzipped, run zcat your_file.fastq.gz | head on Linux, or zcat < your_file.fastq.gz | head on Mac. The output includes lines of nucleotide bases. Copy one of those lines and determine its length with str_length in R or echo -n <the sequence> | wc -c in bash. Which file corresponds to biological reads depends on the particular technology.
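The same read-length check can also be done entirely in R. The following is a minimal sketch, not part of the package: it writes a tiny made-up gzipped fastq file to a temporary location so that it is self-contained, and it uses base R nchar in place of str_length. With real data, point fq at your own file of biological reads.

```r
# Write a tiny, made-up gzipped fastq file so the example is self-contained
fq <- tempfile(fileext = ".fastq.gz")
con <- gzfile(fq, "w")
writeLines(c("@read1", "ACGTACGTAC", "+", "IIIIIIIIII"), con)
close(con)

# Each fastq record spans 4 lines; the second line is the sequence
con <- gzfile(fq, "r")
first_record <- readLines(con, n = 4)
close(con)

# Number of bases in the first read
read_length <- nchar(first_record[2])
read_length
```

Reading only the first record is enough when all biological reads have the same length, as is typical for the technologies listed above.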


Either a BSgenome or an XStringSet object of genomic sequences, from which the intronic sequences will be extracted. Use genomeStyles to check which styles are supported for your organism of interest; supported styles can be interconverted. If the style in your genome or annotation is not supported, then the chromosome names in the genome and in the annotation must be set manually so that they are consistent.


An XStringSet, a path to a fasta file (can be gzipped) of the transcriptome which contains sequences of spliced transcripts, or NULL. The transcriptome here will be concatenated with the intronic sequences to give one fasta file. When NULL, the transcriptome sequences will be extracted from the genome given the gene annotation, which guarantees that transcript IDs in the transcriptome and in the annotation match. Otherwise, the type of transcript ID in the transcriptome must match that in the gene annotation supplied via argument X.


Directory to save the outputs written to disk. If this directory does not exist, then it will be created. Defaults to the current working directory.
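The create-if-missing behaviour described above can be sketched in base R as follows. This is an illustration under assumptions, not the package's actual code: the variable name out_path and the subdirectory name velocity_output are hypothetical, and a temporary directory is used so that the example leaves nothing behind.

```r
# Hypothetical output directory inside the session's temporary directory
out_path <- file.path(tempdir(), "velocity_output")

# Create the directory (and any missing parents) if it does not exist yet
if (!dir.exists(out_path)) {
  dir.create(out_path, recursive = TRUE)
}

# Resolve to an absolute, normalized path; errors if the path is still missing
out_path <- normalizePath(out_path, mustWork = TRUE)
out_path
```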


Logical, whether to compress the output fasta file. If TRUE, then the fasta file will be gzipped.


Maximum number of letters per line of sequence in the output fasta file. Must be an integer.


Character, indicating how exonic sequences should be included in the kallisto index. Must be one of the following:


The full cDNA sequences, which include the full exonic sequences, will be used. This is the default.


Only the exon-exon junctions, with L-1 bases on each side of the junctions, will be used.
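To illustrate the L-1 flanking rule: a read of length L that spans a junction must contain at least one base from each side, so it can reach at most L-1 bases into either exon. The toy example below uses made-up exon sequences and base R string functions only; it is a sketch of the idea, not how the package extracts junctions.

```r
L <- 5                      # read length (toy value)
exon1 <- "AAAACCCCGG"       # made-up upstream exon
exon2 <- "TTGGGGAAAA"       # made-up downstream exon

flank <- L - 1              # bases kept on each side of the junction

# Last L-1 bases of the upstream exon
left <- substr(exon1, nchar(exon1) - flank + 1, nchar(exon1))
# First L-1 bases of the downstream exon
right <- substr(exon2, 1, flank)

# Junction sequence of length 2 * (L - 1)
junction <- paste0(left, right)
junction
```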


Throws an error if validation fails. Otherwise, returns a named list whose first element is the normalized path to the output directory, and whose second element is the normalized path to the transcriptome file, if one was specified.

lambdamoses/BUStoolsR documentation built on Jan. 31, 2024, 5:11 a.m.