View source: R/sc_atac_pipeline.R
sc_atac_pipeline | R Documentation |
A convenient function for running the entire pipeline
sc_atac_pipeline(
r1,
r2,
bc_file,
valid_barcode_file = "",
id1_st = -0,
id1_len = 16,
id2_st = 0,
id2_len = 16,
rmN = TRUE,
rmlow = TRUE,
organism = NULL,
reference = NULL,
feature_type = NULL,
remove_duplicates = FALSE,
samtools_path = NULL,
genome_size = NULL,
bin_size = NULL,
yieldsize = 1e+06,
exclude_regions = TRUE,
excluded_regions_filename = NULL,
fix_chr = "none",
lower = NULL,
cell_calling = "filter",
promoters_file = NULL,
tss_file = NULL,
enhs_file = NULL,
gene_anno_file = NULL,
min_uniq_frags = 3000,
max_uniq_frags = 50000,
min_frac_peak = 0.3,
min_frac_tss = 0,
min_frac_enhancer = 0,
min_frac_promoter = 0.1,
max_frac_mito = 0.15,
report = TRUE,
nthreads = 12,
output_folder = NULL
)
r1 |
The first read fastq file |
r2 |
The second read fastq file |
bc_file |
the barcode information, can be either in a |
valid_barcode_file |
optional path to a file of valid (expected) barcode sequences to look for in bc_file (.txt, can be .txt.gz).
Must contain one barcode per line, in the second column, separated by a comma (default = "").
If given, each barcode from bc_file is matched against the valid barcode of
best fit (allowing a Hamming distance of 1). If a FASTQ |
id1_st |
barcode start position (0-indexed) for read 1, which is an extra parameter that is needed if the
|
id1_len |
barcode length for read 1, which is an extra parameter that is needed if the
|
id2_st |
barcode start position (0-indexed) for read 2, which is an extra parameter that is needed if the
|
id2_len |
barcode length for read 2, which is an extra parameter that is needed if the
|
rmN |
logical, whether to remove reads that contain N in the UMI or cell barcode. |
rmlow |
logical, whether to remove reads that have low quality barcode sequences. |
organism |
The name of the organism e.g. hg38 |
reference |
The reference genome file |
feature_type |
The feature type (either 'genome_bin' or 'peak') |
remove_duplicates |
Whether or not to remove duplicates (samtools is required) |
samtools_path |
A custom path of samtools to use for duplicate removal |
genome_size |
The size of the genome (used for the |
bin_size |
The size of the bins for feature counting with the 'genome_bin' feature type |
yieldsize |
The number of reads to read in for feature counting |
exclude_regions |
Whether or not the regions listed in excluded_regions_filename should be excluded |
excluded_regions_filename |
The filename of the file containing the regions to be excluded |
fix_chr |
Specify 'none', 'exclude_regions', 'feature' or 'both' to prepend the string "chr" to the start of the associated file |
lower |
the lower threshold for the data if using the |
cell_calling |
The desired cell calling method either |
promoters_file |
The path of the promoter annotation file (if the specified organism isn't recognised) |
tss_file |
The path of the TSS annotation file (if the specified organism isn't recognised) |
enhs_file |
The path of the enhancer annotation file (if the specified organism isn't recognised) |
gene_anno_file |
The path of the gene annotation file (gtf or gff3 format) |
min_uniq_frags |
The minimum number of unique fragments required for a cell (used for
max_uniq_frags |
The maximum number of unique fragments allowed for a cell (used for
min_frac_peak |
The minimum proportion of fragments in a cell to overlap with a peak (used for |
min_frac_tss |
The minimum proportion of fragments in a cell to overlap with a TSS (used for
min_frac_enhancer |
The minimum proportion of fragments in a cell to overlap with an enhancer sequence (used for
min_frac_promoter |
The minimum proportion of fragments in a cell to overlap with a promoter sequence (used for |
max_frac_mito |
The maximum proportion of fragments in a cell that are mitochondrial (used for |
report |
Whether or not an HTML report should be produced |
nthreads |
The number of threads to use for alignment (sc_align) and demultiplexing (sc_atac_bam_tagging) |
output_folder |
The path of the output folder |
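The valid_barcode_file argument describes matching each observed barcode to the expected barcode of best fit, allowing a Hamming distance of 1. The following is a minimal base R sketch of that idea, not scPipe's actual implementation. The helper names `hamming` and `match_barcode` are hypothetical, and returning NA when no expected barcode, or more than one, lies within the allowed distance is an assumption made for illustration.

```r
# Hamming distance between two equal-length strings (base R only).
hamming <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Hypothetical helper: return the single expected barcode within
# `max_dist` mismatches of `bc`, or NA if there is none or the match
# is ambiguous (tie handling here is an assumption).
match_barcode <- function(bc, whitelist, max_dist = 1) {
  same_len <- whitelist[nchar(whitelist) == nchar(bc)]
  d <- vapply(same_len, hamming, integer(1), b = bc)
  hits <- same_len[d <= max_dist]
  if (length(hits) == 1) hits else NA_character_
}

# Illustrative 16-base barcodes (made-up sequences).
whitelist <- c("AAAACCCCGGGGTTTT", "TTTTGGGGCCCCAAAA")
observed  <- c("AAAACCCCGGGGTTTT",  # exact match
               "AAAACCCCGGGGTTTA",  # one mismatch from the first entry
               "ACGTACGTACGTACGT")  # no close match

corrected <- vapply(observed, match_barcode, character(1),
                    whitelist = whitelist, USE.NAMES = FALSE)
```

Here the second observed barcode is corrected to the first expected barcode, and the third is left unassigned.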
None (invisible 'NULL')
data.folder <- system.file("extdata", package = "scPipe", mustWork = TRUE)
r1 <- file.path(data.folder, "small_chr21_R1.fastq.gz")
r2 <- file.path(data.folder, "small_chr21_R3.fastq.gz")
# Using a barcode fastq file:
# barcodes in fastq format
barcode_fastq <- file.path(data.folder, "small_chr21_R2.fastq.gz")
## Not run:
sc_atac_pipeline(
r1 = r1,
r2 = r2,
bc_file = barcode_fastq
)
## End(Not run)
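The min_* and max_* arguments set per-cell quality thresholds. As a rough illustration of how such thresholds could combine, the base R sketch below applies the default values from the usage to a made-up QC table. It assumes the thresholds belong to the "filter" cell calling method and that comparisons are inclusive; the data frame, its column names, and its values are hypothetical and are not produced by scPipe.

```r
# Hypothetical per-cell QC table; column names and values are illustrative only.
qc <- data.frame(
  barcode       = c("cell_a", "cell_b", "cell_c", "cell_d"),
  uniq_frags    = c(12000, 1500, 80000, 9000),
  frac_peak     = c(0.45, 0.50, 0.40, 0.20),
  frac_tss      = c(0.10, 0.12, 0.08, 0.05),
  frac_enhancer = c(0.05, 0.04, 0.06, 0.02),
  frac_promoter = c(0.20, 0.25, 0.15, 0.12),
  frac_mito     = c(0.02, 0.01, 0.03, 0.05),
  stringsAsFactors = FALSE
)

# Thresholds copied from the sc_atac_pipeline() defaults.
# Inclusive comparisons are an assumption of this sketch.
keep <- with(qc,
  uniq_frags    >= 3000 & uniq_frags <= 50000 &
  frac_peak     >= 0.3  &
  frac_tss      >= 0    &
  frac_enhancer >= 0    &
  frac_promoter >= 0.1  &
  frac_mito     <= 0.15
)

called_cells <- qc$barcode[keep]
```

In this toy table, cell_b has too few unique fragments, cell_c has too many, and cell_d falls below the peak fraction, so only cell_a is retained.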