View source: R/contactsUMI4C.R
contactsUMI4C | R Documentation |
Using demultiplexed FastQ files as input, performs all the steps needed to produce a TSV file summarizing the restriction enzyme fragments and the number of UMIs supporting each fragment's contact with the viewpoint (bait) of interest.
contactsUMI4C(
  fastq_dir,
  wk_dir,
  file_pattern = NULL,
  bait_seq,
  bait_pad,
  res_enz,
  cut_pos,
  digested_genome,
  bowtie_index,
  threads = 1,
  numb_reads = 1e+09,
  rm_tmp = TRUE,
  min_flen = 20,
  filter_bp = 1e+07,
  ref_gen,
  sel_seqname = NULL
)
fastq_dir |
Path of the directory containing the FastQ files (compressed or uncompressed). |
wk_dir |
Working directory in which to save the outputs generated by the UMI-4C analysis. |
file_pattern |
Character that can be used to filter the files you want
to analyze in the |
bait_seq |
Character containing the bait primer sequence. |
bait_pad |
Character containing the pad sequence (sequence between the bait primer and the restriction enzyme sequence). |
res_enz |
Character containing the restriction enzyme sequence. |
cut_pos |
Numeric indicating the zero-based nucleotide position at which the restriction enzyme cuts (for example, 0 for DpnII). |
digested_genome |
Path for the digested genome file generated using the
|
bowtie_index |
Path and prefix of the bowtie index to use for the alignment. |
threads |
Number of threads to use in the analysis. Default=1. |
numb_reads |
Number of lines from the FastQ file to load in each loop. If you run into memory problems, reduce this value. Default=1e9. |
rm_tmp |
Logical indicating whether to remove temporary files (sam and intermediate bams). TRUE or FALSE. Default=TRUE. |
min_flen |
Minimum fragment length used to select fragments. Default=20. |
filter_bp |
Integer indicating the number of bp upstream and downstream of the viewpoint to select for further analysis. Default=1e7. |
ref_gen |
A BSgenome object of the reference genome. |
sel_seqname |
Character with the name of the chromosome to which the search for the viewpoint sequence is restricted. |
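The zero-based convention for cut_pos can be illustrated with a small base R sketch. This is not the package's digestion code; the toy sequence and the variable names are made up for illustration, and the reading of cut_pos as "offset of the cut inside the recognition site" is an assumption based on the DpnII example above (DpnII cuts before the G of GATC, hence 0).

```r
# Toy sequence containing three GATC sites (hypothetical, illustration only).
toy_seq <- "AAGATCTTTTGATCCCGGGATCAA"
res_enz <- "GATC"
cut_pos <- 0  # assumed meaning: zero-based offset of the cut within the site

# 1-based start position of every recognition site in the toy sequence.
site_starts <- as.integer(gregexpr(res_enz, toy_seq, fixed = TRUE)[[1]])

# Under this reading, the first base after each cut sits at
# site_start + cut_pos in 1-based coordinates.
frag_starts <- c(1, site_starts + cut_pos)
frag_ends <- c(site_starts + cut_pos - 1, nchar(toy_seq))

# Cut the sequence into fragments at those positions.
fragments <- substring(toy_seq, frag_starts, frag_ends)
print(fragments)
```

With cut_pos = 0 every fragment after the first starts with the full recognition site; an enzyme cutting one base into its site would use cut_pos = 1 under the same assumption.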
This function is a combination of calls to other functions that perform the necessary steps for processing UMI-4C data.
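To see why lowering numb_reads reduces memory use, the following base R sketch reads a FastQ file in fixed-size chunks of lines. It is only an illustration of chunked loading, not how contactsUMI4C reads its input; the temporary file, chunk_lines and the counters are hypothetical names introduced here.

```r
# Write a tiny FastQ file (4 lines per read) to a temporary location.
fq <- tempfile(fileext = ".fastq")
writeLines(c(
  "@read1", "GATCAAAA", "+", "IIIIIIII",
  "@read2", "GATCCCCC", "+", "IIIIIIII",
  "@read3", "GATCGGGG", "+", "IIIIIIII"
), fq)

# Hypothetical chunk size; a multiple of 4 keeps each read whole.
chunk_lines <- 8

con <- file(fq, open = "r")
n_reads <- 0
n_chunks <- 0
repeat {
  # Only chunk_lines lines are held in memory at any one time.
  chunk <- readLines(con, n = chunk_lines)
  if (length(chunk) == 0) break
  n_chunks <- n_chunks + 1
  n_reads <- n_reads + length(chunk) / 4
}
close(con)
unlink(fq)

cat("chunks:", n_chunks, "reads:", n_reads, "\n")
```

A smaller chunk size means more loop iterations but a lower peak memory footprint, which is the trade-off the numb_reads argument exposes.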
if (interactive()) {
    path <- downloadUMI4CexampleData()

    hg19_dpnii <- digestGenome(
        cut_pos = 0,
        res_enz = "GATC",
        name_RE = "DpnII",
        ref_gen = BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19,
        out_path = file.path(path, "digested_genome")
    )

    raw_dir <- file.path(path, "CIITA", "fastq")

    contactsUMI4C(
        fastq_dir = raw_dir,
        wk_dir = file.path(path, "CIITA"),
        bait_seq = "GGACAAGCTCCCTGCAACTCA",
        bait_pad = "GGACTTGCA",
        res_enz = "GATC",
        cut_pos = 0,
        digested_genome = hg19_dpnii,
        bowtie_index = file.path(path, "ref_genome", "ucsc.hg19.chr16"),
        threads = 1,
        numb_reads = 1e9,
        ref_gen = BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19,
        sel_seqname = "chr16"
    )

    unlink(path, recursive = TRUE)
}